All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] virtio-pmem: PMEM device spec
@ 2021-08-13  9:52 Pankaj Gupta
  2021-08-17 15:26 ` Stefan Hajnoczi
  2021-08-18  9:22 ` Cornelia Huck
  0 siblings, 2 replies; 9+ messages in thread
From: Pankaj Gupta @ 2021-08-13  9:52 UTC (permalink / raw)
  To: virtio-dev
  Cc: stefanha, dan.j.williams, david, mst, cohuck, tstark,
	pankaj.gupta, Pankaj Gupta

Posting virtio specification for virtio pmem device. Virtio pmem is a
paravirtualized device which allows the guest to bypass page cache.
Virtio pmem kernel driver is merged in Upstream Kernel 5.3. Also, Qemu
device is merged in Qemu 4.1.

Signed-off-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
---
changes from v1 -> v2
-----------------------
Thanks to Cornelia, Stefan & David for the v1 review.
- Use device & driver name instead of host & guest.
- Remove implementation details from the spec.
- Define FLUSH_REQUEST.
- Other suggested changes.

 conformance.tex |  18 ++++++-
 content.tex     |   1 +
 virtio-pmem.tex | 128 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 145 insertions(+), 2 deletions(-)
 create mode 100644 virtio-pmem.tex

diff --git a/conformance.tex b/conformance.tex
index 94d7a06..822eaa5 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -31,7 +31,8 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \ref{sec:Conformance / Driver Conformance / Sound Driver Conformance},
 \ref{sec:Conformance / Driver Conformance / Memory Driver Conformance},
 \ref{sec:Conformance / Driver Conformance / I2C Adapter Driver Conformance} or
-\ref{sec:Conformance / Driver Conformance / SCMI Driver Conformance}.
+\ref{sec:Conformance / Driver Conformance / SCMI Driver Conformance},
+\ref{sec:Conformance / Driver Conformance / PMEM Driver Conformance}.
 
     \item Clause \ref{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}.
   \end{itemize}
@@ -55,7 +56,8 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \ref{sec:Conformance / Device Conformance / Sound Device Conformance},
 \ref{sec:Conformance / Device Conformance / Memory Device Conformance},
 \ref{sec:Conformance / Device Conformance / I2C Adapter Device Conformance} or
-\ref{sec:Conformance / Device Conformance / SCMI Device Conformance}.
+\ref{sec:Conformance / Device Conformance / SCMI Device Conformance},
+\ref{sec:Conformance / Device Conformance / PMEM Driver Conformance}.
 
     \item Clause \ref{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}.
   \end{itemize}
@@ -301,6 +303,18 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{drivernormative:Device Types / SCMI Device / Device Operation / Setting Up eventq Buffers}
 \end{itemize}
 
+\conformance{\subsection}{PMEM Driver Conformance}\label{sec:Conformance / Driver Conformance / PMEM Driver Conformance}
+
+A PMEM driver MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{devicenormative:Device Types / PMEM Device / Device Initialization}
+\item \ref{drivernormative:Device Types / PMEM Driver / Driver Initialization / Virtio flush}
+\item \ref{drivernormative:Device Types / PMEM Driver / Driver Operation / Virtqueue command}
+\item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue flush}
+\item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
+\end{itemize}
+
 \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
 
 A device MUST conform to the following normative statements:
diff --git a/content.tex b/content.tex
index 31b02e1..08d4a92 100644
--- a/content.tex
+++ b/content.tex
@@ -6583,6 +6583,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
 \input{virtio-mem.tex}
 \input{virtio-i2c.tex}
 \input{virtio-scmi.tex}
+\input{virtio-pmem.tex}
 
 \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
 
diff --git a/virtio-pmem.tex b/virtio-pmem.tex
new file mode 100644
index 0000000..247313a
--- /dev/null
+++ b/virtio-pmem.tex
@@ -0,0 +1,128 @@
+\section{PMEM Device}\label{sec:Device Types / PMEM Device}
+
+The virtio pmem device is a persistent memory (NVDIMM) device
+provide a virtio based asynchronous flush mechanism. This avoids the
+need of a separate page cache in the guest and keeps the page cache
+only in the host. Under memory pressure, the host makes use of
+efficient memory reclaim decisions for page cache pages of all the
+guests. This helps to reduce the memory footprint and fit more guests
+in the host system.
+
+The virtio pmem device provides access to byte-addressable persistent
+memory. The persist memory is directly accessible as a Shared Memory Region.
+Data written to this memory is made persistent by separately sending a
+flush command. Writes that have been flushed are preserved across device
+reset and power failure.
+
+\subsection{Device ID}\label{sec:Device Types / PMEM Device / Device ID}
+  27
+
+\subsection{Virtqueues}\label{sec:Device Types / PMEM Device / Virtqueues}
+\begin{description}
+\item[0] req_vq
+\end{description}
+
+\subsection{Feature bits}\label{sec:Device Types / PMEM Device / Feature bits}
+
+There are currently no feature bits defined for this device.
+
+\subsection{Device configuration layout}\label{sec:Device Types / PMEM Device / Device configuration layout}
+
+\begin{lstlisting}
+struct virtio_pmem_config {
+	le64 start;
+	le64 size;
+};
+\end{lstlisting}
+
+\begin{description}
+\item[\field{start}] contains the start address from the device physical address range
+to be hotplugged into the driver address space.
+
+\item[\field{size}] contains the length of this address range.
+\end{description}
+
+\begin{enumerate}
+\item Driver vpmem start is read from \field{start}.
+\item Driver vpmem end is read from \field{size}.
+\end{enumerate}
+
+\subsection{Driver Initialization}\label{sec:Device Types / PMEM Driver / Driver Initialization}
+
+The driver determines the start address and size of the persist memory region in preparation for reading or writing data.
+
+The driver initializes req_vq in preparation for making flush requests.
+
+\drivernormative{\subsubsection}{Driver Initialization: Virtio flush}{Device Types / PMEM Driver / Driver Initialization / Virtio flush}
+
+The driver MUST implement a virtio based flushing interface.
+
+\subsection{Driver Operations}\label{sec:Device Types / PMEM Driver / Driver Operation}
+\drivernormative{\subsubsection}{Driver Operation: Virtqueue command}{Device Types / PMEM Driver / Driver Operation / Virtqueue command}
+
+\begin{lstlisting}
+struct virtio_pmem_req {
+        __le32 type;
+};
+\end{lstlisting}
+
+Virtio pmem flush request:
+\begin{lstlisting}
+#define VIRTIO_PMEM_REQ_TYPE_FLUSH      0
+\end{lstlisting}
+
+The driver MUST send VIRTIO_PMEM_REQ_TYPE_FLUSH command on request virtqueue.
+
+The driver SHOULD be able to handle concurrent FLUSH requests.
+
+\subsection{Device Operations}\label{sec:Device Types / PMEM Driver / Device Operation}
+\devicenormative{\subsubsection}{Device Operation: Virtqueue flush}{Device Types / PMEM Device / Device Operation / Virtqueue flush}
+
+The device MUST ensure that all writes made before a flush request will persist across device reset and power failure before completing the flush request.
+
+\devicenormative{\subsubsection}{Device Operation: Virtqueue return}{Device Types / PMEM Device / Device Operation / Virtqueue return}
+
+The device MUST return integer "0" for success and "-1" for failure.
+
+\subsection{Possible security implications}\label{sec:Device Types / PMEM Device / Possible Security Implications}
+
+There could be potential security implications depending on how
+memory mapped device backing file is used. By default device emulation
+is done with SHARED mapping. There is a contract between driver and device
+process to access same backing file for read or write operations.
+
+If a malicious driver or device map the same backing file, attacking
+process can make use of known cache side channel attacks to predict
+the current state of shared page cache page. If both attacker and 
+victim somehow execute same shared code after a flush or evict call, 
+with difference in execution timing attacker could infer another driver
+local data or device data. Though this is not easy and same challenges
+exist as with bare metal device system when userspace share same backing file.
+
+\subsection{Countermeasures}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures}
+
+\subsubsection{ With SHARED mapping}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / SHARED}
+
+If device backing backing file is shared with multiple driver or device, 
+this may act as a metric for page cache side channel attack. As a counter
+measure every driver should have its own(not shared with another driver)
+SHARED backing file and gets populated a per device page cache pages.
+
+\subsubsection{ With PRIVATE mapping}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / PRIVATE}
+There maybe be chances of side channels attack with PRIVATE
+memory mapping similar to SHARED with read-only shared mappings.
+PRIVATE is not used for virtio pmem making this usecase
+irrelevant.
+
+\subsubsection{ Workload specific mapping}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / Workload}
+For SHARED mapping, if workload is single application inside
+the driver and there is no risk with sharing of data between the devices.
+Driver sharing same backing file with SHARED mapping can be
+used as a valid configuration.
+
+\subsubsection{ Prevent cache eviction}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / Cache eviction}
+Don't allow cache evict from driver filesystem trim or discard command
+with virtio pmem. This rules out any possibility of evict-reload
+page cache side channel attacks if backing disk is shared(SHARED)
+with mutliple drivers. Though if we use per device backing file with
+shared mapping this countermeasure is not required.
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] virtio-pmem: PMEM device spec
  2021-08-13  9:52 [PATCH v2] virtio-pmem: PMEM device spec Pankaj Gupta
@ 2021-08-17 15:26 ` Stefan Hajnoczi
  2021-08-17 20:11   ` Pankaj Gupta
  2021-08-18  9:22 ` Cornelia Huck
  1 sibling, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2021-08-17 15:26 UTC (permalink / raw)
  To: Pankaj Gupta
  Cc: virtio-dev, dan.j.williams, david, mst, cohuck, tstark, pankaj.gupta

[-- Attachment #1: Type: text/plain, Size: 5020 bytes --]

On Fri, Aug 13, 2021 at 11:52:16AM +0200, Pankaj Gupta wrote:
> +\section{PMEM Device}\label{sec:Device Types / PMEM Device}
> +
> +The virtio pmem device is a persistent memory (NVDIMM) device
> +provide a virtio based asynchronous flush mechanism. This avoids the

s/provide/that provides/

> +need of a separate page cache in the guest and keeps the page cache
> +only in the host. Under memory pressure, the host makes use of
> +efficient memory reclaim decisions for page cache pages of all the
> +guests. This helps to reduce the memory footprint and fit more guests

s/fit/fits/

> +in the host system.
> +
> +The virtio pmem device provides access to byte-addressable persistent
> +memory. The persist memory is directly accessible as a Shared Memory Region.

This patch doesn't describe the Shared Memory Region:

s/Shared Memory Region/range of system memory/

> +Data written to this memory is made persistent by separately sending a
> +flush command. Writes that have been flushed are preserved across device
> +reset and power failure.
> +
> +\subsection{Device ID}\label{sec:Device Types / PMEM Device / Device ID}
> +  27
> +
> +\subsection{Virtqueues}\label{sec:Device Types / PMEM Device / Virtqueues}
> +\begin{description}
> +\item[0] req_vq
> +\end{description}
> +
> +\subsection{Feature bits}\label{sec:Device Types / PMEM Device / Feature bits}
> +
> +There are currently no feature bits defined for this device.
> +
> +\subsection{Device configuration layout}\label{sec:Device Types / PMEM Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_pmem_config {
> +	le64 start;
> +	le64 size;
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{start}] contains the start address from the device physical address range
> +to be hotplugged into the driver address space.

"device physical address" is not a term defined in the specification.
Addresses are either physical memory addresses or platform-specific
translated addresses (VIRTIO_F_ACCESS_PLATFORM). Both are called
"physical address" in the spec. Hotplugging into the "driver address
space" is also undefined.

I suggest:

  contains the physical address of the first byte of the persistent
  memory region.

> +
> +\item[\field{size}] contains the length of this address range.
> +\end{description}
> +
> +\begin{enumerate}
> +\item Driver vpmem start is read from \field{start}.
> +\item Driver vpmem end is read from \field{size}.
> +\end{enumerate}
> +
> +\subsection{Driver Initialization}\label{sec:Device Types / PMEM Driver / Driver Initialization}
> +
> +The driver determines the start address and size of the persist memory region in preparation for reading or writing data.

s/persist/persistent/

> +
> +The driver initializes req_vq in preparation for making flush requests.
> +
> +\drivernormative{\subsubsection}{Driver Initialization: Virtio flush}{Device Types / PMEM Driver / Driver Initialization / Virtio flush}
> +
> +The driver MUST implement a virtio based flushing interface.

I think this can be dropped. The only way to make writes persistent is
by sending flush requests. There is no need to say that the driver has
to use this interface. It doesn't really have to, but then it won't be
able to guarantee persistence.

> +
> +\subsection{Driver Operations}\label{sec:Device Types / PMEM Driver / Driver Operation}
> +\drivernormative{\subsubsection}{Driver Operation: Virtqueue command}{Device Types / PMEM Driver / Driver Operation / Virtqueue command}
> +
> +\begin{lstlisting}
> +struct virtio_pmem_req {
> +        __le32 type;

The spec uses le32 instead of __le32.

> +};
> +\end{lstlisting}
> +
> +Virtio pmem flush request:
> +\begin{lstlisting}
> +#define VIRTIO_PMEM_REQ_TYPE_FLUSH      0
> +\end{lstlisting}
> +
> +The driver MUST send VIRTIO_PMEM_REQ_TYPE_FLUSH command on request virtqueue.

What does this mean?

> +
> +The driver SHOULD be able to handle concurrent FLUSH requests.

I think this statement can be dropped. The driver submits the flush
requests, so it doesn't make sense to say it has to handle concurrent
flush requests. Nothing forces the driver to have multiple requests in
flight at the same time. If the driver doesn't want to have concurrent
flush requests it can submit them sequentially.

> +
> +\subsection{Device Operations}\label{sec:Device Types / PMEM Driver / Device Operation}
> +\devicenormative{\subsubsection}{Device Operation: Virtqueue flush}{Device Types / PMEM Device / Device Operation / Virtqueue flush}
> +
> +The device MUST ensure that all writes made before a flush request will persist across device reset and power failure before completing the flush request.
> +
> +\devicenormative{\subsubsection}{Device Operation: Virtqueue return}{Device Types / PMEM Device / Device Operation / Virtqueue return}
> +
> +The device MUST return integer "0" for success and "-1" for failure.

What type of integer? le32?

How is this returned, I don't see a struct definition, diagram, or table
showing the request layout?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 484 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] virtio-pmem: PMEM device spec
  2021-08-17 15:26 ` Stefan Hajnoczi
@ 2021-08-17 20:11   ` Pankaj Gupta
  2021-08-18  8:45     ` [virtio-dev] " Cornelia Huck
  0 siblings, 1 reply; 9+ messages in thread
From: Pankaj Gupta @ 2021-08-17 20:11 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: virtio-dev, Dan Williams, David Hildenbrand, Michael S . Tsirkin,
	Cornelia Huck, Taylor Stark, Pankaj Gupta

> > +\section{PMEM Device}\label{sec:Device Types / PMEM Device}
> > +
> > +The virtio pmem device is a persistent memory (NVDIMM) device
> > +provide a virtio based asynchronous flush mechanism. This avoids the
>
> s/provide/that provides/

o.k

>
> > +need of a separate page cache in the guest and keeps the page cache
> > +only in the host. Under memory pressure, the host makes use of
> > +efficient memory reclaim decisions for page cache pages of all the
> > +guests. This helps to reduce the memory footprint and fit more guests
>
> s/fit/fits/

o.k

>
> > +in the host system.
> > +
> > +The virtio pmem device provides access to byte-addressable persistent
> > +memory. The persist memory is directly accessible as a Shared Memory Region.
>
> This patch doesn't describe the Shared Memory Region:
>
> s/Shared Memory Region/range of system memory/

Sure. Copy-pasted. Will correct.

>
> > +Data written to this memory is made persistent by separately sending a
> > +flush command. Writes that have been flushed are preserved across device
> > +reset and power failure.
> > +
> > +\subsection{Device ID}\label{sec:Device Types / PMEM Device / Device ID}
> > +  27
> > +
> > +\subsection{Virtqueues}\label{sec:Device Types / PMEM Device / Virtqueues}
> > +\begin{description}
> > +\item[0] req_vq
> > +\end{description}
> > +
> > +\subsection{Feature bits}\label{sec:Device Types / PMEM Device / Feature bits}
> > +
> > +There are currently no feature bits defined for this device.
> > +
> > +\subsection{Device configuration layout}\label{sec:Device Types / PMEM Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_pmem_config {
> > +     le64 start;
> > +     le64 size;
> > +};
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +\item[\field{start}] contains the start address from the device physical address range
> > +to be hotplugged into the driver address space.
>
> "device physical address" is not a term defined in the specification.
> Addresses are either physical memory addresses or platform-specific
> translated addresses (VIRTIO_F_ACCESS_PLATFORM). Both are called
> "physical address" in the spec. Hotplugging into the "driver address
> space" is also undefined.
>
> I suggest:
>
>   contains the physical address of the first byte of the persistent
>   memory region.

Sure.
>
> > +
> > +\item[\field{size}] contains the length of this address range.
> > +\end{description}
> > +
> > +\begin{enumerate}
> > +\item Driver vpmem start is read from \field{start}.
> > +\item Driver vpmem end is read from \field{size}.
> > +\end{enumerate}
> > +
> > +\subsection{Driver Initialization}\label{sec:Device Types / PMEM Driver / Driver Initialization}
> > +
> > +The driver determines the start address and size of the persist memory region in preparation for reading or writing data.
>
> s/persist/persistent/

Sure.

>
> > +
> > +The driver initializes req_vq in preparation for making flush requests.
> > +
> > +\drivernormative{\subsubsection}{Driver Initialization: Virtio flush}{Device Types / PMEM Driver / Driver Initialization / Virtio flush}
> > +
> > +The driver MUST implement a virtio based flushing interface.
>
> I think this can be dropped. The only way to make writes persistent is
> by sending flush requests. There is no need to say that the driver has
> to use this interface. It doesn't really have to, but then it won't be
> able to guarantee persistence.

I am not 100 percent sure on this.

>
> > +
> > +\subsection{Driver Operations}\label{sec:Device Types / PMEM Driver / Driver Operation}
> > +\drivernormative{\subsubsection}{Driver Operation: Virtqueue command}{Device Types / PMEM Driver / Driver Operation / Virtqueue command}
> > +
> > +\begin{lstlisting}
> > +struct virtio_pmem_req {
> > +        __le32 type;
>
> The spec uses le32 instead of __le32.

o.k

>
> > +};
> > +\end{lstlisting}
> > +
> > +Virtio pmem flush request:
> > +\begin{lstlisting}
> > +#define VIRTIO_PMEM_REQ_TYPE_FLUSH      0
> > +\end{lstlisting}
> > +
> > +The driver MUST send VIRTIO_PMEM_REQ_TYPE_FLUSH command on request virtqueue.
>
> What does this mean?

Will delete this as well.

>
> > +
> > +The driver SHOULD be able to handle concurrent FLUSH requests.
>
> I think this statement can be dropped. The driver submits the flush
> requests, so it doesn't make sense to say it has to handle concurrent
> flush requests. Nothing forces the driver to have multiple requests in
> flight at the same time. If the driver doesn't want to have concurrent
> flush requests it can submit them sequentially.

o.k. Sure.

>
> > +
> > +\subsection{Device Operations}\label{sec:Device Types / PMEM Driver / Device Operation}
> > +\devicenormative{\subsubsection}{Device Operation: Virtqueue flush}{Device Types / PMEM Device / Device Operation / Virtqueue flush}
> > +
> > +The device MUST ensure that all writes made before a flush request will persist across device reset and power failure before completing the flush request.
> > +
> > +\devicenormative{\subsubsection}{Device Operation: Virtqueue return}{Device Types / PMEM Device / Device Operation / Virtqueue return}
> > +
> > +The device MUST return integer "0" for success and "-1" for failure.
>
> What type of integer? le32?

yes. will add struct definition as well.

>
> How is this returned, I don't see a struct definition, diagram, or table
> showing the request layout?

Thank you!


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [virtio-dev] Re: [PATCH v2] virtio-pmem: PMEM device spec
  2021-08-17 20:11   ` Pankaj Gupta
@ 2021-08-18  8:45     ` Cornelia Huck
  2021-08-19  5:53       ` Pankaj Gupta
  0 siblings, 1 reply; 9+ messages in thread
From: Cornelia Huck @ 2021-08-18  8:45 UTC (permalink / raw)
  To: Pankaj Gupta, Stefan Hajnoczi
  Cc: virtio-dev, Dan Williams, David Hildenbrand, Michael S . Tsirkin,
	Taylor Stark, Pankaj Gupta

On Tue, Aug 17 2021, Pankaj Gupta <pankaj.gupta.linux@gmail.com> wrote:

>> > +\drivernormative{\subsubsection}{Driver Initialization: Virtio flush}{Device Types / PMEM Driver / Driver Initialization / Virtio flush}
>> > +
>> > +The driver MUST implement a virtio based flushing interface.
>>
>> I think this can be dropped. The only way to make writes persistent is
>> by sending flush requests. There is no need to say that the driver has
>> to use this interface. It doesn't really have to, but then it won't be
>> able to guarantee persistence.
>
> I am not 100 percent sure on this.

I don't think we want this in a normative section, but maybe there's a
place for it in a section that describes the general operation?


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [virtio-dev] Re: [PATCH v2] virtio-pmem: PMEM device spec
  2021-08-13  9:52 [PATCH v2] virtio-pmem: PMEM device spec Pankaj Gupta
  2021-08-17 15:26 ` Stefan Hajnoczi
@ 2021-08-18  9:22 ` Cornelia Huck
  2021-08-19  6:03   ` Pankaj Gupta
  1 sibling, 1 reply; 9+ messages in thread
From: Cornelia Huck @ 2021-08-18  9:22 UTC (permalink / raw)
  To: Pankaj Gupta, virtio-dev
  Cc: stefanha, dan.j.williams, david, mst, tstark, pankaj.gupta, Pankaj Gupta

On Fri, Aug 13 2021, Pankaj Gupta <pankaj.gupta.linux@gmail.com> wrote:

> Posting virtio specification for virtio pmem device. Virtio pmem is a
> paravirtualized device which allows the guest to bypass page cache.
> Virtio pmem kernel driver is merged in Upstream Kernel 5.3. Also, Qemu
> device is merged in Qemu 4.1.
>
> Signed-off-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> ---
> changes from v1 -> v2
> -----------------------
> Thanks to Cornelia, Stefan & David for the v1 review.
> - Use device & driver name instead of host & guest.
> - Remove implementation details from the spec.
> - Define FLUSH_REQUEST.
> - Other suggested changes.
>
>  conformance.tex |  18 ++++++-
>  content.tex     |   1 +
>  virtio-pmem.tex | 128 ++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 145 insertions(+), 2 deletions(-)
>  create mode 100644 virtio-pmem.tex

(...)

> +\subsection{Driver Operations}\label{sec:Device Types / PMEM Driver / Driver Operation}
> +\drivernormative{\subsubsection}{Driver Operation: Virtqueue command}{Device Types / PMEM Driver / Driver Operation / Virtqueue command}
> +
> +\begin{lstlisting}
> +struct virtio_pmem_req {
> +        __le32 type;
> +};
> +\end{lstlisting}
> +
> +Virtio pmem flush request:
> +\begin{lstlisting}
> +#define VIRTIO_PMEM_REQ_TYPE_FLUSH      0
> +\end{lstlisting}
> +
> +The driver MUST send VIRTIO_PMEM_REQ_TYPE_FLUSH command on request virtqueue.
> +
> +The driver SHOULD be able to handle concurrent FLUSH requests.

I'm wondering what part of all of this should go into a normative
section, and what into a more descriptive paragraph.

Contants, structure definitions etc. can go into a descriptive
paragraph; I don't think we need to spell out that a conforming device
or driver actually needs to use those...

We maybe should have a non-normative paragraph that describes the normal
operation; i.e. the driver queues a flush request on the request queue
when it wants to persist something, the device acts on the flush request
and persists the writes made before the flush. That section could also
give some helpful hints to implementors that should not be binding (if
you have anything like that.)

The driver normative section could then be something like "The driver
MUST NOT consider any writes prior to a flush command to be persistent
until after the device successfully processed that flush command" or
something like that. The device normative section for the flush
operation looks fine to me.

> +
> +\subsection{Device Operations}\label{sec:Device Types / PMEM Driver / Device Operation}
> +\devicenormative{\subsubsection}{Device Operation: Virtqueue flush}{Device Types / PMEM Device / Device Operation / Virtqueue flush}
> +
> +The device MUST ensure that all writes made before a flush request will persist across device reset and power failure before completing the flush request.
> +
> +\devicenormative{\subsubsection}{Device Operation: Virtqueue return}{Device Types / PMEM Device / Device Operation / Virtqueue return}
> +
> +The device MUST return integer "0" for success and "-1" for failure.
> +
> +\subsection{Possible security implications}\label{sec:Device Types / PMEM Device / Possible Security Implications}
> +
> +There could be potential security implications depending on how
> +memory mapped device backing file is used. By default device emulation
> +is done with SHARED mapping. There is a contract between driver and device
> +process to access same backing file for read or write operations.
> +
> +If a malicious driver or device map the same backing file, attacking
> +process can make use of known cache side channel attacks to predict
> +the current state of shared page cache page. If both attacker and 
> +victim somehow execute same shared code after a flush or evict call, 
> +with difference in execution timing attacker could infer another driver
> +local data or device data. Though this is not easy and same challenges
> +exist as with bare metal device system when userspace share same backing file.

I'm not 100% sure about this section. Is it more or less Linux-specific,
or can it be made a bit more implementation agnostic? As this is not my
area of expertise, maybe others can chime in.

> +
> +\subsection{Countermeasures}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures}
> +
> +\subsubsection{ With SHARED mapping}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / SHARED}
> +
> +If device backing backing file is shared with multiple driver or device, 
> +this may act as a metric for page cache side channel attack. As a counter
> +measure every driver should have its own(not shared with another driver)
> +SHARED backing file and gets populated a per device page cache pages.
> +
> +\subsubsection{ With PRIVATE mapping}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / PRIVATE}
> +There maybe be chances of side channels attack with PRIVATE
> +memory mapping similar to SHARED with read-only shared mappings.
> +PRIVATE is not used for virtio pmem making this usecase
> +irrelevant.
> +
> +\subsubsection{ Workload specific mapping}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / Workload}
> +For SHARED mapping, if workload is single application inside
> +the driver and there is no risk with sharing of data between the devices.
> +Driver sharing same backing file with SHARED mapping can be
> +used as a valid configuration.
> +
> +\subsubsection{ Prevent cache eviction}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / Cache eviction}
> +Don't allow cache evict from driver filesystem trim or discard command
> +with virtio pmem. This rules out any possibility of evict-reload
> +page cache side channel attacks if backing disk is shared(SHARED)
> +with mutliple drivers. Though if we use per device backing file with
> +shared mapping this countermeasure is not required.


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] virtio-pmem: PMEM device spec
  2021-08-18  8:45     ` [virtio-dev] " Cornelia Huck
@ 2021-08-19  5:53       ` Pankaj Gupta
  2021-08-19  9:38         ` [virtio-dev] " Stefan Hajnoczi
  0 siblings, 1 reply; 9+ messages in thread
From: Pankaj Gupta @ 2021-08-19  5:53 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Stefan Hajnoczi, virtio-dev, Dan Williams, David Hildenbrand,
	Michael S . Tsirkin, Taylor Stark, Pankaj Gupta

> >> > +\drivernormative{\subsubsection}{Driver Initialization: Virtio flush}{Device Types / PMEM Driver / Driver Initialization / Virtio flush}
> >> > +
> >> > +The driver MUST implement a virtio based flushing interface.
> >>
> >> I think this can be dropped. The only way to make writes persistent is
> >> by sending flush requests. There is no need to say that the driver has
> >> to use this interface. It doesn't really have to, but then it won't be
> >> able to guarantee persistence.
> >
> > I am not 100 percent sure on this.
>
> I don't think we want this in a normative section, but maybe there's a
> place for it in a section that describes the general operation?

I was saying because this is the must property for virtio-pmem device
which is emulated
on regular block device. virtio-pmem above real persistent devices
don't need any explicit flush.

This can be enhanced later for real persistent devices which don't
need explicit flush
and thus driver writers don't have to worry about it.

>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [virtio-dev] Re: [PATCH v2] virtio-pmem: PMEM device spec
  2021-08-18  9:22 ` Cornelia Huck
@ 2021-08-19  6:03   ` Pankaj Gupta
  0 siblings, 0 replies; 9+ messages in thread
From: Pankaj Gupta @ 2021-08-19  6:03 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: virtio-dev, Stefan Hajnoczi, Dan Williams, David Hildenbrand,
	Michael S . Tsirkin, Taylor Stark, Pankaj Gupta

> > Posting virtio specification for virtio pmem device. Virtio pmem is a
> > paravirtualized device which allows the guest to bypass page cache.
> > Virtio pmem kernel driver is merged in Upstream Kernel 5.3. Also, Qemu
> > device is merged in Qemu 4.1.
> >
> > Signed-off-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> > ---
> > changes from v1 -> v2
> > -----------------------
> > Thanks to Cornelia, Stefan & David for the v1 review.
> > - Use device & driver name instead of host & guest.
> > - Remove implementation details from the spec.
> > - Define FLUSH_REQUEST.
> > - Other suggested changes.
> >
> >  conformance.tex |  18 ++++++-
> >  content.tex     |   1 +
> >  virtio-pmem.tex | 128 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 145 insertions(+), 2 deletions(-)
> >  create mode 100644 virtio-pmem.tex
>
> (...)
>
> > +\subsection{Driver Operations}\label{sec:Device Types / PMEM Driver / Driver Operation}
> > +\drivernormative{\subsubsection}{Driver Operation: Virtqueue command}{Device Types / PMEM Driver / Driver Operation / Virtqueue command}
> > +
> > +\begin{lstlisting}
> > +struct virtio_pmem_req {
> > +        __le32 type;
> > +};
> > +\end{lstlisting}
> > +
> > +Virtio pmem flush request:
> > +\begin{lstlisting}
> > +#define VIRTIO_PMEM_REQ_TYPE_FLUSH      0
> > +\end{lstlisting}
> > +
> > +The driver MUST send VIRTIO_PMEM_REQ_TYPE_FLUSH command on request virtqueue.
> > +
> > +The driver SHOULD be able to handle concurrent FLUSH requests.
>
> I'm wondering what part of all of this should go into a normative
> section, and what into a more descriptive paragraph.
>
> Contants, structure definitions etc. can go into a descriptive
> paragraph; I don't think we need to spell out that a conforming device
> or driver actually needs to use those...
>
> We maybe should have a non-normative paragraph that describes the normal
> operation; i.e. the driver queues a flush request on the request queue
> when it wants to persist something, the device acts on the flush request
> and persists the writes made before the flush. That section could also
> give some helpful hints to implementors that should not be binding (if
> you have anything like that.)
>
> The driver normative section could then be something like "The driver
> MUST NOT consider any writes prior to a flush command to be persistent
> until after the device successfully processed that flush command" or

I am not sure about this as this comes under implementation details and can vary
as per operating system.

I can move struct to descriptive section. Only reason for keeping it
in normative section
was its very small and corresponding. But i will move.

> something like that. The device normative section for the flush
> operation looks fine to me.

Thanks.
>
> > +
> > +\subsection{Device Operations}\label{sec:Device Types / PMEM Driver / Device Operation}
> > +\devicenormative{\subsubsection}{Device Operation: Virtqueue flush}{Device Types / PMEM Device / Device Operation / Virtqueue flush}
> > +
> > +The device MUST ensure that all writes made before a flush request will persist across device reset and power failure before completing the flush request.
> > +
> > +\devicenormative{\subsubsection}{Device Operation: Virtqueue return}{Device Types / PMEM Device / Device Operation / Virtqueue return}
> > +
> > +The device MUST return integer "0" for success and "-1" for failure.
> > +
> > +\subsection{Possible security implications}\label{sec:Device Types / PMEM Device / Possible Security Implications}
> > +
> > +There could be potential security implications depending on how
> > +memory mapped device backing file is used. By default device emulation
> > +is done with SHARED mapping. There is a contract between driver and device
> > +process to access same backing file for read or write operations.
> > +
> > +If a malicious driver or device map the same backing file, attacking
> > +process can make use of known cache side channel attacks to predict
> > +the current state of shared page cache page. If both attacker and
> > +victim somehow execute same shared code after a flush or evict call,
> > +with difference in execution timing attacker could infer another driver
> > +local data or device data. Though this is not easy and same challenges
> > +exist as with bare metal device system when userspace share same backing file.
>
> I'm not 100% sure about this section. Is it more or less Linux-specific,
> or can it be made a bit more implementation agnostic? As this is not my
> area of expertise, maybe others can chime in.

Honestly, I thought a'lot about this and wanted text to make sense in
some familiar context.
I will try to improve it more as suggested but not sure how much I would.

Thank you for your valuable inputs.

Best regards,
Pankaj

> > +
> > +\subsection{Countermeasures}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures}
> > +
> > +\subsubsection{ With SHARED mapping}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / SHARED}
> > +
> > +If device backing backing file is shared with multiple driver or device,
> > +this may act as a metric for page cache side channel attack. As a counter
> > +measure every driver should have its own(not shared with another driver)
> > +SHARED backing file and gets populated a per device page cache pages.
> > +
> > +\subsubsection{ With PRIVATE mapping}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / PRIVATE}
> > +There maybe be chances of side channels attack with PRIVATE
> > +memory mapping similar to SHARED with read-only shared mappings.
> > +PRIVATE is not used for virtio pmem making this usecase
> > +irrelevant.
> > +
> > +\subsubsection{ Workload specific mapping}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / Workload}
> > +For SHARED mapping, if workload is single application inside
> > +the driver and there is no risk with sharing of data between the devices.
> > +Driver sharing same backing file with SHARED mapping can be
> > +used as a valid configuration.
> > +
> > +\subsubsection{ Prevent cache eviction}\label{sec:Device Types / PMEM Device / Possible Security Implications / Countermeasures / Cache eviction}
> > +Don't allow cache evict from driver filesystem trim or discard command
> > +with virtio pmem. This rules out any possibility of evict-reload
> > +page cache side channel attacks if backing disk is shared(SHARED)
> > +with mutliple drivers. Though if we use per device backing file with
> > +shared mapping this countermeasure is not required.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [virtio-dev] Re: [PATCH v2] virtio-pmem: PMEM device spec
  2021-08-19  5:53       ` Pankaj Gupta
@ 2021-08-19  9:38         ` Stefan Hajnoczi
  2021-08-19 10:11           ` Pankaj Gupta
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2021-08-19  9:38 UTC (permalink / raw)
  To: Pankaj Gupta
  Cc: Cornelia Huck, virtio-dev, Dan Williams, David Hildenbrand,
	Michael S . Tsirkin, Taylor Stark, Pankaj Gupta

[-- Attachment #1: Type: text/plain, Size: 1479 bytes --]

On Thu, Aug 19, 2021 at 07:53:48AM +0200, Pankaj Gupta wrote:
> > >> > +\drivernormative{\subsubsection}{Driver Initialization: Virtio flush}{Device Types / PMEM Driver / Driver Initialization / Virtio flush}
> > >> > +
> > >> > +The driver MUST implement a virtio based flushing interface.
> > >>
> > >> I think this can be dropped. The only way to make writes persistent is
> > >> by sending flush requests. There is no need to say that the driver has
> > >> to use this interface. It doesn't really have to, but then it won't be
> > >> able to guarantee persistence.
> > >
> > > I am not 100 percent sure on this.
> >
> > I don't think we want this in a normative section, but maybe there's a
> > place for it in a section that describes the general operation?
> 
> I was saying because this is the must property for virtio-pmem device
> which is emulated
> on regular block device. virtio-pmem above real persistent devices
> don't need any explicit flush.
> 
> This can be enhanced later for real persistent devices which don't
> need explicit flush
> and thus driver writers don't have to worry about it.

A feature bit can be added in the future that hints to the driver that
flushes are not necessary (only CPU cache flushes are required for
persistence).

However, I remember seeing specifications for persistent memory devices
that do require explicit flush operations (not just a CPU cache flush).
Let's keep them in mind too.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [virtio-dev] Re: [PATCH v2] virtio-pmem: PMEM device spec
  2021-08-19  9:38         ` [virtio-dev] " Stefan Hajnoczi
@ 2021-08-19 10:11           ` Pankaj Gupta
  0 siblings, 0 replies; 9+ messages in thread
From: Pankaj Gupta @ 2021-08-19 10:11 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Cornelia Huck, virtio-dev, Dan Williams, David Hildenbrand,
	Michael S . Tsirkin, Taylor Stark, Pankaj Gupta

On Thu, 19 Aug 2021 at 11:39, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Thu, Aug 19, 2021 at 07:53:48AM +0200, Pankaj Gupta wrote:
> > > >> > +\drivernormative{\subsubsection}{Driver Initialization: Virtio flush}{Device Types / PMEM Driver / Driver Initialization / Virtio flush}
> > > >> > +
> > > >> > +The driver MUST implement a virtio based flushing interface.
> > > >>
> > > >> I think this can be dropped. The only way to make writes persistent is
> > > >> by sending flush requests. There is no need to say that the driver has
> > > >> to use this interface. It doesn't really have to, but then it won't be
> > > >> able to guarantee persistence.
> > > >
> > > > I am not 100 percent sure on this.
> > >
> > > I don't think we want this in a normative section, but maybe there's a
> > > place for it in a section that describes the general operation?
> >
> > I was saying because this is the must property for virtio-pmem device
> > which is emulated
> > on regular block device. virtio-pmem above real persistent devices
> > don't need any explicit flush.
> >
> > This can be enhanced later for real persistent devices which don't
> > need explicit flush
> > and thus driver writers don't have to worry about it.
>
> A feature bit can be added in the future that hints to the driver that
> flushes are not necessary (only CPU cache flushes are required for
> persistence).

Sure. will remove this then.
>
> However, I remember seeing specifications for persistent memory devices
> that do require explicit flush operations (not just a CPU cache flush).
> Let's keep them in mind too.

Thanks. Maybe internally they would be doing CPU flushes (a guess) and/or
some metadata sync operation.

Thank you for all the feedback. Will send a v3 with the suggested changes.

Best regards,
Pankaj

>
> Stefan


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-08-19 10:11 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-13  9:52 [PATCH v2] virtio-pmem: PMEM device spec Pankaj Gupta
2021-08-17 15:26 ` Stefan Hajnoczi
2021-08-17 20:11   ` Pankaj Gupta
2021-08-18  8:45     ` [virtio-dev] " Cornelia Huck
2021-08-19  5:53       ` Pankaj Gupta
2021-08-19  9:38         ` [virtio-dev] " Stefan Hajnoczi
2021-08-19 10:11           ` Pankaj Gupta
2021-08-18  9:22 ` Cornelia Huck
2021-08-19  6:03   ` Pankaj Gupta

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.