* RE: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
2023-02-09 3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
@ 2023-02-09 3:35 ` Parav Pandit
2023-02-09 3:36 ` Xuan Zhuo
2023-03-07 11:15 ` [virtio-dev] " Xuan Zhuo
` (3 subsequent siblings)
4 siblings, 1 reply; 20+ messages in thread
From: Parav Pandit @ 2023-02-09 3:35 UTC (permalink / raw)
To: Xuan Zhuo, virtio-comment
Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic
Hi Xuan,
> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Xuan Zhuo
> conformance.tex | 26 +++
> content.tex | 1 +
> virtio-ism.tex | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
Can you please rebase your patches?
Devices are placed under their own directory and maintained in multiple smaller files.
device-types/ism/...
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: RE: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
2023-02-09 3:35 ` [virtio-comment] " Parav Pandit
@ 2023-02-09 3:36 ` Xuan Zhuo
0 siblings, 0 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-02-09 3:36 UTC (permalink / raw)
To: Parav Pandit
Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic,
virtio-comment
On Thu, 9 Feb 2023 03:35:33 +0000, Parav Pandit <parav@nvidia.com> wrote:
> Hi Xuan,
>
> > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> > open.org> On Behalf Of Xuan Zhuo
>
> > conformance.tex | 26 +++
> > content.tex | 1 +
> > virtio-ism.tex | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
>
> Can you please rebase your patches?
> Devices are placed under their own directory and maintained in multiple smaller files.
>
> device-types/ism/...
OK, will fix.
Thanks.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
2023-02-09 3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
2023-02-09 3:35 ` [virtio-comment] " Parav Pandit
@ 2023-03-07 11:15 ` Xuan Zhuo
2023-03-15 11:15 ` Xuan Zhuo
2023-03-23 14:46 ` [virtio-dev] Re: [virtio-comment] " Halil Pasic
` (2 subsequent siblings)
4 siblings, 1 reply; 20+ messages in thread
From: Xuan Zhuo @ 2023-03-07 11:15 UTC (permalink / raw)
To: Xuan Zhuo
Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic,
virtio-comment
Any comments for this. ^_^
Thanks.
On Thu, 9 Feb 2023 11:30:56 +0800, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> An ISM(Internal Shared Memory) device provides the ability to access memory
> shared between multiple devices. This allows low-overhead communication in
> presence of such memory. For example, memory can be shared with guests of
> multiple virtual machines running on the same host, with each virtual machine
> including an ism device and with the guests getting the shared memory by the ism
> devices.
>
> An ism device can communicate with multiple peers simultaneously. This
> communication can be dynamically started and ended.
>
> |-------------------------------------------------------------------------------------------------------------|
> | |------------------------------------------------| |------------------------------------------------| |
> | | Guest | | Guest | |
> | | | | | |
> | | ---------------- | | ---------------- | |
> | | | driver | [M1] [M2] [M3] | | | driver | [M2] [M3] | |
> | | ---------------- | | | | | ---------------- | | | |
> | | |cq| |map |map |map | | |cq| |map |map | |
> | | | | | | | | | | | | | | |
> | | | | ------------------- | | | | -------------------- | |
> | |----|--|----------------| device memory |-----| |----|--|----------------| device memory |----| |
> | | | | ------------------- | | | | -------------------- | |
> | | | | | | | |
> | | | | | | | |
> | | Qemu | | | Qemu | | |
> | |--------------------------------+---------------| |-------------------------------+----------------| |
> | | | |
> | | | |
> | |------------------------------+------------------------| |
> | | |
> | | |
> | -------------------------- |
> | | M1 | | M2 | | M3 | |
> | -------------------------- |
> | |
> | HOST |
> ---------------------------------------------------------------------------------------------------------------
>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
> Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
> Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
> Signed-off-by: Hans Zhang <hans@linux.alibaba.com>
> Signed-off-by: He Rongguang <herongguang@linux.alibaba.com>
> Suggested-by: Halil Pasic <pasic@linux.ibm.com>
> ---
> conformance.tex | 26 +++
> content.tex | 1 +
> virtio-ism.tex | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 600 insertions(+)
> create mode 100644 virtio-ism.tex
>
> diff --git a/conformance.tex b/conformance.tex
> index c3c1d3e..0a1456a 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
> \end{itemize}
>
> +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> +
> +A ISM driver MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\end{itemize}
> +
> \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
>
> A device MUST conform to the following normative statements:
> @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
> \end{itemize}
>
> +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> +
> +A ISM device MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> +\end{itemize}
> +
> \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
> A conformant implementation MUST be either transitional or
> non-transitional, see \ref{intro:Legacy
> diff --git a/content.tex b/content.tex
> index 96f4723..fe02aec 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
> \input{virtio-scmi.tex}
> \input{virtio-gpio.tex}
> \input{virtio-pmem.tex}
> +\input{virtio-ism.tex}
>
> \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>
> diff --git a/virtio-ism.tex b/virtio-ism.tex
> new file mode 100644
> index 0000000..a1720d8
> --- /dev/null
> +++ b/virtio-ism.tex
> @@ -0,0 +1,573 @@
> +\section{ISM Device}\label{sec:Device Types / ISM Device}
> +
> +\begin{lstlisting}
> +|-------------------------------------------------------------------------------------------------------------|
> +| |------------------------------------------------| |------------------------------------------------| |
> +| | VM [M1] [M2] [M3] | | VM [M2] [M3] | |
> +| | | | | | | | | | |
> +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> +| | | driver | | | | | | | driver | | | | |
> +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> +| | |cq| |map |map |map | | |cq| |map |map | |
> +| | | | | | | | | | | | | | |
> +| | | | ------------------- | | | | ------------------- | |
> +| |----|--|----------------| device memory |-----| |----|--|----------------| device memory |-----| |
> +| | | | ------------------- | | | | ------------------- | |
> +| | | | | | | |
> +| | | | | | | |
> +| | | | | | | |
> +| |--------------------------------+---------------| |--------------------------------+---------------| |
> +| | | |
> +| | | |
> +| |------------------------------+------------------------| |
> +| | |
> +| | |
> +| -------------------------- |
> +| | M1 | | M2 | | M3 | |
> +| -------------------------- |
> +| |
> +| |
> +|-------------------------------------------------------------------------------------------------------------|
> +\end{lstlisting}
> +
> +An ISM(Internal Shared Memory) device provides the ability to access memory
> +shared between multiple devices. This allows low-overhead communication in
> +presence of such memory. For example, memory can be shared with guests of
> +multiple virtual machines running on the same host, with each virtual machine
> +including an ism device and with the guests getting the shared memory by the ism
> +devices.
> +
> +An ism device can communicate with multiple peers simultaneously. This
> +communication can be dynamically started and ended.
> +
> +All the devices with the ability to communicate with each other form a
> +communication domain. Two devices from different communication domains can't
> +communicate.
> +
> +The device memory of the ism device is divided into multiple chunks with the
> +same size. Every ism region contains multiple chunks. When communicating between
> +two devices, an ism region is used as a shared memory.
> +
> +The ism region is the basis for communication between ism devices.
> +
> +The process of communication between two drivers is that one driver allocates an
> +ism region and obtains a token. Then the peer uses this token to attach the same
> +ism region, the two drivers realize the memory(ism region) sharing. The driver
> +can also notify peer by kick notify-address the ism region has been updated.
> +
> +An ism region can be referred by its \field{token}, or the \field{offset}.
> +The \field{offset} is the offset of the first chunk inside the ism region
> +starting from the device memory head.
> +
> +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> + 44
> +
> +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> +\begin{description}
> +\item[0] controlq
> +\item[1] eventq
> +\end{description}
> +
> +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> +\begin{description}
> +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> + don't need to provide memory for alloc/attach operation.
> +
> +\end{description}
> +
> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_ism_config {
> + le128 cdid;
> + le64 devid;
> + le64 chunk_size;
> + le64 notify_size;
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> + \item[\field{cdid}] This is used to identify the communication domain. Only
> + ism devices with the same \field{cdid} can communicate. A \field{cdid}
> + is world-wide unique in a sense that there not be another communication
> + domain with the same \field{cdid}.
> +
> + \item[\field{devid}] This is used to identify the ism device in the single
> + communication domain.
> +
> + \item[\field{chunk_size}] This is the size of the ism chunk. The device
> + memory is divided into multiple chunks. Every ism chunk has the same
> + size.
> +
> + \item[\field{notify_size}] This is the size of the ism notify-address. The
> + notify-address is used to notify the device that the content of the
> + ism region has been updated.
> +
> +\end{description}
> +
> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> +
> +The device MUST ensure that the \field{cdid} of the device on the same
> +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> +
> +In the single communication domain, the device MUST ensure that the \field{devid}
> +is unique.
> +
> +\field{chunk_size} MUST be a power of two.
> +
> +\subsection{Physical Memory}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism region is on the device memory. It is a physical memory for driver, but
> +it is not for device. The device has the ability to modify the physical memory
> +corresponding to the device memory.
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, physical memory comes from driver.
> +The driver pass its memory to the device when alloc a new ism region. The
> +device map this physical memory into an ism region. If VIRTIO_ISM_F_DEV_MEM
> +is negotiated, physical memory is provided by the device.
> +
> +For attach operation, if VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver also
> +provides memory to device. Under normal circumstances, this memory will not be
> +used. This is to limit driver to take up too much memory based on attach
> +operation.
> +
> +\subsection{Event}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism device supports event notification of the ism region. When a device
> +kick/attach/detach a region, other ism region referrers will receive related
> +events.
> +
> +A buffer received from eventq can contain multiple event structures.
> +
> +\begin{lstlisting}
> +#define VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> +#define VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> +#define VIRTIO_ISM_EVENT_DETACH (1 << 2)
> +#define VIRTIO_ISM_EVENT_BUFFER_FREE (1 << 3) // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +\end{lstlisting}
> +
> +\begin{description}
> + \item[VIRTIO_ISM_EVENT_UPDATE] The driver kick the notify-address to notify
> + other peers of the update event of the ism region content.
> +
> + \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> + \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> + \item[VIRTIO_ISM_EVENT_BUFFER_FREE] The buffer allocated by this driver has been free.
> +\end{description}
> +
> +The event structures:
> +\begin{lstlisting}
> +struct virtio_ism_event_update {
> + le64 ev_type;
> + le64 offset;
> + le64 devid;
> +};
> +
> +struct virtio_ism_event_attach_detach {
> + le64 ev_type;
> + le64 offset;
> + le64 devid;
> + le64 peers;
> +};
> +
> +struct virtio_ism_event_buffer_free {
> + le64 ev_type;
> + void *buffer;
> +};
> +
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{ev_type}] The type of event, the driver can get the size of the
> + structure based on this.
> +
> + If \field{ev_type} is VIRTIO_ISM_EVENT_UPDATE, the corresponding structure is struct virtio_ism_event_update.
> + If \field{ev_type} is VIRTIO_ISM_EVENT_ATTACH or VIRTIO_ISM_EVENT_DETACH, the corresponding structure struct virtio_ism_event_attach_detach.
> + If \field{ev_type} is VIRTIO_ISM_EVENT_BUFFER_FREE, the corresponding structure is struct virtio_ism_event_buffer_free.
> +
> +\item[\field{offset}] The offset of ism regions with the event.
> +
> +\item[\field{devid}] \field{devid} of the device that generated events.
> +\item[\field{peers}] The number of the peers referring to this ism region. (does not include the
> + device that receiving this event)
> +
> +\item[\field{buffer}] The buffer passed to device when alloc/attach a new ism region.
> +
> +\end{description}
> +
> +\subsection{Permissions}\label{sec:Device Types / ISM Device / Permission}
> +
> +The permissions of an ism region determine whether this ism region can be
> +attached and the read and write permissions after attach. The driver can set the
> +default permissions for all devices, or set the dedicated permissions for a
> +certain device. If there are dedicated permissions for a device, the default
> +permissions are invalid for this device.
> +
> +For example, the driver can set the default permission of an ism region as
> +read-only, and set the dedicated permissions as writable for a device with devid
> +0xff. Then all devices except 0xff in such a communication domain can attach
> +this ism region and only have read-only permission. And the device 0xff can
> +write to the ism region after attaching.
> +
> +When a driver has the management permission of the ism region, then it can
> +modify the permissions of this ism region. By default, only the device that
> +allocated the ism region has this permission.
> +
> +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> +
> +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> +during reset. \field{devid} MUST NOT be 0.
> +
> +The device exposes memory to the driver based on Shared Memory Regions
> +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> +However, it does not need to allocate physical memory during initialization.
> +
> +The \field{shmid} of a region MUST be one of the following
> +\begin{lstlisting}
> +enum virtio_ism_shm_id {
> + VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> + VIRTIO_ISM_SHM_ID_REGIONS = 1,
> + VIRTIO_ISM_SHM_ID_NOTIFY = 2,
> +};
> +\end{lstlisting}
> +
> +The device memory area with VIRTIO_ISM_SHM_ID_REGIONS is used to provide ism regions.
> +
> +The device MUST also provide a memory area with VIRTIO_ISM_SHM_ID_NOTIFY to the
> +driver. This memory area is used to notify the device with update event, and
> +each ism region MUST have a corresponding notify-address inside this area, and
> +the size of the notify-address is \field{notify_size}.
> +
> +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The driver MUST NOT access any chunk before it is referred by one ism region.
> +
> +\subsection{Control Virtqueue}\label{sec:Device Types / ISM Device / Control Virtqueue}
> +
> +The driver uses the control virtqueue send commands to implement operations on
> +the ism region and some global configurations.
> +
> +All commands are of the following form:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl {
> + u8 class;
> + u8 command;
> + u8 command_specific_data[];
> + u8 ack;
> + u8 command_specific_data_reply[];
> +};
> +
> +/* ack values */
> +#define VIRTIO_ISM_OK 0
> +#define VIRTIO_ISM_ERR 255
> +
> +#define VIRTIO_ISM_ENOENT 2
> +#define VIRTIO_ISM_E2BIG 7
> +#define VIRTIO_ISM_ENOMEM 12
> +#define VIRTIO_ISM_ENOSPEC 28
> +
> +#define VIRTIO_ISM_PERM_EATTACH 100
> +#define VIRTIO_ISM_PERM_EREAD 101
> +#define VIRTIO_ISM_PERM_EWRITE 102
> +\end{lstlisting}
> +
> +The \field{class}, \field{command} and command-specific-data are set by the
> +driver, and the device sets the \field{ack} byte and optionally
> +\field{command-specific-data-reply}.
> +
> +\subsection{Device Operation}\label{sec:Device Types / ISM Driver / Device Operation}
> +
> +\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +Based on controlq, the driver can request to allocate an ism region.
> +
> +The reply from the device will carry a token, which can be passed
> +to other driver for attaching to this ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc {
> + le64 size;
> + void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc_reply {
> + le64 token;
> + le64 num;
> + le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ALLOC 0
> + #define VIRTIO_ISM_CTRL_ALLOC_REGION 0
> +\end{lstlisting}
> +
> +
> +\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +For the command VIRTIO_ISM_CTRL_ALLOC_REGION, the device MUST first look for the
> +free chunks from the device memory. And the physical memory is mapped to these
> +chunks in order. If VIRTIO_ISM_F_DEV_MEM is negotiated, then the device MUST
> +provide physical memory by itself. Otherwise, the memory provided by the driver
> +will be used. If there is no enough free chunk, the device MUST set \field{ack}
> +to VIRTIO_ISM_NOSPEC. If new physical memory cannot be allocated by the device,
> +the device MUST set \field{ack} to VIRTIO_ISM_NOMEM.
> +
> +The device sets \field{ack} to VIRTIO_ISM_OK after successfully allocating the
> +ism region. A new token MUST be dynamically created for this ism region
> +simultaneously. The token MUST be unique inside this communication domain. And
> +the device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks.
> +
> +The device MUST clear the memory of this ism region before committing to the
> +driver.
> +
> +If \field{size} is smaller than \field{chunk_size}, the ism region also
> +occupies one chunk.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG.
> +
> +\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we want to allocate.
> +
> +\subsubsection{Query ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Query ISM Region}
> +Based on controlq, the driver can query the information of an ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query {
> + le64 token;
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query_reply {
> + le64 size;
> + le64 permissions;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_QUERY 1
> + #define VIRTIO_ISM_CTRL_QUERY_INFO 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Query ISM Region}{Device Types / ISM Device / Device Operation / Query ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT. Otherwise, the device MUST fill the
> +\field{size} with the size of the ism region. The device MUST get the
> +permissions of this ism region for the device-self, then fill in the
> +\field{permissions}.
> +
> +\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +Based on controlq, the driver can request to attach an ism region with a
> +specified token.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach {
> + le64 token;
> + le64 rw_perm;
> + void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach_reply {
> + le64 size;
> + le64 num;
> + le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ATTACH 2
> + #define VIRTIO_ISM_CTRL_ATTACH_REGION 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The device MUST first look for the free chunks from the device memory. And the
> +physical memory of the ism region is mapped to these chunks in order. If there is
> +no enough free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
> +The device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks. \field{size} is the size of this ism region.
> +
> +The device MUST set \field{ack} to VIRTIO_ISM_OK after successfully finding and
> +assigning the physical memory of this ism region into the device memory.
> +
> +If the device does not have the VIRTIO_ISM_PERM_ATTACH permission for this ism
> +region, the device MUST set \field{ack} to VIRTIO_ISM_PERM_EATTACH. If
> +\field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not have the
> +VIRTIO_ISM_PERM_READ permission for this region, the device MUST set \field{ack}
> +to VIRTIO_ISM_PERM_EREAD. If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but
> +the device does not have the VIRTIO_ISM_PERM_WRITE permission for this region,
> +the device MUST set \field{ack} to VIRTIO_ISM_PERM_EWRITE.
> +
> +The device MUST set the read and write permissions of the physical memory inside
> +the device memory based on \field{rw_perm}.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
> +
> +The device MUST generate a VIRTIO_ISM_EVENT_ATTACH event to other peers.
> +
> +\drivernormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we want to allocate.
> +
> +If \field {size} is less than \field{num} times \field {chunk_size}, then only
> +header part of the last chunk is effective.
> +
> +\subsubsection{Detach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +Based on controlq, the device can release references to the ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_detach {
> + le64 token;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_DETACH 3
> + #define VIRTIO_ISM_CTRL_DETACH_REGION 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> +
> +If the ism region specified by \field{token} is not attached, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The device MUST generate a VIRTIO_ISM_EVENT_DETACH event to other peers.
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated and no one is referred to this ism
> +region, the device who allocated this ism region MUST generate a
> +VIRTIO_ISM_EVENT_BUFFER_FREE event to the driver.
> +
> +\drivernormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated and the driver got the ism region by
> +attach operation, then the buffer is free after the detach operation.
> +
> +\subsubsection{Grant ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Grant ISM Region}
> +Based on controlq, the driver can set the permissions for each ism
> +region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_grant_default {
> + le64 token;
> + le64 permissions;
> +};
> +
> +struct virtio_ism_ctrl_grant_dev {
> + le64 token;
> + le64 permissions;
> + le64 peer_devid;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_GRANT 4
> + #define VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT 0
> + #define VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE 1
> +\end{lstlisting}
> +
> +The permissions:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_PERM_READ (1 << 0)
> +#define VIRTIO_ISM_PERM_WRITE (1 << 1)
> +
> +#define VIRTIO_ISM_PERM_ATTACH (1 << 2)
> +
> +#define VIRTIO_ISM_PERM_MANAGE (1 << 3)
> +#define VIRTIO_ISM_PERM_CLEAN_DEFAULT (1 << 4)
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[VIRTIO_ISM_PERM_READ] read permission
> +\item[VIRTIO_ISM_PERM_WRITE] write permission
> +\item[VIRTIO_ISM_PERM_ATTACH] attach permission
> +
> +\item[VIRTIO_ISM_PERM_MANAGE] Management permission, the device with this
> + permission can modify the permission of this ism region. By default, only
> + the device that allocated the region has this permission.
> +
> +\item[VIRTIO_ISM_PERM_DENY_DEFAULT] Clean all permissions of default, just used
> + with VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE.
> +
> +\end{description}
> +
> +VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT corresponds to virtio_ism_ctrl_grant_default;
> +VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE corresponds to virtio_ism_ctrl_grant_dev;
> +
> +Permission control is divided into two categories, one is the dedicated
> +permissions for a certain device, and the other is the default permissions for
> +all device.
> +
> +\devicenormative{\subparagraph}{Grant ISM Region}{Device Types / ISM Device / Device Operation / Grant ISM Region}
> +
> +If the ism region specified by \field{token} is not attached, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The default permissions of the newly allocated ism region MUST be (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH).
> +The device that allocated the ism region MUST have the permissions (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH | VIRTIO_ISM_PERM_MANAGE) for this region.
> +
> +For a certain device, if there is dedicated permissions, the default
> +permissions are illegal to this device.
> +
> +\subsubsection{Inform Update Event IRQ Vector}\label{sec:Device Types / ISM Device / Device Operation / Inform Update Event IRQ Vector}
> +
> +For the ism region update event, the driver can bind an interrupt to the ism
> +region. Then this ism region's update event will no longer be passed through
> +eventq, but the interrupt will be directly triggered.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_irq_vector {
> + le64 token;
> + le64 vector;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_EVENT_VECTOR 5
> + #define VIRTIO_ISM_CTRL_EVENT_VECTOR_SET 0
> +\end{lstlisting}
> +
> +
> +\devicenormative{\subparagraph}{Inform Event IRQ Vector}{Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> +
> +The device MUST record the relationship between the ism region and the vector
> +notified by the driver, and notify the update event of this region to driver
> +by the corresponding vector. And the device MUST NOT use eventq to send the
> +update event of this ism region.
> +
> +
> --
> 2.32.0.3.g01195cf9f
>
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
2023-03-07 11:15 ` [virtio-dev] " Xuan Zhuo
@ 2023-03-15 11:15 ` Xuan Zhuo
0 siblings, 0 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-03-15 11:15 UTC (permalink / raw)
To: pasic
Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic,
virtio-comment
On Tue, 7 Mar 2023 19:15:17 +0800, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> Any comments for this. ^_^
hi Halil,
Do you have any comments?
Thanks.
>
> Thanks.
>
> On Thu, 9 Feb 2023 11:30:56 +0800, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > An ISM(Internal Shared Memory) device provides the ability to access memory
> > shared between multiple devices. This allows low-overhead communication in
> > presence of such memory. For example, memory can be shared with guests of
> > multiple virtual machines running on the same host, with each virtual machine
> > including an ism device and with the guests getting the shared memory by the ism
> > devices.
> >
> > An ism device can communicate with multiple peers simultaneously. This
> > communication can be dynamically started and ended.
> >
> > |-------------------------------------------------------------------------------------------------------------|
> > | |------------------------------------------------| |------------------------------------------------| |
> > | | Guest | | Guest | |
> > | | | | | |
> > | | ---------------- | | ---------------- | |
> > | | | driver | [M1] [M2] [M3] | | | driver | [M2] [M3] | |
> > | | ---------------- | | | | | ---------------- | | | |
> > | | |cq| |map |map |map | | |cq| |map |map | |
> > | | | | | | | | | | | | | | |
> > | | | | ------------------- | | | | -------------------- | |
> > | |----|--|----------------| device memory |-----| |----|--|----------------| device memory |----| |
> > | | | | ------------------- | | | | -------------------- | |
> > | | | | | | | |
> > | | | | | | | |
> > | | Qemu | | | Qemu | | |
> > | |--------------------------------+---------------| |-------------------------------+----------------| |
> > | | | |
> > | | | |
> > | |------------------------------+------------------------| |
> > | | |
> > | | |
> > | -------------------------- |
> > | | M1 | | M2 | | M3 | |
> > | -------------------------- |
> > | |
> > | HOST |
> > ---------------------------------------------------------------------------------------------------------------
> >
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
> > Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
> > Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
> > Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
> > Signed-off-by: Hans Zhang <hans@linux.alibaba.com>
> > Signed-off-by: He Rongguang <herongguang@linux.alibaba.com>
> > Suggested-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> > conformance.tex | 26 +++
> > content.tex | 1 +
> > virtio-ism.tex | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 600 insertions(+)
> > create mode 100644 virtio-ism.tex
> >
> > diff --git a/conformance.tex b/conformance.tex
> > index c3c1d3e..0a1456a 100644
> > --- a/conformance.tex
> > +++ b/conformance.tex
> > @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> > \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
> > \end{itemize}
> >
> > +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> > +
> > +A ISM driver MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\end{itemize}
> > +
> > \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
> >
> > A device MUST conform to the following normative statements:
> > @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> > \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
> > \end{itemize}
> >
> > +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> > +
> > +A ISM device MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> > +\end{itemize}
> > +
> > \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
> > A conformant implementation MUST be either transitional or
> > non-transitional, see \ref{intro:Legacy
> > diff --git a/content.tex b/content.tex
> > index 96f4723..fe02aec 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
> > \input{virtio-scmi.tex}
> > \input{virtio-gpio.tex}
> > \input{virtio-pmem.tex}
> > +\input{virtio-ism.tex}
> >
> > \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
> >
> > diff --git a/virtio-ism.tex b/virtio-ism.tex
> > new file mode 100644
> > index 0000000..a1720d8
> > --- /dev/null
> > +++ b/virtio-ism.tex
> > @@ -0,0 +1,573 @@
> > +\section{ISM Device}\label{sec:Device Types / ISM Device}
> > +
> > +\begin{lstlisting}
> > +|-------------------------------------------------------------------------------------------------------------|
> > +| |------------------------------------------------| |------------------------------------------------| |
> > +| | VM [M1] [M2] [M3] | | VM [M2] [M3] | |
> > +| | | | | | | | | | |
> > +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> > +| | | driver | | | | | | | driver | | | | |
> > +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> > +| | |cq| |map |map |map | | |cq| |map |map | |
> > +| | | | | | | | | | | | | | |
> > +| | | | ------------------- | | | | ------------------- | |
> > +| |----|--|----------------| device memory |-----| |----|--|----------------| device memory |-----| |
> > +| | | | ------------------- | | | | ------------------- | |
> > +| | | | | | | |
> > +| | | | | | | |
> > +| | | | | | | |
> > +| |--------------------------------+---------------| |--------------------------------+---------------| |
> > +| | | |
> > +| | | |
> > +| |------------------------------+------------------------| |
> > +| | |
> > +| | |
> > +| -------------------------- |
> > +| | M1 | | M2 | | M3 | |
> > +| -------------------------- |
> > +| |
> > +| |
> > +|-------------------------------------------------------------------------------------------------------------|
> > +\end{lstlisting}
> > +
> > +An ISM(Internal Shared Memory) device provides the ability to access memory
> > +shared between multiple devices. This allows low-overhead communication in
> > +presence of such memory. For example, memory can be shared with guests of
> > +multiple virtual machines running on the same host, with each virtual machine
> > +including an ism device and with the guests getting the shared memory by the ism
> > +devices.
> > +
> > +An ism device can communicate with multiple peers simultaneously. This
> > +communication can be dynamically started and ended.
> > +
> > +All the devices with the ability to communicate with each other form a
> > +communication domain. Two devices from different communication domains can't
> > +communicate.
> > +
> > +The device memory of the ism device is divided into multiple chunks with the
> > +same size. Every ism region contains multiple chunks. When communicating between
> > +two devices, an ism region is used as a shared memory.
> > +
> > +The ism region is the basis for communication between ism devices.
> > +
> > +The process of communication between two drivers is that one driver allocates an
> > +ism region and obtains a token. Then the peer uses this token to attach the same
> > +ism region, the two drivers realize the memory(ism region) sharing. The driver
> > +can also notify peer by kick notify-address the ism region has been updated.
> > +
> > +An ism region can be referred by its \field{token}, or the \field{offset}.
> > +The \field{offset} is the offset of the first chunk inside the ism region
> > +starting from the device memory head.
> > +
> > +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> > + 44
> > +
> > +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> > +\begin{description}
> > +\item[0] controlq
> > +\item[1] eventq
> > +\end{description}
> > +
> > +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> > +\begin{description}
> > +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> > + don't need to provide memory for alloc/attach operation.
> > +
> > +\end{description}
> > +
> > +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_ism_config {
> > + le128 cdid;
> > + le64 devid;
> > + le64 chunk_size;
> > + le64 notify_size;
> > +};
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > + \item[\field{cdid}] This is used to identify the communication domain. Only
> > + ism devices with the same \field{cdid} can communicate. A \field{cdid}
> > + is world-wide unique in a sense that there not be another communication
> > + domain with the same \field{cdid}.
> > +
> > + \item[\field{devid}] This is used to identify the ism device in the single
> > + communication domain.
> > +
> > + \item[\field{chunk_size}] This is the size of the ism chunk. The device
> > + memory is divided into multiple chunks. Every ism chunk has the same
> > + size.
> > +
> > + \item[\field{notify_size}] This is the size of the ism notify-address. The
> > + notify-address is used to notify the device that the content of the
> > + ism region has been updated.
> > +
> > +\end{description}
> > +
> > +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> > +
> > +The device MUST ensure that the \field{cdid} of the device on the same
> > +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> > +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> > +
> > +In the single communication domain, the device MUST ensure that the \field{devid}
> > +is unique.
> > +
> > +\field{chunk_size} MUST be a power of two.
> > +
> > +\subsection{Physical Memory}\label{sec:Device Types / ISM Device / Event}
> > +
> > +The ism region is on the device memory. It is a physical memory for driver, but
> > +it is not for device. The device has the ability to modify the physical memory
> > +corresponding to the device memory.
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, physical memory comes from driver.
> > +The driver pass its memory to the device when alloc a new ism region. The
> > +device map this physical memory into an ism region. If VIRTIO_ISM_F_DEV_MEM
> > +is negotiated, physical memory is provided by the device.
> > +
> > +For attach operation, if VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver also
> > +provides memory to device. Under normal circumstances, this memory will not be
> > +used. This is to limit driver to take up too much memory based on attach
> > +operation.
> > +
> > +\subsection{Event}\label{sec:Device Types / ISM Device / Event}
> > +
> > +The ism device supports event notification of the ism region. When a device
> > +kick/attach/detach a region, other ism region referrers will receive related
> > +events.
> > +
> > +A buffer received from eventq can contain multiple event structures.
> > +
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> > +#define VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> > +#define VIRTIO_ISM_EVENT_DETACH (1 << 2)
> > +#define VIRTIO_ISM_EVENT_BUFFER_FREE (1 << 3) // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > + \item[VIRTIO_ISM_EVENT_UPDATE] The driver kick the notify-address to notify
> > + other peers of the update event of the ism region content.
> > +
> > + \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> > + \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> > + \item[VIRTIO_ISM_EVENT_BUFFER_FREE] The buffer allocated by this driver has been free.
> > +\end{description}
> > +
> > +The event structures:
> > +\begin{lstlisting}
> > +struct virtio_ism_event_update {
> > + le64 ev_type;
> > + le64 offset;
> > + le64 devid;
> > +};
> > +
> > +struct virtio_ism_event_attach_detach {
> > + le64 ev_type;
> > + le64 offset;
> > + le64 devid;
> > + le64 peers;
> > +};
> > +
> > +struct virtio_ism_event_buffer_free {
> > + le64 ev_type;
> > + void *buffer;
> > +};
> > +
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +\item[\field{ev_type}] The type of event, the driver can get the size of the
> > + structure based on this.
> > +
> > + If \field{ev_type} is VIRTIO_ISM_EVENT_UPDATE, the corresponding structure is struct virtio_ism_event_update.
> > + If \field{ev_type} is VIRTIO_ISM_EVENT_ATTACH or VIRTIO_ISM_EVENT_DETACH, the corresponding structure struct virtio_ism_event_attach_detach.
> > + If \field{ev_type} is VIRTIO_ISM_EVENT_BUFFER_FREE, the corresponding structure is struct virtio_ism_event_buffer_free.
> > +
> > +\item[\field{offset}] The offset of ism regions with the event.
> > +
> > +\item[\field{devid}] \field{devid} of the device that generated events.
> > +\item[\field{peers}] The number of the peers referring to this ism region. (does not include the
> > + device that receiving this event)
> > +
> > +\item[\field{buffer}] The buffer passed to device when alloc/attach a new ism region.
> > +
> > +\end{description}
> > +
> > +\subsection{Permissions}\label{sec:Device Types / ISM Device / Permission}
> > +
> > +The permissions of an ism region determine whether this ism region can be
> > +attached and the read and write permissions after attach. The driver can set the
> > +default permissions for all devices, or set the dedicated permissions for a
> > +certain device. If there are dedicated permissions for a device, the default
> > +permissions are invalid for this device.
> > +
> > +For example, the driver can set the default permission of an ism region as
> > +read-only, and set the dedicated permissions as writable for a device with devid
> > +0xff. Then all devices except 0xff in such a communication domain can attach
> > +this ism region and only have read-only permission. And the device 0xff can
> > +write to the ism region after attaching.
> > +
> > +When a driver has the management permission of the ism region, then it can
> > +modify the permissions of this ism region. By default, only the device that
> > +allocated the ism region has this permission.
> > +
> > +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> > +
> > +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> > +
> > +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> > +during reset. \field{devid} MUST NOT be 0.
> > +
> > +The device exposes memory to the driver based on Shared Memory Regions
> > +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> > +However, it does not need to allocate physical memory during initialization.
> > +
> > +The \field{shmid} of a region MUST be one of the following
> > +\begin{lstlisting}
> > +enum virtio_ism_shm_id {
> > + VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> > + VIRTIO_ISM_SHM_ID_REGIONS = 1,
> > + VIRTIO_ISM_SHM_ID_NOTIFY = 2,
> > +};
> > +\end{lstlisting}
> > +
> > +The device memory area with VIRTIO_ISM_SHM_ID_REGIONS is used to provide ism regions.
> > +
> > +The device MUST also provide a memory area with VIRTIO_ISM_SHM_ID_NOTIFY to the
> > +driver. This memory area is used to notify the device with update event, and
> > +each ism region MUST have a corresponding notify-address inside this area, and
> > +the size of the notify-address is \field{notify_size}.
> > +
> > +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> > +
> > +The driver MUST NOT access any chunk before it is referred by one ism region.
> > +
> > +\subsection{Control Virtqueue}\label{sec:Device Types / ISM Device / Control Virtqueue}
> > +
> > +The driver uses the control virtqueue send commands to implement operations on
> > +the ism region and some global configurations.
> > +
> > +All commands are of the following form:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl {
> > + u8 class;
> > + u8 command;
> > + u8 command_specific_data[];
> > + u8 ack;
> > + u8 command_specific_data_reply[];
> > +};
> > +
> > +/* ack values */
> > +#define VIRTIO_ISM_OK 0
> > +#define VIRTIO_ISM_ERR 255
> > +
> > +#define VIRTIO_ISM_ENOENT 2
> > +#define VIRTIO_ISM_E2BIG 7
> > +#define VIRTIO_ISM_ENOMEM 12
> > +#define VIRTIO_ISM_ENOSPEC 28
> > +
> > +#define VIRTIO_ISM_PERM_EATTACH 100
> > +#define VIRTIO_ISM_PERM_EREAD 101
> > +#define VIRTIO_ISM_PERM_EWRITE 102
> > +\end{lstlisting}
> > +
> > +The \field{class}, \field{command} and command-specific-data are set by the
> > +driver, and the device sets the \field{ack} byte and optionally
> > +\field{command-specific-data-reply}.
> > +
> > +\subsection{Device Operation}\label{sec:Device Types / ISM Driver / Device Operation}
> > +
> > +\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +Based on controlq, the driver can request to allocate an ism region.
> > +
> > +The reply from the device will carry a token, which can be passed
> > +to other driver for attaching to this ism region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_alloc {
> > + le64 size;
> > + void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_alloc_reply {
> > + le64 token;
> > + le64 num;
> > + le64 chunks[];
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_ALLOC 0
> > + #define VIRTIO_ISM_CTRL_ALLOC_REGION 0
> > +\end{lstlisting}
> > +
> > +
> > +\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +For the command VIRTIO_ISM_CTRL_ALLOC_REGION, the device MUST first look for the
> > +free chunks from the device memory. And the physical memory is mapped to these
> > +chunks in order. If VIRTIO_ISM_F_DEV_MEM is negotiated, then the device MUST
> > +provide physical memory by itself. Otherwise, the memory provided by the driver
> > +will be used. If there is no enough free chunk, the device MUST set \field{ack}
> > +to VIRTIO_ISM_NOSPEC. If new physical memory cannot be allocated by the device,
> > +the device MUST set \field{ack} to VIRTIO_ISM_NOMEM.
> > +
> > +The device sets \field{ack} to VIRTIO_ISM_OK after successfully allocating the
> > +ism region. A new token MUST be dynamically created for this ism region
> > +simultaneously. The token MUST be unique inside this communication domain. And
> > +the device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> > +is the number of the chunks.
> > +
> > +The device MUST clear the memory of this ism region before committing to the
> > +driver.
> > +
> > +If \field{size} is smaller than \field{chunk_size}, the ism region also
> > +occupies one chunk.
> > +
> > +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> > +to VIRTIO_ISM_E2BIG.
> > +
> > +\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> > +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> > +region that we want to allocate.
> > +
> > +\subsubsection{Query ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Query ISM Region}
> > +Based on controlq, the driver can query the information of an ism region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_query {
> > + le64 token;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_query_reply {
> > + le64 size;
> > + le64 permissions;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_QUERY 1
> > + #define VIRTIO_ISM_CTRL_QUERY_INFO 0
> > +\end{lstlisting}
> > +
> > +\devicenormative{\subparagraph}{Query ISM Region}{Device Types / ISM Device / Device Operation / Query ISM Region}
> > +
> > +If the ism region specified by \field{token} does not exist, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT. Otherwise, the device MUST fill the
> > +\field{size} with the size of the ism region. The device MUST get the
> > +permissions of this ism region for the device-self, then fill in the
> > +\field{permissions}.
> > +
> > +\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +Based on controlq, the driver can request to attach an ism region with a
> > +specified token.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_attach {
> > + le64 token;
> > + le64 rw_perm;
> > + void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_attach_reply {
> > + le64 size;
> > + le64 num;
> > + le64 chunks[];
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_ATTACH 2
> > + #define VIRTIO_ISM_CTRL_ATTACH_REGION 0
> > +\end{lstlisting}
> > +
> > +\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +If the ism region specified by \field{token} does not exist, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT.
> > +
> > +The device MUST first look for the free chunks from the device memory. And the
> > +physical memory of the ism region is mapped to these chunks in order. If there is
> > +no enough free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
> > +The device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> > +is the number of the chunks. \field{size} is the size of this ism region.
> > +
> > +The device MUST set \field{ack} to VIRTIO_ISM_OK after successfully finding and
> > +assigning the physical memory of this ism region into the device memory.
> > +
> > +If the device does not have the VIRTIO_ISM_PERM_ATTACH permission for this ism
> > +region, the device MUST set \field{ack} to VIRTIO_ISM_PERM_EATTACH. If
> > +\field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not have the
> > +VIRTIO_ISM_PERM_READ permission for this region, the device MUST set \field{ack}
> > +to VIRTIO_ISM_PERM_EREAD. If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but
> > +the device does not have the VIRTIO_ISM_PERM_WRITE permission for this region,
> > +the device MUST set \field{ack} to VIRTIO_ISM_PERM_EWRITE.
> > +
> > +The device MUST set the read and write permissions of the physical memory inside
> > +the device memory based on \field{rw_perm}.
> > +
> > +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> > +to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
> > +
> > +The device MUST generate a VIRTIO_ISM_EVENT_ATTACH event to other peers.
> > +
> > +\drivernormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> > +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> > +region that we want to allocate.
> > +
> > +If \field {size} is less than \field{num} times \field {chunk_size}, then only
> > +header part of the last chunk is effective.
> > +
> > +\subsubsection{Detach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +Based on controlq, the device can release references to the ism region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_detach {
> > + le64 token;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_DETACH 3
> > + #define VIRTIO_ISM_CTRL_DETACH_REGION 0
> > +\end{lstlisting}
> > +
> > +\devicenormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +
> > +If the ism region specified by \field{token} is not attached, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT.
> > +
> > +The device MUST generate a VIRTIO_ISM_EVENT_DETACH event to other peers.
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated and no one is referred to this ism
> > +region, the device who allocated this ism region MUST generate a
> > +VIRTIO_ISM_EVENT_BUFFER_FREE event to the driver.
> > +
> > +\drivernormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated and the driver got the ism region by
> > +attach operation, then the buffer is free after the detach operation.
> > +
> > +\subsubsection{Grant ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Grant ISM Region}
> > +Based on controlq, the driver can set the permissions for each ism
> > +region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_grant_default {
> > + le64 token;
> > + le64 permissions;
> > +};
> > +
> > +struct virtio_ism_ctrl_grant_dev {
> > + le64 token;
> > + le64 permissions;
> > + le64 peer_devid;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_GRANT 4
> > + #define VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT 0
> > + #define VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE 1
> > +\end{lstlisting}
> > +
> > +The permissions:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_PERM_READ (1 << 0)
> > +#define VIRTIO_ISM_PERM_WRITE (1 << 1)
> > +
> > +#define VIRTIO_ISM_PERM_ATTACH (1 << 2)
> > +
> > +#define VIRTIO_ISM_PERM_MANAGE (1 << 3)
> > +#define VIRTIO_ISM_PERM_CLEAN_DEFAULT (1 << 4)
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +\item[VIRTIO_ISM_PERM_READ] read permission
> > +\item[VIRTIO_ISM_PERM_WRITE] write permission
> > +\item[VIRTIO_ISM_PERM_ATTACH] attach permission
> > +
> > +\item[VIRTIO_ISM_PERM_MANAGE] Management permission, the device with this
> > + permission can modify the permission of this ism region. By default, only
> > + the device that allocated the region has this permission.
> > +
> > +\item[VIRTIO_ISM_PERM_DENY_DEFAULT] Clean all permissions of default, just used
> > + with VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE.
> > +
> > +\end{description}
> > +
> > +VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT corresponds to virtio_ism_ctrl_grant_default;
> > +VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE corresponds to virtio_ism_ctrl_grant_dev;
> > +
> > +Permission control is divided into two categories, one is the dedicated
> > +permissions for a certain device, and the other is the default permissions for
> > +all device.
> > +
> > +\devicenormative{\subparagraph}{Grant ISM Region}{Device Types / ISM Device / Device Operation / Grant ISM Region}
> > +
> > +If the ism region specified by \field{token} is not attached, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT.
> > +
> > +The default permissions of the newly allocated ism region MUST be (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH).
> > +The device that allocated the ism region MUST have the permissions (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH | VIRTIO_ISM_PERM_MANAGE) for this region.
> > +
> > +For a certain device, if there is dedicated permissions, the default
> > +permissions are illegal to this device.
> > +
> > +\subsubsection{Inform Update Event IRQ Vector}\label{sec:Device Types / ISM Device / Device Operation / Inform Update Event IRQ Vector}
> > +
> > +For the ism region update event, the driver can bind an interrupt to the ism
> > +region. Then this ism region's update event will no longer be passed through
> > +eventq, but the interrupt will be directly triggered.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_irq_vector {
> > + le64 token;
> > + le64 vector;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_EVENT_VECTOR 5
> > + #define VIRTIO_ISM_CTRL_EVENT_VECTOR_SET 0
> > +\end{lstlisting}
> > +
> > +
> > +\devicenormative{\subparagraph}{Inform Event IRQ Vector}{Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> > +
> > +The device MUST record the relationship between the ism region and the vector
> > +notified by the driver, and notify the update event of this region to driver
> > +by the corresponding vector. And the device MUST NOT use eventq to send the
> > +update event of this ism region.
> > +
> > +
> > --
> > 2.32.0.3.g01195cf9f
> >
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
2023-02-09 3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
2023-02-09 3:35 ` [virtio-comment] " Parav Pandit
2023-03-07 11:15 ` [virtio-dev] " Xuan Zhuo
@ 2023-03-23 14:46 ` Halil Pasic
2023-03-24 3:08 ` Xuan Zhuo
2023-03-24 4:03 ` Wen Gu
2023-03-24 4:51 ` Parav Pandit
2023-04-26 7:41 ` [virtio-dev] " Xuan Zhuo
4 siblings, 2 replies; 20+ messages in thread
From: Halil Pasic @ 2023-03-23 14:46 UTC (permalink / raw)
To: Xuan Zhuo
Cc: virtio-comment, hans, herongguang, zmlcc, dust.li, tonylu,
zhenzao, helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka,
wintera, kgraul, wenjia, jaka, hca, twinkler, raspl, virtio-dev,
Halil Pasic
On Thu, 9 Feb 2023 11:30:56 +0800
Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_ism_config {
> + le128 cdid;
> + le64 devid;
> + le64 chunk_size;
> + le64 notify_size;
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> + \item[\field{cdid}] This is used to identify the communication domain. Only
> + ism devices with the same \field{cdid} can communicate. A \field{cdid}
> + is world-wide unique in a sense that there not be another communication
> + domain with the same \field{cdid}.
> +
> + \item[\field{devid}] This is used to identify the ism device in the single
> + communication domain.
> +
> + \item[\field{chunk_size}] This is the size of the ism chunk. The device
> + memory is divided into multiple chunks. Every ism chunk has the same
> + size.
> +
> + \item[\field{notify_size}] This is the size of the ism notify-address. The
> + notify-address is used to notify the device that the content of the
> + ism region has been updated.
> +
> +\end{description}
> +
> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> +
> +The device MUST ensure that the \field{cdid} of the device on the same
> +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> +
> +In the single communication domain, the device MUST ensure that the \field{devid}
> +is unique.
> +
Hi Xuan Zhou!
My understanding is the following: you goal for virtio-ism is that
it should be suitable for usage with SMC-D (much like the original ISM
device). Is that right?
If yes, then let us have a look at the following example. We have
two guests sitting on the same hypervisor: A and B. Both of the
guests have an rdma capable interface, a virtio-ism device and
traditional ISM device. So they could talk over SMC-R, SMC-D via
virtio and SMC-D via (PCI-)ISM. How would the CLC Proposal message
look like?
Where I am going with this? Either you need a novel way to
discover peers (probably before the usual way is employed)
or (probably preferably) you need to make this part of the
CLC stuff. What are your ideas with regards to this? How is
it supposed to work?
To get back to the things proposed here: the cdid is IMHO
a nice thing, and is functionally corresponding to the
(S)EID. But it is 16 byte wide, and I have no idea how
is it supposed to be used in the CLC handshake.
If this is really supposed to work with SMC and not just take
inspiration from it, I would like some insight from our
SMC experts (they are already on copy).
Regards,
Halil
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
2023-03-23 14:46 ` [virtio-dev] Re: [virtio-comment] " Halil Pasic
@ 2023-03-24 3:08 ` Xuan Zhuo
2023-03-24 4:03 ` Wen Gu
1 sibling, 0 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-03-24 3:08 UTC (permalink / raw)
To: Halil Pasic
Cc: virtio-comment, hans, herongguang, zmlcc, dust.li, tonylu,
zhenzao, helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka,
wintera, kgraul, wenjia, jaka, hca, twinkler, raspl, virtio-dev,
Halil Pasic, Wen Gu
On Thu, 23 Mar 2023 15:46:56 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> On Thu, 9 Feb 2023 11:30:56 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> > +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_ism_config {
> > + le128 cdid;
> > + le64 devid;
> > + le64 chunk_size;
> > + le64 notify_size;
> > +};
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > + \item[\field{cdid}] This is used to identify the communication domain. Only
> > + ism devices with the same \field{cdid} can communicate. A \field{cdid}
> > + is world-wide unique in a sense that there not be another communication
> > + domain with the same \field{cdid}.
> > +
> > + \item[\field{devid}] This is used to identify the ism device in the single
> > + communication domain.
> > +
> > + \item[\field{chunk_size}] This is the size of the ism chunk. The device
> > + memory is divided into multiple chunks. Every ism chunk has the same
> > + size.
> > +
> > + \item[\field{notify_size}] This is the size of the ism notify-address. The
> > + notify-address is used to notify the device that the content of the
> > + ism region has been updated.
> > +
> > +\end{description}
> > +
> > +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> > +
> > +The device MUST ensure that the \field{cdid} of the device on the same
> > +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> > +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> > +
> > +In the single communication domain, the device MUST ensure that the \field{devid}
> > +is unique.
> > +
>
>
> Hi Xuan Zhou!
>
> My understanding is the following: you goal for virtio-ism is that
> it should be suitable for usage with SMC-D (much like the original ISM
> device). Is that right?
Yes, in our interior, it is indeed working with SMC. But on the other
hand, I also hope to provide a way to share memory between VMs. You can see that
our POC provides the examples that users can use the sharing memory directly.
>
> If yes, then let us have a look at the following example. We have
> two guests sitting on the same hypervisor: A and B. Both of the
> guests have an rdma capable interface, a virtio-ism device and
> traditional ISM device. So they could talk over SMC-R, SMC-D via
> virtio and SMC-D via (PCI-)ISM. How would the CLC Proposal message
> look like?
Not only these, but also SMC-Loopback. I know Wen is promoting this aspect. I
think this is not a problem of virtio-ism. SMC may have to be renovated
together.
>
> Where I am going with this? Either you need a novel way to
> discover peers (probably before the usual way is employed)
> or (probably preferably) you need to make this part of the
> CLC stuff. What are your ideas with regards to this? How is
> it supposed to work?
>
> To get back to the things proposed here: the cdid is IMHO
> a nice thing, and is functionally corresponding to the
> (S)EID. But it is 16 byte wide, and I have no idea how
> is it supposed to be used in the CLC handshake.
This is indeed the biggest problem for Virtio-ISM + SMC, but I think Wen can
solve it. Of course, other SMC experts are discussed together, how to compatible
or support virtio-exm and smc-loopback.
>
> If this is really supposed to work with SMC and not just take
> inspiration from it,
As I said above, we are not just working with SMC. I also hope that
users can use these shared memory directly. So this is also an independent
memory sharing mechanism without smc.
> I would like some insight from our
> SMC experts (they are already on copy).
Yes. I also want to hear more reply.
Thanks.
>
> Regards,
> Halil
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
2023-03-23 14:46 ` [virtio-dev] Re: [virtio-comment] " Halil Pasic
2023-03-24 3:08 ` Xuan Zhuo
@ 2023-03-24 4:03 ` Wen Gu
[not found] ` <7a9ebec0-5e87-b80f-4f2c-c4db7ae4fe84@linux.ibm.com>
1 sibling, 1 reply; 20+ messages in thread
From: Wen Gu @ 2023-03-24 4:03 UTC (permalink / raw)
To: Halil Pasic, Xuan Zhuo
Cc: virtio-comment, hans, herongguang, zmlcc, dust.li, tonylu,
zhenzao, helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka,
wintera, kgraul, wenjia, jaka, hca, twinkler, raspl, virtio-dev,
guwen
On 2023/3/23 22:46, Halil Pasic wrote:
> On Thu, 9 Feb 2023 11:30:56 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
>> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
>> +
>> +\begin{lstlisting}
>> +struct virtio_ism_config {
>> + le128 cdid;
>> + le64 devid;
>> + le64 chunk_size;
>> + le64 notify_size;
>> +};
>> +\end{lstlisting}
>> +
>> +\begin{description}
>> + \item[\field{cdid}] This is used to identify the communication domain. Only
>> + ism devices with the same \field{cdid} can communicate. A \field{cdid}
>> + is world-wide unique in a sense that there not be another communication
>> + domain with the same \field{cdid}.
>> +
>> + \item[\field{devid}] This is used to identify the ism device in the single
>> + communication domain.
>> +
>> + \item[\field{chunk_size}] This is the size of the ism chunk. The device
>> + memory is divided into multiple chunks. Every ism chunk has the same
>> + size.
>> +
>> + \item[\field{notify_size}] This is the size of the ism notify-address. The
>> + notify-address is used to notify the device that the content of the
>> + ism region has been updated.
>> +
>> +\end{description}
>> +
>> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
>> +
>> +The device MUST ensure that the \field{cdid} of the device on the same
>> +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
>> +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
>> +
>> +In the single communication domain, the device MUST ensure that the \field{devid}
>> +is unique.
>> +
>
>
> Hi Xuan Zhou!
>
> My understanding is the following: you goal for virtio-ism is that
> it should be suitable for usage with SMC-D (much like the original ISM
> device). Is that right?
>
> If yes, then let us have a look at the following example. We have
> two guests sitting on the same hypervisor: A and B. Both of the
> guests have an rdma capable interface, a virtio-ism device and
> traditional ISM device. So they could talk over SMC-R, SMC-D via
> virtio and SMC-D via (PCI-)ISM. How would the CLC Proposal message
> look like?
>
Hi Halil!
Now SMCv2 protocol allows CLC Proposal message to carry 1 SMC-R
GID, 1 SMC-Rv2 GID, and up to 8 SMC-D [CHID,GID] at the same time
for server to choose from.
And after receiving CLC Proposal message, server will match the local
devices with devices provided in CLC Proposal messages in this order:
SMC-Dv2 (ISMv2) device -> SMC-Dv1 (ISMv1) device -> SMC-Rv2 (RoCEv2) device -> SMC-Rv1 (RoCE) device
If there are multiple SMC-Dv2 devices provided in CLC Proposal messages,
such as virtio-ism and ISM in this example, the priority depends on their
order in the slot. (we treat virtio-ism as a kind of SMC-Dv2 device)
> Where I am going with this? Either you need a novel way to
> discover peers (probably before the usual way is employed)
> or (probably preferably) you need to make this part of the
> CLC stuff. What are your ideas with regards to this? How is
> it supposed to work?
>
The discussion about
1) how to distinguish SMC-D device type (loopback/virtio-ism/ISM)
2) how to make sure peer virtio-ism can talk to
3) how to define priority or preference of SMC-D devices
are on-going in the mail thread of meeting note: "SMC interlock | Alibaba - IBM | 2023-03-20"
A quich summary here:
1) we propose to use reserved CHID to indicate what kind of device it is.
such as 0xFFFF for loopback, 0xFFFE for virtio-ism.
2) whether two peer's SMC-D device can talk depends on:
a. whether VCHID is the same;
b. whether smc_ism_cantalk() return true;
c. whether SEID is the same;
In step b, smc_ism_cantalk() will simply call driver interface of
ops->query_remote_gid() to tell. So as you can see, GID is the key
point to check whether peers' SMC-D device can talk.
And cdid here in virtio-ism determines whether talk is possible.
Same cdid means two virtio-ism can talk to each other.
So the problem may be how to covert 128-bits cdid to 64-bits GID
used by SMC-D and avoid GID collision ? I have no elegant idea
only a workaround right now ... The discussion on GID collision
is now on-going in mail thread of meeting note: "SMC interlock | Alibaba - IBM | 2023-03-20".
If you have any comments about this, very welcome to discuss there.
3) Priority or preference of SMC-D devices are decided by slot
order now, or explicitly by CHID if we all agree to use reserved
CHID for extension SMC-D device. (haven't been comfirmed now)
> To get back to the things proposed here: the cdid is IMHO
> a nice thing, and is functionally corresponding to the
> (S)EID. But it is 16 byte wide, and I have no idea how
> is it supposed to be used in the CLC handshake.
>
CLC handshake carry one SEID for all the SMC-D device. Considering
coexistence with ISM, I am not sure whether we can change or increase
the SEID.. cc Alexandra
Thanks!
Wen Gu
> If this is really supposed to work with SMC and not just take
> inspiration from it, I would like some insight from our
> SMC experts (they are already on copy).
>
> Regards,
> Halil
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
2023-02-09 3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
` (2 preceding siblings ...)
2023-03-23 14:46 ` [virtio-dev] Re: [virtio-comment] " Halil Pasic
@ 2023-03-24 4:51 ` Parav Pandit
2023-03-24 6:35 ` Xuan Zhuo
2023-03-24 9:10 ` Michael S. Tsirkin
2023-04-26 7:41 ` [virtio-dev] " Xuan Zhuo
4 siblings, 2 replies; 20+ messages in thread
From: Parav Pandit @ 2023-03-24 4:51 UTC (permalink / raw)
To: Xuan Zhuo, virtio-comment
Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic
On 2/8/2023 10:30 PM, Xuan Zhuo wrote:
> An ISM(Internal Shared Memory) device provides the ability to access memory
> shared between multiple devices. This allows low-overhead communication in
> presence of such memory. For example, memory can be shared with guests of
> multiple virtual machines running on the same host, with each virtual machine
I am still learning this new memory sharing beast, so mostly cosmetics
comments below.
instead of guest of virtual machines is confusing.
It can be just written as - can be shared between virtual machines
running on same host ..
> including an ism device and with the guests getting the shared memory by the ism
> devices.
>
> An ism device can communicate with multiple peers simultaneously. This
> communication can be dynamically started and ended.
>
multiple peers mean, multiple peer ism device?
Or you meant multiple vms?
> |-------------------------------------------------------------------------------------------------------------|
> | |------------------------------------------------| |------------------------------------------------| |
> | | Guest | | Guest | |
> | | | | | |
instead of guest, naming it VM makes it consistent with the spec and
description.
> | | ---------------- | | ---------------- | |
> | | | driver | [M1] [M2] [M3] | | | driver | [M2] [M3] | |
> | | ---------------- | | | | | ---------------- | | | |
> | | |cq| |map |map |map | | |cq| |map |map | |
> | | | | | | | | | | | | | | |
> | | | | ------------------- | | | | -------------------- | |
> | |----|--|----------------| device memory |-----| |----|--|----------------| device memory |----| |
> | | | | ------------------- | | | | -------------------- | |
> | | | | | | | |
> | | | | | | | |
> | | Qemu | | | Qemu | | |
> | |--------------------------------+---------------| |-------------------------------+----------------| |
> | | | |
> | | | |
> | |------------------------------+------------------------| |
> | | |
> | | |
> | -------------------------- |
> | | M1 | | M2 | | M3 | |
> | -------------------------- |
> | |
> | HOST |
> ---------------------------------------------------------------------------------------------------------------
>
You might want to show the event q as well next to control q as its so
fundamental for this whole operation.
> ---
> conformance.tex | 26 +++
> content.tex | 1 +
> virtio-ism.tex | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 600 insertions(+)
> create mode 100644 virtio-ism.tex
>
> diff --git a/conformance.tex b/conformance.tex
> index c3c1d3e..0a1456a 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
> \end{itemize}
>
> +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> +
> +A ISM driver MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\end{itemize}
> +
> \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
>
> A device MUST conform to the following normative statements:
> @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
> \end{itemize}
>
> +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> +
> +A ISM device MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
Though using control vq to configure irq is good idea, I guess
queue_msix_vector can be used. It needs some more text that
queue_msix_vector will not be used etc.
if the event queue create interface exists over ctrl vq than things tied
up well. Else I think transport specific existing scheme more aligns to
the spec.
> +\end{itemize}
> +
> \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
> A conformant implementation MUST be either transitional or
> non-transitional, see \ref{intro:Legacy
> diff --git a/content.tex b/content.tex
> index 96f4723..fe02aec 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
> \input{virtio-scmi.tex}
> \input{virtio-gpio.tex}
> \input{virtio-pmem.tex}
> +\input{virtio-ism.tex}
>
> \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>
> diff --git a/virtio-ism.tex b/virtio-ism.tex
> new file mode 100644
> index 0000000..a1720d8
> --- /dev/null
> +++ b/virtio-ism.tex
> @@ -0,0 +1,573 @@
> +\section{ISM Device}\label{sec:Device Types / ISM Device}
> +
> +\begin{lstlisting}
> +|-------------------------------------------------------------------------------------------------------------|
> +| |------------------------------------------------| |------------------------------------------------| |
> +| | VM [M1] [M2] [M3] | | VM [M2] [M3] | |
> +| | | | | | | | | | |
> +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> +| | | driver | | | | | | | driver | | | | |
> +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> +| | |cq| |map |map |map | | |cq| |map |map | |
> +| | | | | | | | | | | | | | |
> +| | | | ------------------- | | | | ------------------- | |
> +| |----|--|----------------| device memory |-----| |----|--|----------------| device memory |-----| |
> +| | | | ------------------- | | | | ------------------- | |
> +| | | | | | | |
> +| | | | | | | |
> +| | | | | | | |
> +| |--------------------------------+---------------| |--------------------------------+---------------| |
> +| | | |
> +| | | |
> +| |------------------------------+------------------------| |
> +| | |
Above line is misaligned.
> +| | |
> +| -------------------------- |
> +| | M1 | | M2 | | M3 | |
> +| -------------------------- |
> +| |
> +| |
> +|-------------------------------------------------------------------------------------------------------------|
> +\end{lstlisting}
> +
> +An ISM(Internal Shared Memory) device provides the ability to access memory
> +shared between multiple devices. This allows low-overhead communication in
> +presence of such memory. For example, memory can be shared with guests of
> +multiple virtual machines running on the same host, with each virtual machine
> +including an ism device and with the guests getting the shared memory by the ism
> +devices.
> +
> +An ism device can communicate with multiple peers simultaneously. This
> +communication can be dynamically started and ended.
> +
> +All the devices with the ability to communicate with each other form a
> +communication domain. Two devices from different communication domains can't
> +communicate.
> +
> +The device memory of the ism device is divided into multiple chunks with the
> +same size. Every ism region contains multiple chunks. When communicating between
> +two devices, an ism region is used as a shared memory.
> +
> +The ism region is the basis for communication between ism devices.
> +
> +The process of communication between two drivers is that one driver allocates an
> +ism region and obtains a token. Then the peer uses this token to attach the same
The peer driver will make it more clear.
> +ism region, the two drivers realize the memory(ism region) sharing. The driver
> +can also notify peer by kick notify-address the ism region has been updated.
> +
> +An ism region can be referred by its \field{token}, or the \field{offset}.
> +The \field{offset} is the offset of the first chunk inside the ism region
> +starting from the device memory head.
> +
> +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> + 44
> +
> +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> +\begin{description}
> +\item[0] controlq
> +\item[1] eventq
> +\end{description}
> +
> +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> +\begin{description}
> +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> + don't need to provide memory for alloc/attach operation.
> +
> +\end{description}
> +
> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_ism_config {
> + le128 cdid;
le128 data type is not present in the spec.
Better to define uuid data structure and refer here.
struct uuid {..}
struct uuid cdid;
...
> + le64 devid;
> + le64 chunk_size;
> + le64 notify_size;
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> + \item[\field{cdid}] This is used to identify the communication domain. Only
> + ism devices with the same \field{cdid} can communicate. A \field{cdid}
> + is world-wide unique in a sense that there not be another communication
> + domain with the same \field{cdid}.
> +
> + \item[\field{devid}] This is used to identify the ism device in the single
> + communication domain.
> +
> + \item[\field{chunk_size}] This is the size of the ism chunk. The device
> + memory is divided into multiple chunks. Every ism chunk has the same
> + size.
> +
> + \item[\field{notify_size}] This is the size of the ism notify-address. The
> + notify-address is used to notify the device that the content of the
> + ism region has been updated.
> +
> +\end{description}
> +
> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> +
> +The device MUST ensure that the \field{cdid} of the device on the same
> +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> +
> +In the single communication domain, the device MUST ensure that the \field{devid}
> +is unique.
> +
> +\field{chunk_size} MUST be a power of two.
> +
> +\subsection{Physical Memory}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism region is on the device memory. It is a physical memory for driver, but
> +it is not for device.
If it is not physical memory, than what is it? May be you want to write
that.
> The device has the ability to modify the physical memory
> +corresponding to the device memory.
> +
Above sentence as continuation to physical memory implies that its RW
for device and RO for driver.
I likely misunderstood.
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, physical memory comes from driver.
> +The driver pass its memory to the device when alloc a new ism region. The
The driver passes its memory to the device when allocating a new ism region.
> +device map this physical memory into an ism region. If VIRTIO_ISM_F_DEV_MEM
> +is negotiated, physical memory is provided by the device.
> +
> +For attach operation, if VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver also
> +provides memory to device. Under normal circumstances, this memory will not be
> +used. This is to limit driver to take up too much memory based on attach
> +operation.
> +
> +\subsection{Event}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism device supports event notification of the ism region. When a device
> +kick/attach/detach a region, other ism region referrers will receive related
> +events.
> +
> +A buffer received from eventq can contain multiple event structures.
> +
> +\begin{lstlisting}
> +#define VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> +#define VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> +#define VIRTIO_ISM_EVENT_DETACH (1 << 2)
> +#define VIRTIO_ISM_EVENT_BUFFER_FREE (1 << 3) // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +\end{lstlisting}
> +
> +\begin{description}
> + \item[VIRTIO_ISM_EVENT_UPDATE] The driver kick the notify-address to notify
> + other peers of the update event of the ism region content.
> +
> + \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> + \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> + \item[VIRTIO_ISM_EVENT_BUFFER_FREE] The buffer allocated by this driver has been free.
> +\end{description}
> +
> +The event structures:
> +\begin{lstlisting}
> +struct virtio_ism_event_update {
> + le64 ev_type;
> + le64 offset;
> + le64 devid;
> +};
> +
> +struct virtio_ism_event_attach_detach {
> + le64 ev_type;
> + le64 offset;
> + le64 devid;
> + le64 peers;
> +};
> +
> +struct virtio_ism_event_buffer_free {
> + le64 ev_type;
> + void *buffer;
Is this a virtual address being shared between driver and device and in
the cvq command?
If so, do you need any attachment to process id or such abstract handle
so that some day user process can also access it?
or the use case is to only consume by the driver, hence only one VA?
> +};
> + > +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{ev_type}] The type of event, the driver can get the size of the
> + structure based on this.
> +
> + If \field{ev_type} is VIRTIO_ISM_EVENT_UPDATE, the corresponding structure is struct virtio_ism_event_update.
> + If \field{ev_type} is VIRTIO_ISM_EVENT_ATTACH or VIRTIO_ISM_EVENT_DETACH, the corresponding structure struct virtio_ism_event_attach_detach.
> + If \field{ev_type} is VIRTIO_ISM_EVENT_BUFFER_FREE, the corresponding structure is struct virtio_ism_event_buffer_free.
> +
> +\item[\field{offset}] The offset of ism regions with the event.
> +
> +\item[\field{devid}] \field{devid} of the device that generated events.
> +\item[\field{peers}] The number of the peers referring to this ism region. (does not include the
> + device that receiving this event)
> +
> +\item[\field{buffer}] The buffer passed to device when alloc/attach a new ism region.
> +
> +\end{description}
> +
> +\subsection{Permissions}\label{sec:Device Types / ISM Device / Permission}
> +
> +The permissions of an ism region determine whether this ism region can be
> +attached and the read and write permissions after attach. The driver can set the
> +default permissions for all devices, or set the dedicated permissions for a
> +certain device. If there are dedicated permissions for a device, the default
> +permissions are invalid for this device.
> +
> +For example, the driver can set the default permission of an ism region as
> +read-only, and set the dedicated permissions as writable for a device with devid
> +0xff. Then all devices except 0xff in such a communication domain can attach
> +this ism region and only have read-only permission. And the device 0xff can
> +write to the ism region after attaching.
> +
> +When a driver has the management permission of the ism region, then it can
> +modify the permissions of this ism region. By default, only the device that
> +allocated the ism region has this permission.
> +
> +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> +
> +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> +during reset. \field{devid} MUST NOT be 0.
> +
> +The device exposes memory to the driver based on Shared Memory Regions
> +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> +However, it does not need to allocate physical memory during initialization.
> +
> +The \field{shmid} of a region MUST be one of the following
> +\begin{lstlisting}
> +enum virtio_ism_shm_id {
> + VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> + VIRTIO_ISM_SHM_ID_REGIONS = 1,
> + VIRTIO_ISM_SHM_ID_NOTIFY = 2,
> +};
> +\end{lstlisting}
> +
> +The device memory area with VIRTIO_ISM_SHM_ID_REGIONS is used to provide ism regions.
> +
> +The device MUST also provide a memory area with VIRTIO_ISM_SHM_ID_NOTIFY to the
> +driver. This memory area is used to notify the device with update event, and
> +each ism region MUST have a corresponding notify-address inside this area, and
> +the size of the notify-address is \field{notify_size}.
> +
> +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The driver MUST NOT access any chunk before it is referred by one ism region.
> +
> +\subsection{Control Virtqueue}\label{sec:Device Types / ISM Device / Control Virtqueue}
> +
> +The driver uses the control virtqueue send commands to implement operations on
control virtqueue to send commands
> +the ism region and some global configurations.
> +
> +All commands are of the following form:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl {
> + u8 class;
> + u8 command;
> + u8 command_specific_data[];
> + u8 ack;
> + u8 command_specific_data_reply[];
> +};
> +
> +/* ack values */
> +#define VIRTIO_ISM_OK 0
> +#define VIRTIO_ISM_ERR 255
> +
> +#define VIRTIO_ISM_ENOENT 2
> +#define VIRTIO_ISM_E2BIG 7
> +#define VIRTIO_ISM_ENOMEM 12
> +#define VIRTIO_ISM_ENOSPEC 28
> +
> +#define VIRTIO_ISM_PERM_EATTACH 100
> +#define VIRTIO_ISM_PERM_EREAD 101
> +#define VIRTIO_ISM_PERM_EWRITE 102
> +\end{lstlisting}
> +
> +The \field{class}, \field{command} and command-specific-data are set by the
> +driver, and the device sets the \field{ack} byte and optionally
> +\field{command-specific-data-reply}.
> +
> +\subsection{Device Operation}\label{sec:Device Types / ISM Driver / Device Operation}
> +
> +\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +Based on controlq, the driver can request to allocate an ism region.
> +
> +The reply from the device will carry a token, which can be passed
> +to other driver for attaching to this ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc {
> + le64 size;
> + void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
align void* and le64 to same.
void * here and above doesnt define the exact data size.
device will not be able to understand bit width of void* on 64-bit and
32-bit systems.
please change to u64.
Also I think spec follows /* */ comment style, instead of //.
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc_reply {
> + le64 token;
> + le64 num;
> + le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ALLOC 0
> + #define VIRTIO_ISM_CTRL_ALLOC_REGION 0
> +\end{lstlisting}
> +
> +
> +\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +For the command VIRTIO_ISM_CTRL_ALLOC_REGION, the device MUST first look for the
> +free chunks from the device memory. And the physical memory is mapped to these
> +chunks in order. If VIRTIO_ISM_F_DEV_MEM is negotiated, then the device MUST
> +provide physical memory by itself. Otherwise, the memory provided by the driver
> +will be used. If there is no enough free chunk, the device MUST set \field{ack}
> +to VIRTIO_ISM_NOSPEC. If new physical memory cannot be allocated by the device,
> +the device MUST set \field{ack} to VIRTIO_ISM_NOMEM.
> +
> +The device sets \field{ack} to VIRTIO_ISM_OK after successfully allocating the
> +ism region. A new token MUST be dynamically created for this ism region
> +simultaneously. The token MUST be unique inside this communication domain. And
> +the device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks.
> +
> +The device MUST clear the memory of this ism region before committing to the
> +driver.
> +
> +If \field{size} is smaller than \field{chunk_size}, the ism region also
> +occupies one chunk.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG.
> +
> +\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we want to allocate.
> +
> +\subsubsection{Query ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Query ISM Region}
> +Based on controlq, the driver can query the information of an ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query {
> + le64 token;
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query_reply {
> + le64 size;
> + le64 permissions;
Better to define the decoding of the permissions here.
Also below you have field rw_perm. May be rename both to sama name.
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_QUERY 1
> + #define VIRTIO_ISM_CTRL_QUERY_INFO 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Query ISM Region}{Device Types / ISM Device / Device Operation / Query ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT. Otherwise, the device MUST fill the
> +\field{size} with the size of the ism region. The device MUST get the
> +permissions of this ism region for the device-self, then fill in the
> +\field{permissions}.
> +
> +\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +Based on controlq, the driver can request to attach an ism region with a
> +specified token.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach {
> + le64 token;
> + le64 rw_perm;
> + void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach_reply {
> + le64 size;
> + le64 num;
> + le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ATTACH 2
> + #define VIRTIO_ISM_CTRL_ATTACH_REGION 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The device MUST first look for the free chunks from the device memory. And the
> +physical memory of the ism region is mapped to these chunks in order. If there is
> +no enough free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
> +The device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks. \field{size} is the size of this ism region.
> +
> +The device MUST set \field{ack} to VIRTIO_ISM_OK after successfully finding and
> +assigning the physical memory of this ism region into the device memory.
> +
> +If the device does not have the VIRTIO_ISM_PERM_ATTACH permission for this ism
> +region, the device MUST set \field{ack} to VIRTIO_ISM_PERM_EATTACH. If
> +\field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not have the
> +VIRTIO_ISM_PERM_READ permission for this region, the device MUST set \field{ack}
> +to VIRTIO_ISM_PERM_EREAD. If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but
> +the device does not have the VIRTIO_ISM_PERM_WRITE permission for this region,
> +the device MUST set \field{ack} to VIRTIO_ISM_PERM_EWRITE.
> +
> +The device MUST set the read and write permissions of the physical memory inside
> +the device memory based on \field{rw_perm}.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
> +
> +The device MUST generate a VIRTIO_ISM_EVENT_ATTACH event to other peers.
> +
> +\drivernormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we w
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
2023-03-24 4:51 ` Parav Pandit
@ 2023-03-24 6:35 ` Xuan Zhuo
2023-03-24 9:10 ` Michael S. Tsirkin
1 sibling, 0 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-03-24 6:35 UTC (permalink / raw)
To: Parav Pandit
Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic,
virtio-comment
On Fri, 24 Mar 2023 00:51:13 -0400, Parav Pandit <parav@nvidia.com> wrote:
>
>
> On 2/8/2023 10:30 PM, Xuan Zhuo wrote:
> > An ISM(Internal Shared Memory) device provides the ability to access memory
> > shared between multiple devices. This allows low-overhead communication in
> > presence of such memory. For example, memory can be shared with guests of
> > multiple virtual machines running on the same host, with each virtual machine
>
> I am still learning this new memory sharing beast, so mostly cosmetics
> comments below.
>
> instead of guest of virtual machines is confusing.
> It can be just written as - can be shared between virtual machines
> running on same host ..
> > including an ism device and with the guests getting the shared memory by the ism
> > devices.
> >
> > An ism device can communicate with multiple peers simultaneously. This
> > communication can be dynamically started and ended.
> >
> multiple peers mean, multiple peer ism device?
Yes. multiple peer ism device.
Yes, we should use device, not VM or Guest.
> Or you meant multiple vms?
>
> > |-------------------------------------------------------------------------------------------------------------|
> > | |------------------------------------------------| |------------------------------------------------| |
> > | | Guest | | Guest | |
> > | | | | | |
> instead of guest, naming it VM makes it consistent with the spec and
> description.
Yes
>
> > | | ---------------- | | ---------------- | |
> > | | | driver | [M1] [M2] [M3] | | | driver | [M2] [M3] | |
> > | | ---------------- | | | | | ---------------- | | | |
> > | | |cq| |map |map |map | | |cq| |map |map | |
> > | | | | | | | | | | | | | | |
> > | | | | ------------------- | | | | -------------------- | |
> > | |----|--|----------------| device memory |-----| |----|--|----------------| device memory |----| |
> > | | | | ------------------- | | | | -------------------- | |
> > | | | | | | | |
> > | | | | | | | |
> > | | Qemu | | | Qemu | | |
> > | |--------------------------------+---------------| |-------------------------------+----------------| |
> > | | | |
> > | | | |
> > | |------------------------------+------------------------| |
> > | | |
> > | | |
> > | -------------------------- |
> > | | M1 | | M2 | | M3 | |
> > | -------------------------- |
> > | |
> > | HOST |
> > ---------------------------------------------------------------------------------------------------------------
> >
> You might want to show the event q as well next to control q as its so
> fundamental for this whole operation.
Yes. I will try.
>
> > ---
> > conformance.tex | 26 +++
> > content.tex | 1 +
> > virtio-ism.tex | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 600 insertions(+)
> > create mode 100644 virtio-ism.tex
> >
> > diff --git a/conformance.tex b/conformance.tex
> > index c3c1d3e..0a1456a 100644
> > --- a/conformance.tex
> > +++ b/conformance.tex
> > @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> > \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
> > \end{itemize}
> >
> > +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> > +
> > +A ISM driver MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\end{itemize}
> > +
> > \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
> >
> > A device MUST conform to the following normative statements:
> > @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> > \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
> > \end{itemize}
> >
> > +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> > +
> > +A ISM device MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> Though using control vq to configure irq is good idea, I guess
> queue_msix_vector can be used. It needs some more text that
> queue_msix_vector will not be used etc.
The irq is bind to virtio ism region not queue. So we use the control vq to
configure that.
>
> if the event queue create interface exists over ctrl vq than things tied
> up well. Else I think transport specific existing scheme more aligns to
> the spec.
The situation here is not within the scope of standardization.
Virtio ISM Region is a relatively special concept to virtio spec. It is a piece
of memory, not Queue. So there is no solution provided in the current
specification here. We share this Virtio Ism Region between multiple Device. The
interruption of the binding here is to inform the content of this memory is
updated, and it is used to notify other devices
>
> > +\end{itemize}
> > +
> > \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
> > A conformant implementation MUST be either transitional or
> > non-transitional, see \ref{intro:Legacy
> > diff --git a/content.tex b/content.tex
> > index 96f4723..fe02aec 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
> > \input{virtio-scmi.tex}
> > \input{virtio-gpio.tex}
> > \input{virtio-pmem.tex}
> > +\input{virtio-ism.tex}
> >
> > \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
> >
> > diff --git a/virtio-ism.tex b/virtio-ism.tex
> > new file mode 100644
> > index 0000000..a1720d8
> > --- /dev/null
> > +++ b/virtio-ism.tex
> > @@ -0,0 +1,573 @@
> > +\section{ISM Device}\label{sec:Device Types / ISM Device}
> > +
> > +\begin{lstlisting}
> > +|-------------------------------------------------------------------------------------------------------------|
> > +| |------------------------------------------------| |------------------------------------------------| |
> > +| | VM [M1] [M2] [M3] | | VM [M2] [M3] | |
> > +| | | | | | | | | | |
> > +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> > +| | | driver | | | | | | | driver | | | | |
> > +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> > +| | |cq| |map |map |map | | |cq| |map |map | |
> > +| | | | | | | | | | | | | | |
> > +| | | | ------------------- | | | | ------------------- | |
> > +| |----|--|----------------| device memory |-----| |----|--|----------------| device memory |-----| |
> > +| | | | ------------------- | | | | ------------------- | |
> > +| | | | | | | |
> > +| | | | | | | |
> > +| | | | | | | |
> > +| |--------------------------------+---------------| |--------------------------------+---------------| |
> > +| | | |
> > +| | | |
> > +| |------------------------------+------------------------| |
> > +| | |
> Above line is misaligned.
>
> > +| | |
> > +| -------------------------- |
> > +| | M1 | | M2 | | M3 | |
> > +| -------------------------- |
> > +| |
> > +| |
> > +|-------------------------------------------------------------------------------------------------------------|
> > +\end{lstlisting}
> > +
> > +An ISM(Internal Shared Memory) device provides the ability to access memory
> > +shared between multiple devices. This allows low-overhead communication in
> > +presence of such memory. For example, memory can be shared with guests of
> > +multiple virtual machines running on the same host, with each virtual machine
> > +including an ism device and with the guests getting the shared memory by the ism
> > +devices.
> > +
> > +An ism device can communicate with multiple peers simultaneously. This
> > +communication can be dynamically started and ended.
> > +
> > +All the devices with the ability to communicate with each other form a
> > +communication domain. Two devices from different communication domains can't
> > +communicate.
> > +
> > +The device memory of the ism device is divided into multiple chunks with the
> > +same size. Every ism region contains multiple chunks. When communicating between
> > +two devices, an ism region is used as a shared memory.
> > +
> > +The ism region is the basis for communication between ism devices.
> > +
> > +The process of communication between two drivers is that one driver allocates an
> > +ism region and obtains a token. Then the peer uses this token to attach the same
> The peer driver will make it more clear.
Yes
>
> > +ism region, the two drivers realize the memory(ism region) sharing. The driver
> > +can also notify peer by kick notify-address the ism region has been updated.
> > +
> > +An ism region can be referred by its \field{token}, or the \field{offset}.
> > +The \field{offset} is the offset of the first chunk inside the ism region
> > +starting from the device memory head.
> > +
> > +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> > + 44
> > +
> > +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> > +\begin{description}
> > +\item[0] controlq
> > +\item[1] eventq
> > +\end{description}
> > +
> > +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> > +\begin{description}
> > +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> > + don't need to provide memory for alloc/attach operation.
> > +
> > +\end{description}
> > +
> > +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_ism_config {
> > + le128 cdid;
> le128 data type is not present in the spec.
> Better to define uuid data structure and refer here.
>
> struct uuid {..}
>
> struct uuid cdid;
> ...
will fix
> > + le64 devid;
> > + le64 chunk_size;
> > + le64 notify_size;
> > +};
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > + \item[\field{cdid}] This is used to identify the communication domain. Only
> > + ism devices with the same \field{cdid} can communicate. A \field{cdid}
> > + is world-wide unique in a sense that there not be another communication
> > + domain with the same \field{cdid}.
> > +
> > + \item[\field{devid}] This is used to identify the ism device in the single
> > + communication domain.
> > +
> > + \item[\field{chunk_size}] This is the size of the ism chunk. The device
> > + memory is divided into multiple chunks. Every ism chunk has the same
> > + size.
> > +
> > + \item[\field{notify_size}] This is the size of the ism notify-address. The
> > + notify-address is used to notify the device that the content of the
> > + ism region has been updated.
> > +
> > +\end{description}
> > +
> > +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> > +
> > +The device MUST ensure that the \field{cdid} of the device on the same
> > +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> > +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> > +
> > +In the single communication domain, the device MUST ensure that the \field{devid}
> > +is unique.
> > +
> > +\field{chunk_size} MUST be a power of two.
> > +
> > +\subsection{Physical Memory}\label{sec:Device Types / ISM Device / Event}
> > +
> > +The ism region is on the device memory. It is a physical memory for driver, but
> > +it is not for device.
> If it is not physical memory, than what is it? May be you want to write
> that.
I originally wanted to express that this ISM Region is a physical memory for
the drive, but it is virtual memory for the device. But now I realize that this
sentence is meaningless. I will remove it in the next version.
>
> > The device has the ability to modify the physical memory
> > +corresponding to the device memory.
> > +
> Above sentence as continuation to physical memory implies that its RW
> for device and RO for driver.
> I likely misunderstood.
>
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, physical memory comes from driver.
> > +The driver pass its memory to the device when alloc a new ism region. The
> The driver passes its memory to the device when allocating a new ism region.
>
> > +device map this physical memory into an ism region. If VIRTIO_ISM_F_DEV_MEM
> > +is negotiated, physical memory is provided by the device.
> > +
> > +For attach operation, if VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver also
> > +provides memory to device. Under normal circumstances, this memory will not be
> > +used. This is to limit driver to take up too much memory based on attach
> > +operation.
> > +
> > +\subsection{Event}\label{sec:Device Types / ISM Device / Event}
> > +
> > +The ism device supports event notification of the ism region. When a device
> > +kick/attach/detach a region, other ism region referrers will receive related
> > +events.
> > +
> > +A buffer received from eventq can contain multiple event structures.
> > +
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> > +#define VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> > +#define VIRTIO_ISM_EVENT_DETACH (1 << 2)
> > +#define VIRTIO_ISM_EVENT_BUFFER_FREE (1 << 3) // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > + \item[VIRTIO_ISM_EVENT_UPDATE] The driver kick the notify-address to notify
> > + other peers of the update event of the ism region content.
> > +
> > + \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> > + \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> > + \item[VIRTIO_ISM_EVENT_BUFFER_FREE] The buffer allocated by this driver has been free.
> > +\end{description}
> > +
> > +The event structures:
> > +\begin{lstlisting}
> > +struct virtio_ism_event_update {
> > + le64 ev_type;
> > + le64 offset;
> > + le64 devid;
> > +};
> > +
> > +struct virtio_ism_event_attach_detach {
> > + le64 ev_type;
> > + le64 offset;
> > + le64 devid;
> > + le64 peers;
> > +};
> > +
> > +struct virtio_ism_event_buffer_free {
> > + le64 ev_type;
> > + void *buffer;
> Is this a virtual address being shared between driver and device and in
> the cvq command?
> If so, do you need any attachment to process id or such abstract handle
> so that some day user process can also access it?
> or the use case is to only consume by the driver, hence only one VA?
Yes, I miss something. Maybe I should add an offset.
>
> > +};
> > + > +\end{lstlisting}
> > +
> > +\begin{description}
> > +\item[\field{ev_type}] The type of event, the driver can get the size of the
> > + structure based on this.
> > +
> > + If \field{ev_type} is VIRTIO_ISM_EVENT_UPDATE, the corresponding structure is struct virtio_ism_event_update.
> > + If \field{ev_type} is VIRTIO_ISM_EVENT_ATTACH or VIRTIO_ISM_EVENT_DETACH, the corresponding structure struct virtio_ism_event_attach_detach.
> > + If \field{ev_type} is VIRTIO_ISM_EVENT_BUFFER_FREE, the corresponding structure is struct virtio_ism_event_buffer_free.
> > +
> > +\item[\field{offset}] The offset of ism regions with the event.
> > +
> > +\item[\field{devid}] \field{devid} of the device that generated events.
> > +\item[\field{peers}] The number of the peers referring to this ism region. (does not include the
> > + device that receiving this event)
> > +
> > +\item[\field{buffer}] The buffer passed to device when alloc/attach a new ism region.
> > +
> > +\end{description}
> > +
> > +\subsection{Permissions}\label{sec:Device Types / ISM Device / Permission}
> > +
> > +The permissions of an ism region determine whether this ism region can be
> > +attached and the read and write permissions after attach. The driver can set the
> > +default permissions for all devices, or set the dedicated permissions for a
> > +certain device. If there are dedicated permissions for a device, the default
> > +permissions are invalid for this device.
> > +
> > +For example, the driver can set the default permission of an ism region as
> > +read-only, and set the dedicated permissions as writable for a device with devid
> > +0xff. Then all devices except 0xff in such a communication domain can attach
> > +this ism region and only have read-only permission. And the device 0xff can
> > +write to the ism region after attaching.
> > +
> > +When a driver has the management permission of the ism region, then it can
> > +modify the permissions of this ism region. By default, only the device that
> > +allocated the ism region has this permission.
> > +
> > +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> > +
> > +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> > +
> > +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> > +during reset. \field{devid} MUST NOT be 0.
> > +
> > +The device exposes memory to the driver based on Shared Memory Regions
> > +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> > +However, it does not need to allocate physical memory during initialization.
> > +
> > +The \field{shmid} of a region MUST be one of the following
> > +\begin{lstlisting}
> > +enum virtio_ism_shm_id {
> > + VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> > + VIRTIO_ISM_SHM_ID_REGIONS = 1,
> > + VIRTIO_ISM_SHM_ID_NOTIFY = 2,
> > +};
> > +\end{lstlisting}
> > +
> > +The device memory area with VIRTIO_ISM_SHM_ID_REGIONS is used to provide ism regions.
> > +
> > +The device MUST also provide a memory area with VIRTIO_ISM_SHM_ID_NOTIFY to the
> > +driver. This memory area is used to notify the device with update event, and
> > +each ism region MUST have a corresponding notify-address inside this area, and
> > +the size of the notify-address is \field{notify_size}.
> > +
> > +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> > +
> > +The driver MUST NOT access any chunk before it is referred by one ism region.
> > +
> > +\subsection{Control Virtqueue}\label{sec:Device Types / ISM Device / Control Virtqueue}
> > +
> > +The driver uses the control virtqueue send commands to implement operations on
> control virtqueue to send commands
>
> > +the ism region and some global configurations.
> > +
> > +All commands are of the following form:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl {
> > + u8 class;
> > + u8 command;
> > + u8 command_specific_data[];
> > + u8 ack;
> > + u8 command_specific_data_reply[];
> > +};
> > +
> > +/* ack values */
> > +#define VIRTIO_ISM_OK 0
> > +#define VIRTIO_ISM_ERR 255
> > +
> > +#define VIRTIO_ISM_ENOENT 2
> > +#define VIRTIO_ISM_E2BIG 7
> > +#define VIRTIO_ISM_ENOMEM 12
> > +#define VIRTIO_ISM_ENOSPEC 28
> > +
> > +#define VIRTIO_ISM_PERM_EATTACH 100
> > +#define VIRTIO_ISM_PERM_EREAD 101
> > +#define VIRTIO_ISM_PERM_EWRITE 102
> > +\end{lstlisting}
> > +
> > +The \field{class}, \field{command} and command-specific-data are set by the
> > +driver, and the device sets the \field{ack} byte and optionally
> > +\field{command-specific-data-reply}.
> > +
> > +\subsection{Device Operation}\label{sec:Device Types / ISM Driver / Device Operation}
> > +
> > +\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +Based on controlq, the driver can request to allocate an ism region.
> > +
> > +The reply from the device will carry a token, which can be passed
> > +to other driver for attaching to this ism region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_alloc {
> > + le64 size;
> > + void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> align void* and le64 to same.
>
> void * here and above doesnt define the exact data size.
> device will not be able to understand bit width of void* on 64-bit and
> 32-bit systems.
>
> please change to u64.
>
> Also I think spec follows /* */ comment style, instead of //.
>
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_alloc_reply {
> > + le64 token;
> > + le64 num;
> > + le64 chunks[];
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_ALLOC 0
> > + #define VIRTIO_ISM_CTRL_ALLOC_REGION 0
> > +\end{lstlisting}
> > +
> > +
> > +\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +For the command VIRTIO_ISM_CTRL_ALLOC_REGION, the device MUST first look for the
> > +free chunks from the device memory. And the physical memory is mapped to these
> > +chunks in order. If VIRTIO_ISM_F_DEV_MEM is negotiated, then the device MUST
> > +provide physical memory by itself. Otherwise, the memory provided by the driver
> > +will be used. If there is no enough free chunk, the device MUST set \field{ack}
> > +to VIRTIO_ISM_NOSPEC. If new physical memory cannot be allocated by the device,
> > +the device MUST set \field{ack} to VIRTIO_ISM_NOMEM.
> > +
> > +The device sets \field{ack} to VIRTIO_ISM_OK after successfully allocating the
> > +ism region. A new token MUST be dynamically created for this ism region
> > +simultaneously. The token MUST be unique inside this communication domain. And
> > +the device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> > +is the number of the chunks.
> > +
> > +The device MUST clear the memory of this ism region before committing to the
> > +driver.
> > +
> > +If \field{size} is smaller than \field{chunk_size}, the ism region also
> > +occupies one chunk.
> > +
> > +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> > +to VIRTIO_ISM_E2BIG.
> > +
> > +\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> > +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> > +region that we want to allocate.
> > +
> > +\subsubsection{Query ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Query ISM Region}
> > +Based on controlq, the driver can query the information of an ism region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_query {
> > + le64 token;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_query_reply {
> > + le64 size;
> > + le64 permissions;
> Better to define the decoding of the permissions here.
> Also below you have field rw_perm. May be rename both to sama name.
Will fix.
Thanks.
>
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_QUERY 1
> > + #define VIRTIO_ISM_CTRL_QUERY_INFO 0
> > +\end{lstlisting}
> > +
> > +\devicenormative{\subparagraph}{Query ISM Region}{Device Types / ISM Device / Device Operation / Query ISM Region}
> > +
> > +If the ism region specified by \field{token} does not exist, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT. Otherwise, the device MUST fill the
> > +\field{size} with the size of the ism region. The device MUST get the
> > +permissions of this ism region for the device-self, then fill in the
> > +\field{permissions}.
> > +
> > +\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +Based on controlq, the driver can request to attach an ism region with a
> > +specified token.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_attach {
> > + le64 token;
> > + le64 rw_perm;
> > + void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_attach_reply {
> > + le64 size;
> > + le64 num;
> > + le64 chunks[];
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_ATTACH 2
> > + #define VIRTIO_ISM_CTRL_ATTACH_REGION 0
> > +\end{lstlisting}
> > +
> > +\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +If the ism region specified by \field{token} does not exist, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT.
> > +
> > +The device MUST first look for the free chunks from the device memory. And the
> > +physical memory of the ism region is mapped to these chunks in order. If there is
> > +no enough free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
> > +The device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> > +is the number of the chunks. \field{size} is the size of this ism region.
> > +
> > +The device MUST set \field{ack} to VIRTIO_ISM_OK after successfully finding and
> > +assigning the physical memory of this ism region into the device memory.
> > +
> > +If the device does not have the VIRTIO_ISM_PERM_ATTACH permission for this ism
> > +region, the device MUST set \field{ack} to VIRTIO_ISM_PERM_EATTACH. If
> > +\field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not have the
> > +VIRTIO_ISM_PERM_READ permission for this region, the device MUST set \field{ack}
> > +to VIRTIO_ISM_PERM_EREAD. If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but
> > +the device does not have the VIRTIO_ISM_PERM_WRITE permission for this region,
> > +the device MUST set \field{ack} to VIRTIO_ISM_PERM_EWRITE.
> > +
> > +The device MUST set the read and write permissions of the physical memory inside
> > +the device memory based on \field{rw_perm}.
> > +
> > +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> > +to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
> > +
> > +The device MUST generate a VIRTIO_ISM_EVENT_ATTACH event to other peers.
> > +
> > +\drivernormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> > +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> > +region that we w
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
2023-03-24 4:51 ` Parav Pandit
2023-03-24 6:35 ` Xuan Zhuo
@ 2023-03-24 9:10 ` Michael S. Tsirkin
1 sibling, 0 replies; 20+ messages in thread
From: Michael S. Tsirkin @ 2023-03-24 9:10 UTC (permalink / raw)
To: Parav Pandit
Cc: Xuan Zhuo, virtio-comment, hans, herongguang, zmlcc, dust.li,
tonylu, zhenzao, helinguo, gerry, cohuck, jasowang, Jan Kiszka,
wintera, kgraul, wenjia, jaka, hca, twinkler, raspl, virtio-dev,
pasic
On Fri, Mar 24, 2023 at 12:51:13AM -0400, Parav Pandit wrote:
>
>
> On 2/8/2023 10:30 PM, Xuan Zhuo wrote:
> > An ISM(Internal Shared Memory) device provides the ability to access memory
> > shared between multiple devices. This allows low-overhead communication in
> > presence of such memory. For example, memory can be shared with guests of
> > multiple virtual machines running on the same host, with each virtual machine
>
> I am still learning this new memory sharing beast, so mostly cosmetics
> comments below.
>
> instead of guest of virtual machines is confusing.
> It can be just written as - can be shared between virtual machines running
> on same host ..
> > including an ism device and with the guests getting the shared memory by the ism
> > devices.
> >
> > An ism device can communicate with multiple peers simultaneously. This
> > communication can be dynamically started and ended.
> >
> multiple peers mean, multiple peer ism device?
> Or you meant multiple vms?
>
> > |-------------------------------------------------------------------------------------------------------------|
> > | |------------------------------------------------| |------------------------------------------------| |
> > | | Guest | | Guest | |
> > | | | | | |
> instead of guest, naming it VM makes it consistent with the spec and
> description.
Ideally we just talk about drivers. But it's just a commit log,
it does not matter.
If you are talking about software that runs then calling it VM
is wrong - VM is the interface between guest software and hypervisor.
> > | | ---------------- | | ---------------- | |
> > | | | driver | [M1] [M2] [M3] | | | driver | [M2] [M3] | |
> > | | ---------------- | | | | | ---------------- | | | |
> > | | |cq| |map |map |map | | |cq| |map |map | |
> > | | | | | | | | | | | | | | |
> > | | | | ------------------- | | | | -------------------- | |
> > | |----|--|----------------| device memory |-----| |----|--|----------------| device memory |----| |
> > | | | | ------------------- | | | | -------------------- | |
> > | | | | | | | |
> > | | | | | | | |
> > | | Qemu | | | Qemu | | |
> > | |--------------------------------+---------------| |-------------------------------+----------------| |
> > | | | |
> > | | | |
> > | |------------------------------+------------------------| |
> > | | |
> > | | |
> > | -------------------------- |
> > | | M1 | | M2 | | M3 | |
> > | -------------------------- |
> > | |
> > | HOST |
> > ---------------------------------------------------------------------------------------------------------------
> >
> You might want to show the event q as well next to control q as its so
> fundamental for this whole operation.
>
> > ---
> > conformance.tex | 26 +++
> > content.tex | 1 +
> > virtio-ism.tex | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 600 insertions(+)
> > create mode 100644 virtio-ism.tex
> >
> > diff --git a/conformance.tex b/conformance.tex
> > index c3c1d3e..0a1456a 100644
> > --- a/conformance.tex
> > +++ b/conformance.tex
> > @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> > \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
> > \end{itemize}
> > +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> > +
> > +A ISM driver MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\end{itemize}
> > +
> > \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
> > A device MUST conform to the following normative statements:
> > @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> > \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
> > \end{itemize}
> > +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> > +
> > +A ISM device MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> Though using control vq to configure irq is good idea, I guess
> queue_msix_vector can be used. It needs some more text that
> queue_msix_vector will not be used etc.
>
> if the event queue create interface exists over ctrl vq than things tied up
> well. Else I think transport specific existing scheme more aligns to the
> spec.
>
> > +\end{itemize}
> > +
> > \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
> > A conformant implementation MUST be either transitional or
> > non-transitional, see \ref{intro:Legacy
> > diff --git a/content.tex b/content.tex
> > index 96f4723..fe02aec 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
> > \input{virtio-scmi.tex}
> > \input{virtio-gpio.tex}
> > \input{virtio-pmem.tex}
> > +\input{virtio-ism.tex}
> > \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
> > diff --git a/virtio-ism.tex b/virtio-ism.tex
> > new file mode 100644
> > index 0000000..a1720d8
> > --- /dev/null
> > +++ b/virtio-ism.tex
> > @@ -0,0 +1,573 @@
> > +\section{ISM Device}\label{sec:Device Types / ISM Device}
> > +
> > +\begin{lstlisting}
> > +|-------------------------------------------------------------------------------------------------------------|
> > +| |------------------------------------------------| |------------------------------------------------| |
> > +| | VM [M1] [M2] [M3] | | VM [M2] [M3] | |
> > +| | | | | | | | | | |
> > +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> > +| | | driver | | | | | | | driver | | | | |
> > +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> > +| | |cq| |map |map |map | | |cq| |map |map | |
> > +| | | | | | | | | | | | | | |
> > +| | | | ------------------- | | | | ------------------- | |
> > +| |----|--|----------------| device memory |-----| |----|--|----------------| device memory |-----| |
> > +| | | | ------------------- | | | | ------------------- | |
> > +| | | | | | | |
> > +| | | | | | | |
> > +| | | | | | | |
> > +| |--------------------------------+---------------| |--------------------------------+---------------| |
> > +| | | |
> > +| | | |
> > +| |------------------------------+------------------------| |
> > +| | |
> Above line is misaligned.
>
> > +| | |
> > +| -------------------------- |
> > +| | M1 | | M2 | | M3 | |
> > +| -------------------------- |
> > +| |
> > +| |
> > +|-------------------------------------------------------------------------------------------------------------|
And I think this is a bad way to include drawings.
There are lots of latex packages for diagrams, please
pick one.
> > +\end{lstlisting}
> > +
> > +An ISM(Internal Shared Memory) device provides the ability to access memory
> > +shared between multiple devices. This allows low-overhead communication in
> > +presence of such memory. For example, memory can be shared with guests of
> > +multiple virtual machines running on the same host, with each virtual machine
> > +including an ism device and with the guests getting the shared memory by the ism
> > +devices.
> > +
> > +An ism device can communicate with multiple peers simultaneously. This
> > +communication can be dynamically started and ended.
> > +
> > +All the devices with the ability to communicate with each other form a
> > +communication domain. Two devices from different communication domains can't
> > +communicate.
> > +
> > +The device memory of the ism device is divided into multiple chunks with the
> > +same size. Every ism region contains multiple chunks. When communicating between
> > +two devices, an ism region is used as a shared memory.
> > +
> > +The ism region is the basis for communication between ism devices.
> > +
> > +The process of communication between two drivers is that one driver allocates an
> > +ism region and obtains a token. Then the peer uses this token to attach the same
> The peer driver will make it more clear.
>
> > +ism region, the two drivers realize the memory(ism region) sharing. The driver
> > +can also notify peer by kick notify-address the ism region has been updated.
> > +
> > +An ism region can be referred by its \field{token}, or the \field{offset}.
> > +The \field{offset} is the offset of the first chunk inside the ism region
> > +starting from the device memory head.
> > +
> > +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> > + 44
> > +
> > +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> > +\begin{description}
> > +\item[0] controlq
> > +\item[1] eventq
> > +\end{description}
> > +
> > +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> > +\begin{description}
> > +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> > + don't need to provide memory for alloc/attach operation.
> > +
> > +\end{description}
> > +
> > +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_ism_config {
> > + le128 cdid;
> le128 data type is not present in the spec.
> Better to define uuid data structure and refer here.
>
> struct uuid {..}
>
> struct uuid cdid;
> ...
Agree. And is it really little endian?
--
MST
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread
* [virtio-dev] Re: [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
2023-02-09 3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
` (3 preceding siblings ...)
2023-03-24 4:51 ` Parav Pandit
@ 2023-04-26 7:41 ` Xuan Zhuo
4 siblings, 0 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-04-26 7:41 UTC (permalink / raw)
To: virtio-comment
Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic
Any commnets for this.
Thanks.
On Thu, 9 Feb 2023 11:30:56 +0800, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> An ISM(Internal Shared Memory) device provides the ability to access memory
> shared between multiple devices. This allows low-overhead communication in
> presence of such memory. For example, memory can be shared with guests of
> multiple virtual machines running on the same host, with each virtual machine
> including an ism device and with the guests getting the shared memory by the ism
> devices.
>
> An ism device can communicate with multiple peers simultaneously. This
> communication can be dynamically started and ended.
>
> |-------------------------------------------------------------------------------------------------------------|
> | |------------------------------------------------| |------------------------------------------------| |
> | | Guest | | Guest | |
> | | | | | |
> | | ---------------- | | ---------------- | |
> | | | driver | [M1] [M2] [M3] | | | driver | [M2] [M3] | |
> | | ---------------- | | | | | ---------------- | | | |
> | | |cq| |map |map |map | | |cq| |map |map | |
> | | | | | | | | | | | | | | |
> | | | | ------------------- | | | | -------------------- | |
> | |----|--|----------------| device memory |-----| |----|--|----------------| device memory |----| |
> | | | | ------------------- | | | | -------------------- | |
> | | | | | | | |
> | | | | | | | |
> | | Qemu | | | Qemu | | |
> | |--------------------------------+---------------| |-------------------------------+----------------| |
> | | | |
> | | | |
> | |------------------------------+------------------------| |
> | | |
> | | |
> | -------------------------- |
> | | M1 | | M2 | | M3 | |
> | -------------------------- |
> | |
> | HOST |
> ---------------------------------------------------------------------------------------------------------------
>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
> Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
> Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
> Signed-off-by: Hans Zhang <hans@linux.alibaba.com>
> Signed-off-by: He Rongguang <herongguang@linux.alibaba.com>
> Suggested-by: Halil Pasic <pasic@linux.ibm.com>
> ---
> conformance.tex | 26 +++
> content.tex | 1 +
> virtio-ism.tex | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 600 insertions(+)
> create mode 100644 virtio-ism.tex
>
> diff --git a/conformance.tex b/conformance.tex
> index c3c1d3e..0a1456a 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
> \end{itemize}
>
> +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> +
> +A ISM driver MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\end{itemize}
> +
> \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
>
> A device MUST conform to the following normative statements:
> @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
> \end{itemize}
>
> +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> +
> +A ISM device MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> +\end{itemize}
> +
> \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
> A conformant implementation MUST be either transitional or
> non-transitional, see \ref{intro:Legacy
> diff --git a/content.tex b/content.tex
> index 96f4723..fe02aec 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
> \input{virtio-scmi.tex}
> \input{virtio-gpio.tex}
> \input{virtio-pmem.tex}
> +\input{virtio-ism.tex}
>
> \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>
> diff --git a/virtio-ism.tex b/virtio-ism.tex
> new file mode 100644
> index 0000000..a1720d8
> --- /dev/null
> +++ b/virtio-ism.tex
> @@ -0,0 +1,573 @@
> +\section{ISM Device}\label{sec:Device Types / ISM Device}
> +
> +\begin{lstlisting}
> +|-------------------------------------------------------------------------------------------------------------|
> +| |------------------------------------------------| |------------------------------------------------| |
> +| | VM [M1] [M2] [M3] | | VM [M2] [M3] | |
> +| | | | | | | | | | |
> +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> +| | | driver | | | | | | | driver | | | | |
> +| | -----------------------|------|------|--- | | ------------------------------|------|--- | |
> +| | |cq| |map |map |map | | |cq| |map |map | |
> +| | | | | | | | | | | | | | |
> +| | | | ------------------- | | | | ------------------- | |
> +| |----|--|----------------| device memory |-----| |----|--|----------------| device memory |-----| |
> +| | | | ------------------- | | | | ------------------- | |
> +| | | | | | | |
> +| | | | | | | |
> +| | | | | | | |
> +| |--------------------------------+---------------| |--------------------------------+---------------| |
> +| | | |
> +| | | |
> +| |------------------------------+------------------------| |
> +| | |
> +| | |
> +| -------------------------- |
> +| | M1 | | M2 | | M3 | |
> +| -------------------------- |
> +| |
> +| |
> +|-------------------------------------------------------------------------------------------------------------|
> +\end{lstlisting}
> +
> +An ISM(Internal Shared Memory) device provides the ability to access memory
> +shared between multiple devices. This allows low-overhead communication in
> +presence of such memory. For example, memory can be shared with guests of
> +multiple virtual machines running on the same host, with each virtual machine
> +including an ism device and with the guests getting the shared memory by the ism
> +devices.
> +
> +An ism device can communicate with multiple peers simultaneously. This
> +communication can be dynamically started and ended.
> +
> +All the devices with the ability to communicate with each other form a
> +communication domain. Two devices from different communication domains can't
> +communicate.
> +
> +The device memory of the ism device is divided into multiple chunks with the
> +same size. Every ism region contains multiple chunks. When communicating between
> +two devices, an ism region is used as a shared memory.
> +
> +The ism region is the basis for communication between ism devices.
> +
> +The process of communication between two drivers is that one driver allocates an
> +ism region and obtains a token. Then the peer uses this token to attach the same
> +ism region, the two drivers realize the memory(ism region) sharing. The driver
> +can also notify peer by kick notify-address the ism region has been updated.
> +
> +An ism region can be referred by its \field{token}, or the \field{offset}.
> +The \field{offset} is the offset of the first chunk inside the ism region
> +starting from the device memory head.
> +
> +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> + 44
> +
> +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> +\begin{description}
> +\item[0] controlq
> +\item[1] eventq
> +\end{description}
> +
> +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> +\begin{description}
> +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> + don't need to provide memory for alloc/attach operation.
> +
> +\end{description}
> +
> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_ism_config {
> + le128 cdid;
> + le64 devid;
> + le64 chunk_size;
> + le64 notify_size;
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> + \item[\field{cdid}] This is used to identify the communication domain. Only
> + ism devices with the same \field{cdid} can communicate. A \field{cdid}
> + is world-wide unique in a sense that there not be another communication
> + domain with the same \field{cdid}.
> +
> + \item[\field{devid}] This is used to identify the ism device in the single
> + communication domain.
> +
> + \item[\field{chunk_size}] This is the size of the ism chunk. The device
> + memory is divided into multiple chunks. Every ism chunk has the same
> + size.
> +
> + \item[\field{notify_size}] This is the size of the ism notify-address. The
> + notify-address is used to notify the device that the content of the
> + ism region has been updated.
> +
> +\end{description}
> +
> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> +
> +The device MUST ensure that the \field{cdid} of the device on the same
> +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> +
> +In the single communication domain, the device MUST ensure that the \field{devid}
> +is unique.
> +
> +\field{chunk_size} MUST be a power of two.
> +
> +\subsection{Physical Memory}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism region is on the device memory. It is a physical memory for driver, but
> +it is not for device. The device has the ability to modify the physical memory
> +corresponding to the device memory.
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, physical memory comes from driver.
> +The driver pass its memory to the device when alloc a new ism region. The
> +device map this physical memory into an ism region. If VIRTIO_ISM_F_DEV_MEM
> +is negotiated, physical memory is provided by the device.
> +
> +For attach operation, if VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver also
> +provides memory to device. Under normal circumstances, this memory will not be
> +used. This is to limit driver to take up too much memory based on attach
> +operation.
> +
> +\subsection{Event}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism device supports event notification of the ism region. When a device
> +kick/attach/detach a region, other ism region referrers will receive related
> +events.
> +
> +A buffer received from eventq can contain multiple event structures.
> +
> +\begin{lstlisting}
> +#define VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> +#define VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> +#define VIRTIO_ISM_EVENT_DETACH (1 << 2)
> +#define VIRTIO_ISM_EVENT_BUFFER_FREE (1 << 3) // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +\end{lstlisting}
> +
> +\begin{description}
> + \item[VIRTIO_ISM_EVENT_UPDATE] The driver kick the notify-address to notify
> + other peers of the update event of the ism region content.
> +
> + \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> + \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> + \item[VIRTIO_ISM_EVENT_BUFFER_FREE] The buffer allocated by this driver has been free.
> +\end{description}
> +
> +The event structures:
> +\begin{lstlisting}
> +struct virtio_ism_event_update {
> + le64 ev_type;
> + le64 offset;
> + le64 devid;
> +};
> +
> +struct virtio_ism_event_attach_detach {
> + le64 ev_type;
> + le64 offset;
> + le64 devid;
> + le64 peers;
> +};
> +
> +struct virtio_ism_event_buffer_free {
> + le64 ev_type;
> + void *buffer;
> +};
> +
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{ev_type}] The type of event, the driver can get the size of the
> + structure based on this.
> +
> + If \field{ev_type} is VIRTIO_ISM_EVENT_UPDATE, the corresponding structure is struct virtio_ism_event_update.
> + If \field{ev_type} is VIRTIO_ISM_EVENT_ATTACH or VIRTIO_ISM_EVENT_DETACH, the corresponding structure struct virtio_ism_event_attach_detach.
> + If \field{ev_type} is VIRTIO_ISM_EVENT_BUFFER_FREE, the corresponding structure is struct virtio_ism_event_buffer_free.
> +
> +\item[\field{offset}] The offset of ism regions with the event.
> +
> +\item[\field{devid}] \field{devid} of the device that generated events.
> +\item[\field{peers}] The number of the peers referring to this ism region. (does not include the
> + device that receiving this event)
> +
> +\item[\field{buffer}] The buffer passed to device when alloc/attach a new ism region.
> +
> +\end{description}
> +
> +\subsection{Permissions}\label{sec:Device Types / ISM Device / Permission}
> +
> +The permissions of an ism region determine whether this ism region can be
> +attached and the read and write permissions after attach. The driver can set the
> +default permissions for all devices, or set the dedicated permissions for a
> +certain device. If there are dedicated permissions for a device, the default
> +permissions are invalid for this device.
> +
> +For example, the driver can set the default permission of an ism region as
> +read-only, and set the dedicated permissions as writable for a device with devid
> +0xff. Then all devices except 0xff in such a communication domain can attach
> +this ism region and only have read-only permission. And the device 0xff can
> +write to the ism region after attaching.
> +
> +When a driver has the management permission of the ism region, then it can
> +modify the permissions of this ism region. By default, only the device that
> +allocated the ism region has this permission.
> +
> +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> +
> +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> +during reset. \field{devid} MUST NOT be 0.
> +
> +The device exposes memory to the driver based on Shared Memory Regions
> +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> +However, it does not need to allocate physical memory during initialization.
> +
> +The \field{shmid} of a region MUST be one of the following
> +\begin{lstlisting}
> +enum virtio_ism_shm_id {
> + VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> + VIRTIO_ISM_SHM_ID_REGIONS = 1,
> + VIRTIO_ISM_SHM_ID_NOTIFY = 2,
> +};
> +\end{lstlisting}
> +
> +The device memory area with VIRTIO_ISM_SHM_ID_REGIONS is used to provide ism regions.
> +
> +The device MUST also provide a memory area with VIRTIO_ISM_SHM_ID_NOTIFY to the
> +driver. This memory area is used to notify the device with update event, and
> +each ism region MUST have a corresponding notify-address inside this area, and
> +the size of the notify-address is \field{notify_size}.
> +
> +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The driver MUST NOT access any chunk before it is referred by one ism region.
> +
> +\subsection{Control Virtqueue}\label{sec:Device Types / ISM Device / Control Virtqueue}
> +
> +The driver uses the control virtqueue send commands to implement operations on
> +the ism region and some global configurations.
> +
> +All commands are of the following form:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl {
> + u8 class;
> + u8 command;
> + u8 command_specific_data[];
> + u8 ack;
> + u8 command_specific_data_reply[];
> +};
> +
> +/* ack values */
> +#define VIRTIO_ISM_OK 0
> +#define VIRTIO_ISM_ERR 255
> +
> +#define VIRTIO_ISM_ENOENT 2
> +#define VIRTIO_ISM_E2BIG 7
> +#define VIRTIO_ISM_ENOMEM 12
> +#define VIRTIO_ISM_ENOSPEC 28
> +
> +#define VIRTIO_ISM_PERM_EATTACH 100
> +#define VIRTIO_ISM_PERM_EREAD 101
> +#define VIRTIO_ISM_PERM_EWRITE 102
> +\end{lstlisting}
> +
> +The \field{class}, \field{command} and command-specific-data are set by the
> +driver, and the device sets the \field{ack} byte and optionally
> +\field{command-specific-data-reply}.
> +
> +\subsection{Device Operation}\label{sec:Device Types / ISM Driver / Device Operation}
> +
> +\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +Based on controlq, the driver can request to allocate an ism region.
> +
> +The reply from the device will carry a token, which can be passed
> +to other driver for attaching to this ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc {
> + le64 size;
> + void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc_reply {
> + le64 token;
> + le64 num;
> + le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ALLOC 0
> + #define VIRTIO_ISM_CTRL_ALLOC_REGION 0
> +\end{lstlisting}
> +
> +
> +\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +For the command VIRTIO_ISM_CTRL_ALLOC_REGION, the device MUST first look for the
> +free chunks from the device memory. And the physical memory is mapped to these
> +chunks in order. If VIRTIO_ISM_F_DEV_MEM is negotiated, then the device MUST
> +provide physical memory by itself. Otherwise, the memory provided by the driver
> +will be used. If there is no enough free chunk, the device MUST set \field{ack}
> +to VIRTIO_ISM_NOSPEC. If new physical memory cannot be allocated by the device,
> +the device MUST set \field{ack} to VIRTIO_ISM_NOMEM.
> +
> +The device sets \field{ack} to VIRTIO_ISM_OK after successfully allocating the
> +ism region. A new token MUST be dynamically created for this ism region
> +simultaneously. The token MUST be unique inside this communication domain. And
> +the device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks.
> +
> +The device MUST clear the memory of this ism region before committing to the
> +driver.
> +
> +If \field{size} is smaller than \field{chunk_size}, the ism region also
> +occupies one chunk.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG.
> +
> +\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we want to allocate.
> +
> +\subsubsection{Query ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Query ISM Region}
> +Based on controlq, the driver can query the information of an ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query {
> + le64 token;
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query_reply {
> + le64 size;
> + le64 permissions;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_QUERY 1
> + #define VIRTIO_ISM_CTRL_QUERY_INFO 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Query ISM Region}{Device Types / ISM Device / Device Operation / Query ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT. Otherwise, the device MUST fill the
> +\field{size} with the size of the ism region. The device MUST get the
> +permissions of this ism region for the device-self, then fill in the
> +\field{permissions}.
> +
> +\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +Based on controlq, the driver can request to attach an ism region with a
> +specified token.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach {
> + le64 token;
> + le64 rw_perm;
> + void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach_reply {
> + le64 size;
> + le64 num;
> + le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ATTACH 2
> + #define VIRTIO_ISM_CTRL_ATTACH_REGION 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The device MUST first look for the free chunks from the device memory. And the
> +physical memory of the ism region is mapped to these chunks in order. If there is
> +no enough free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
> +The device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks. \field{size} is the size of this ism region.
> +
> +The device MUST set \field{ack} to VIRTIO_ISM_OK after successfully finding and
> +assigning the physical memory of this ism region into the device memory.
> +
> +If the device does not have the VIRTIO_ISM_PERM_ATTACH permission for this ism
> +region, the device MUST set \field{ack} to VIRTIO_ISM_PERM_EATTACH. If
> +\field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not have the
> +VIRTIO_ISM_PERM_READ permission for this region, the device MUST set \field{ack}
> +to VIRTIO_ISM_PERM_EREAD. If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but
> +the device does not have the VIRTIO_ISM_PERM_WRITE permission for this region,
> +the device MUST set \field{ack} to VIRTIO_ISM_PERM_EWRITE.
> +
> +The device MUST set the read and write permissions of the physical memory inside
> +the device memory based on \field{rw_perm}.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
> +
> +The device MUST generate a VIRTIO_ISM_EVENT_ATTACH event to other peers.
> +
> +\drivernormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we want to allocate.
> +
> +If \field {size} is less than \field{num} times \field {chunk_size}, then only
> +header part of the last chunk is effective.
> +
> +\subsubsection{Detach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +Based on controlq, the device can release references to the ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_detach {
> + le64 token;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_DETACH 3
> + #define VIRTIO_ISM_CTRL_DETACH_REGION 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> +
> +If the ism region specified by \field{token} is not attached, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The device MUST generate a VIRTIO_ISM_EVENT_DETACH event to other peers.
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated and no one is referred to this ism
> +region, the device who allocated this ism region MUST generate a
> +VIRTIO_ISM_EVENT_BUFFER_FREE event to the driver.
> +
> +\drivernormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated and the driver got the ism region by
> +attach operation, then the buffer is free after the detach operation.
> +
> +\subsubsection{Grant ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Grant ISM Region}
> +Based on controlq, the driver can set the permissions for each ism
> +region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_grant_default {
> + le64 token;
> + le64 permissions;
> +};
> +
> +struct virtio_ism_ctrl_grant_dev {
> + le64 token;
> + le64 permissions;
> + le64 peer_devid;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_GRANT 4
> + #define VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT 0
> + #define VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE 1
> +\end{lstlisting}
> +
> +The permissions:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_PERM_READ (1 << 0)
> +#define VIRTIO_ISM_PERM_WRITE (1 << 1)
> +
> +#define VIRTIO_ISM_PERM_ATTACH (1 << 2)
> +
> +#define VIRTIO_ISM_PERM_MANAGE (1 << 3)
> +#define VIRTIO_ISM_PERM_CLEAN_DEFAULT (1 << 4)
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[VIRTIO_ISM_PERM_READ] read permission
> +\item[VIRTIO_ISM_PERM_WRITE] write permission
> +\item[VIRTIO_ISM_PERM_ATTACH] attach permission
> +
> +\item[VIRTIO_ISM_PERM_MANAGE] Management permission, the device with this
> + permission can modify the permission of this ism region. By default, only
> + the device that allocated the region has this permission.
> +
> +\item[VIRTIO_ISM_PERM_DENY_DEFAULT] Clean all permissions of default, just used
> + with VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE.
> +
> +\end{description}
> +
> +VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT corresponds to virtio_ism_ctrl_grant_default;
> +VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE corresponds to virtio_ism_ctrl_grant_dev;
> +
> +Permission control is divided into two categories, one is the dedicated
> +permissions for a certain device, and the other is the default permissions for
> +all device.
> +
> +\devicenormative{\subparagraph}{Grant ISM Region}{Device Types / ISM Device / Device Operation / Grant ISM Region}
> +
> +If the ism region specified by \field{token} is not attached, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The default permissions of the newly allocated ism region MUST be (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH).
> +The device that allocated the ism region MUST have the permissions (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH | VIRTIO_ISM_PERM_MANAGE) for this region.
> +
> +For a certain device, if there is dedicated permissions, the default
> +permissions are illegal to this device.
> +
> +\subsubsection{Inform Update Event IRQ Vector}\label{sec:Device Types / ISM Device / Device Operation / Inform Update Event IRQ Vector}
> +
> +For the ism region update event, the driver can bind an interrupt to the ism
> +region. Then this ism region's update event will no longer be passed through
> +eventq, but the interrupt will be directly triggered.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_irq_vector {
> + le64 token;
> + le64 vector;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_EVENT_VECTOR 5
> + #define VIRTIO_ISM_CTRL_EVENT_VECTOR_SET 0
> +\end{lstlisting}
> +
> +
> +\devicenormative{\subparagraph}{Inform Event IRQ Vector}{Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> +
> +The device MUST record the relationship between the ism region and the vector
> +notified by the driver, and notify the update event of this region to driver
> +by the corresponding vector. And the device MUST NOT use eventq to send the
> +update event of this ism region.
> +
> +
> --
> 2.32.0.3.g01195cf9f
>
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread