virtio-dev.lists.oasis-open.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/1] introduce virtio-ism: internal shared memory device
@ 2023-02-09  3:30 Xuan Zhuo
  2023-02-09  3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
  0 siblings, 1 reply; 20+ messages in thread
From: Xuan Zhuo @ 2023-02-09  3:30 UTC (permalink / raw)
  To: virtio-comment
  Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
	gerry, xuanzhuo, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic

Hello everyone,

# Background

    Nowadays, there is a common scenario to accelerate communication between
    different VMs and containers, including light weight virtual machine based
    containers. One way to achieve this is to colocate them on the same host.
    However, the performance of inter-VM communication through network stack is
    not optimal and may also waste extra CPU cycles. This scenario has been
    discussed many times, but still no generic solution available [1] [2] [3].

    With pci-ivshmem + SMC(Shared Memory Communications: [4]) based PoC[5],
    We found that by changing the communication channel between VMs from TCP to
    SMC with shared memory, we can achieve superior performance for a common
    socket-based application[5]:
      - latency reduced by about 50%
      - throughput increased by about 300%
      - CPU consumption reduced by about 50%

    Since there is no particularly suitable shared memory management solution
    matches the need for SMC(See ## Comparison with existing technology), and
    virtio is the standard for communication in the virtualization world, we
    want to implement a virtio-ism device based on virtio, which can support
    on-demand memory sharing across VMs, containers or VM-container. To match
    the needs of SMC, the virtio-ism device need to support:

    1. Dynamic provision: shared memory regions are dynamically allocated and
       provisioned.
    2. Multi-region management: the shared memory is divided into regions,
       and a peer may allocate one or more regions from the same shared memory
       device.
    3. Permission control: the permission of each region can be set seperately.
    4. Dynamic connection: each ism region of a device can be shared with
       different devices, eventually a device can be shared with thousands of
       devices

# Virtio ISM device

    An ISM(Internal Shared Memory) device provides the ability to access memory
    shared between multiple devices. This allows low-overhead communication in
    presence of such memory. For example, memory can be shared with guests of
    multiple virtual machines running on the same host, with each virtual
    machine including an ism device and with the guests getting the shared
    memory by the ism devices.

    An ism device can communicate with multiple peers simultaneously. This
    communication can be dynamically started and ended.

## Design

    This is a structure diagram based on ism sharing between two vms.

    |-------------------------------------------------------------------------------------------------------------|
    | |------------------------------------------------|       |------------------------------------------------| |
    | | Guest                                          |       | Guest                                          | |
    | |                                                |       |                                                | |
    | |   ----------------                             |       |   ----------------                             | |
    | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
    | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
    | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
    | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
    | |    |  |                -------------------     |       |    |  |                --------------------    | |
    | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
    | |    |  |                -------------------     |       |    |  |                --------------------    | |
    | |                                |               |       |                               |                | |
    | |                                |               |       |                               |                | |
    | | Qemu                           |               |       | Qemu                          |                | |
    | |--------------------------------+---------------|       |-------------------------------+----------------| |
    |                                  |                                                       |                  |
    |                                  |                                                       |                  |
    |                                  |------------------------------+------------------------|                  |
    |                                                                 |                                           |
    |                                                                 |                                           |
    |                                                   --------------------------                                |
    |                                                    | M1 |   | M2 |   | M3 |                                 |
    |                                                   --------------------------                                |
    |                                                                                                             |
    | HOST                                                                                                        |
    ---------------------------------------------------------------------------------------------------------------

## Inspiration

    Our design idea for virtio-ism comes from IBM's ISM device, to pay tribute,
    we directly name this device "ism".

    Information about IBM ism device and SMC:
      1. SMC reference: https://www.ibm.com/docs/en/zos/2.5.0?topic=system-shared-memory-communications
      2. SMC-Dv2 and ISMv2 introduction: https://www.newera.com/INFO/SMCv2_Introduction_10-15-2020.pdf
      3. ISM device: https://www.ibm.com/docs/en/linux-on-systems?topic=n-ism-device-driver-1
      4. SMC protocol (including SMC-D): https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202_2.pdf
      5. SMC-D FAQ: https://www.ibm.com/support/pages/system/files/inline-files/2021-02-09-SMC-D-FAQ.pdf

## ISM VLAN

    Since SMC uses TCP to handshake with IP facilities, virtio-ism device is not
    bound to existing IP device, and the latest ISMv2 device doesn't require
    VLAN. So it is not necessary for virtio-ism to support VLAN attributes.

## Live Migration

    Currently SMC-D doesn't support migration to another device or fallback. And
    SMC-R supports migration to another link, no fallback.

    So we may not support live migration for the time being.

## About hot plugging of the ism device

    Hot plugging of devices is a heavier, possibly failed, time-consuming, and
    less scalable operation. So, we don't plan to support it for now.


# Usage (SMC as example)

    There is one of possible use cases:

    1. SMC calls the interface ism_alloc_region() of the ism driver to return
       the location of a memory region in the PCI space and a token.
    2. The ism driver mmap the memory region and return to SMC with the token
    3. SMC passes the token to the connected peer
    4. the peer calls the ism driver interface ism_attach_region(token) to
       get the location of the PCI space of the shared memory
    5. The connected pair communicating through the shared memory

# Comparison with existing technology

## ivshmem or ivshmem 2.0 of Qemu

   1. ivshmem 1.0 is a large piece of memory that can be seen by all devices
      that use this VM, so the security is not enough.

   2. ivshmem 2.0 is a shared memory belonging to a VM that can be read-only by
      all other VMs that use the ivshmem 2.0 shared memory device, which also
      does not meet our needs in terms of security.

## vhost-pci and virtiovhostuser

    1. does not support dynamic allocation
    2. one device just support connect to one vm


# POC CODE

There are no functions related to eventq and perm yet.
This implementation is for V2 version spec.

## Qemu (virtio ism device):

     https://github.com/fengidri/qemu/compare/7d66b74c4dd0d74d12c1d3d6de366242b13ed76d...ism-upstream-1216?expand=1

    Start qemu with option "--device virtio-ism-pci,disable-legacy=on, disable-modern=off".

##  Kernel (virtio ism driver and smc support):

     https://github.com/fengidri/linux-kernel-virtio-ism/compare/6f8101eb21bab480537027e62c4b17021fb7ea5d...ism-upstream-1223


### SMC

    Support SMC-D works with virtio-ism.

    Use SMC with virtio-ism to accelerate inter-VM communication.

    1. insmod virtio-ism and smc module.
    2. use smc-tools [1] to get the device name of SMC-D based on virtio-ism.

      $ smcd d # here is _virtio2_
      FID  Type  PCI-ID        PCHID  InUse  #LGs  PNET-ID
      0000 0     virtio2       0000   Yes       1  *C1

    3. add the nic and SMC-D device to the same pnet, do it in both client and server.

      $ smc_pnet -a -I eth1 c1 # use eth1 to setup SMC connection
      $ smc_pnet -a -D virtio2 c1 # virtio2 is the virtio-ism device

    4. use SMC to accelerate your application, smc_run in [1] can do this.

      # smc_run use LD_PRELOAD to hijack socket syscall with AF_SMC
      $ smc_run sockperf server --tcp # run in server
      $ smc_run sockperf tp --tcp -i a.b.c.d # run in client

    [1] https://github.com/ibm-s390-linux/smc-tools

    Notice: The current POC state, we only tested some basic functions.

### App inside user space

    The ism driver provide /dev/vismX interface, allow users to use Virtio-ISM
    device in user space directly.

    Try tools/virtio/virtio-ism/virtio-ism-mmap

    Usage:
         cd tools/virtio/virtio-ism/; make
         insmode virtio-ism.ko

    case1: communicate

       vm1: ./virtio-ism-mmap alloc -> token
       vm2: ./virtio-ism-mmap attach -t <token> --write-msg AAAA --commit

       vm2 will write msg to shared memory, then notify vm1. After vm1 receive
       notify, then read from shared memory.

    case2: ping-pong test.

        vm1: ./virtio-ism-mmap server
        vm2: ./virtio-ism-mmap -i 192.168.122.101 pp

        1. server alloc one ism region
        2. client get the token by tcp

        3. client commit(kick) to server, server recv notify, commit(kick) to client
        4. loop #3

    case3: throughput test.

        vm1: ./virtio-ism-mmap server
        vm2: ./virtio-ism-mmap -i 192.168.122.101 tp

        1. server alloc one ism region
        2. client get the token by tcp

        3. client write 1M data to ism region
        4. client commit(kick) to server
        5. server recv notify, copy the data, the commit(kick) back to client
        6. loop #3-#5

    case4: throughput test with protocol defined by user.

        vm1: ./virtio-ism-mmap server
        vm2: ./virtio-ism-mmap -i 192.168.122.101 tp --polling --tp-chunks 15 --msg-size 64k -n 50000

        Used the ism region as a ring.

        In this scene, client and server are in the polling mode. Test it on
        my machine, the maximum can reach 12GBps

# References

    [1] https://projectacrn.github.io/latest/tutorials/enable_ivshmem.html
    [2] https://dl.acm.org/doi/10.1145/2847562
    [3] https://hal.archives-ouvertes.fr/hal-00368622/document
    [4] https://lwn.net/Articles/711071/
    [5] https://lore.kernel.org/netdev/20220720170048.20806-1-tonylu@linux.alibaba.com/T/


If there are any problems, please point them out.
Hope to hear from you, thank you.

v3:
   1. support to apply memory from vm
   2. add query operation
   3. optimize the description of spec and enrich some details
   4. use the communication domain as a term
   5. replace gid with cdid

v2:
   1. add Attach/Detach event
   2. add Events Filter
   3. allow Alloc/Attach huge region
   4. remove host/guest terms

v1:
   1. cover letter adding explanation of ism vlan
   2. spec add gid
   3. explain the source of ideas about ism
   4. POC support virtio-ism-smc.ko virtio-ism-dev.ko and support virtio-ism-mmap


Xuan Zhuo (1):
  virtio-ism: introduce new device virtio-ism

 conformance.tex |  26 +++
 content.tex     |   1 +
 virtio-ism.tex  | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 600 insertions(+)
 create mode 100644 virtio-ism.tex

-- 
2.32.0.3.g01195cf9f


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-02-09  3:30 [PATCH v3 0/1] introduce virtio-ism: internal shared memory device Xuan Zhuo
@ 2023-02-09  3:30 ` Xuan Zhuo
  2023-02-09  3:35   ` [virtio-comment] " Parav Pandit
                     ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-02-09  3:30 UTC (permalink / raw)
  To: virtio-comment
  Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
	gerry, xuanzhuo, mst, cohuck, jasowang, Jan Kiszka, wintera,
	kgraul, wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic

An ISM(Internal Shared Memory) device provides the ability to access memory
shared between multiple devices. This allows low-overhead communication in
presence of such memory. For example, memory can be shared with guests of
multiple virtual machines running on the same host, with each virtual machine
including an ism device and with the guests getting the shared memory by the ism
devices.

An ism device can communicate with multiple peers simultaneously. This
communication can be dynamically started and ended.

|-------------------------------------------------------------------------------------------------------------|
| |------------------------------------------------|       |------------------------------------------------| |
| | Guest                                          |       | Guest                                          | |
| |                                                |       |                                                | |
| |   ----------------                             |       |   ----------------                             | |
| |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
| |   ----------------       |      |      |       |       |   ----------------               |      |      | |
| |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
| |    |  |                  |      |      |       |       |    |  |                          |      |      | |
| |    |  |                -------------------     |       |    |  |                --------------------    | |
| |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
| |    |  |                -------------------     |       |    |  |                --------------------    | |
| |                                |               |       |                               |                | |
| |                                |               |       |                               |                | |
| | Qemu                           |               |       | Qemu                          |                | |
| |--------------------------------+---------------|       |-------------------------------+----------------| |
|                                  |                                                       |                  |
|                                  |                                                       |                  |
|                                  |------------------------------+------------------------|                  |
|                                                                 |                                           |
|                                                                 |                                           |
|                                                   --------------------------                                |
|                                                    | M1 |   | M2 |   | M3 |                                 |
|                                                   --------------------------                                |
|                                                                                                             |
| HOST                                                                                                        |
---------------------------------------------------------------------------------------------------------------

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Signed-off-by: Hans Zhang <hans@linux.alibaba.com>
Signed-off-by: He Rongguang <herongguang@linux.alibaba.com>
Suggested-by: Halil Pasic <pasic@linux.ibm.com>
---
 conformance.tex |  26 +++
 content.tex     |   1 +
 virtio-ism.tex  | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 600 insertions(+)
 create mode 100644 virtio-ism.tex

diff --git a/conformance.tex b/conformance.tex
index c3c1d3e..0a1456a 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
 \end{itemize}
 
+\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
+
+A ISM driver MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
+\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
+\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
+\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
+\end{itemize}
+
 \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
 
 A device MUST conform to the following normative statements:
@@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
 \end{itemize}
 
+\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
+
+A ISM device MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
+\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
+\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
+\end{itemize}
+
 \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
 A conformant implementation MUST be either transitional or
 non-transitional, see \ref{intro:Legacy
diff --git a/content.tex b/content.tex
index 96f4723..fe02aec 100644
--- a/content.tex
+++ b/content.tex
@@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
 \input{virtio-scmi.tex}
 \input{virtio-gpio.tex}
 \input{virtio-pmem.tex}
+\input{virtio-ism.tex}
 
 \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
 
diff --git a/virtio-ism.tex b/virtio-ism.tex
new file mode 100644
index 0000000..a1720d8
--- /dev/null
+++ b/virtio-ism.tex
@@ -0,0 +1,573 @@
+\section{ISM Device}\label{sec:Device Types / ISM Device}
+
+\begin{lstlisting}
+|-------------------------------------------------------------------------------------------------------------|
+| |------------------------------------------------|    |------------------------------------------------|    |
+| | VM                      [M1]   [M2]   [M3]     |    | VM                             [M2]   [M3]     |    |
+| |                          |      |      |       |    |                                 |      |       |    |
+| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
+| |   |             driver   |      |      |  |    |    |   |             driver          |      |  |    |    |
+| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
+| |    |cq|                  |map   |map   |map    |    |    |cq|                         |map   |map    |    |
+| |    |  |                  |      |      |       |    |    |  |                         |      |       |    |
+| |    |  |                -------------------     |    |    |  |                -------------------     |    |
+| |----|--|----------------|  device memory  |-----|    |----|--|----------------|  device memory  |-----|    |
+| |    |  |                -------------------     |    |    |  |                -------------------     |    |
+| |                                |               |    |                                |               |    |
+| |                                |               |    |                                |               |    |
+| |                                |               |    |                                |               |    |
+| |--------------------------------+---------------|    |--------------------------------+---------------|    |
+|                                  |                                                       |                  |
+|                                  |                                                       |                  |
+|                                  |------------------------------+------------------------|                  |
+|                                                                 |                                           |
+|                                                                 |                                           |
+|                                                   --------------------------                                |
+|                                                    | M1 |   | M2 |   | M3 |                                 |
+|                                                   --------------------------                                |
+|                                                                                                             |
+|                                                                                                             |
+|-------------------------------------------------------------------------------------------------------------|
+\end{lstlisting}
+
+An ISM(Internal Shared Memory) device provides the ability to access memory
+shared between multiple devices. This allows low-overhead communication in
+presence of such memory. For example, memory can be shared with guests of
+multiple virtual machines running on the same host, with each virtual machine
+including an ism device and with the guests getting the shared memory by the ism
+devices.
+
+An ism device can communicate with multiple peers simultaneously. This
+communication can be dynamically started and ended.
+
+All the devices with the ability to communicate with each other form a
+communication domain. Two devices from different communication domains can't
+communicate.
+
+The device memory of the ism device is divided into multiple chunks with the
+same size. Every ism region contains multiple chunks. When communicating between
+two devices, an ism region is used as a shared memory.
+
+The ism region is the basis for communication between ism devices.
+
+The process of communication between two drivers is that one driver allocates an
+ism region and obtains a token. Then the peer uses this token to attach the same
+ism region, the two drivers realize the memory(ism region) sharing. The driver
+can also notify peer by kick notify-address the ism region has been updated.
+
+An ism region can be referred by its \field{token}, or the \field{offset}.
+The \field{offset} is the offset of the first chunk inside the ism region
+starting from the device memory head.
+
+\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
+	44
+
+\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
+\begin{description}
+\item[0] controlq
+\item[1] eventq
+\end{description}
+
+\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
+\begin{description}
+\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
+    don't need to provide memory for alloc/attach operation.
+
+\end{description}
+
+\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
+
+\begin{lstlisting}
+struct virtio_ism_config {
+	le128 cdid;
+	le64 devid;
+	le64 chunk_size;
+	le64 notify_size;
+};
+\end{lstlisting}
+
+\begin{description}
+    \item[\field{cdid}] This is used to identify the communication domain. Only
+        ism devices with the same \field{cdid} can communicate. A \field{cdid}
+        is world-wide unique in a sense that there not be another communication
+        domain with the same \field{cdid}.
+
+    \item[\field{devid}] This is used to identify the ism device in the single
+        communication domain.
+
+    \item[\field{chunk_size}] This is the size of the ism chunk. The device
+        memory is divided into multiple chunks. Every ism chunk has the same
+        size.
+
+    \item[\field{notify_size}] This is the size of the ism notify-address. The
+        notify-address is used to notify the device that the content of the
+        ism region has been updated.
+
+\end{description}
+
+\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
+
+The device MUST ensure that the \field{cdid} of the device on the same
+communication domain is same. The \field{cdid} MUST be a version 4 UUID as
+specified by \hyperref[intro:rfc4122]{[RFC4122]}.
+
+In the single communication domain, the device MUST ensure that the \field{devid}
+is unique.
+
+\field{chunk_size} MUST be a power of two.
+
+\subsection{Physical Memory}\label{sec:Device Types / ISM Device / Event}
+
+The ism region is on the device memory. It is a physical memory for driver, but
+it is not for device. The device has the ability to modify the physical memory
+corresponding to the device memory.
+
+If VIRTIO_ISM_F_DEV_MEM is not negotiated, physical memory comes from driver.
+The driver pass its memory to the device when alloc a new ism region. The
+device map this physical memory into an ism region. If VIRTIO_ISM_F_DEV_MEM
+is negotiated, physical memory is provided by the device.
+
+For attach operation, if VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver also
+provides memory to device. Under normal circumstances, this memory will not be
+used. This is to limit driver to take up too much memory based on attach
+operation.
+
+\subsection{Event}\label{sec:Device Types / ISM Device / Event}
+
+The ism device supports event notification of the ism region. When a device
+kick/attach/detach a region, other ism region referrers will receive related
+events.
+
+A buffer received from eventq can contain multiple event structures.
+
+\begin{lstlisting}
+#define   VIRTIO_ISM_EVENT_UPDATE (1 << 0)
+#define   VIRTIO_ISM_EVENT_ATTACH (1 << 1)
+#define   VIRTIO_ISM_EVENT_DETACH (1 << 2)
+#define   VIRTIO_ISM_EVENT_BUFFER_FREE (1 << 3)  // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
+\end{lstlisting}
+
+\begin{description}
+    \item[VIRTIO_ISM_EVENT_UPDATE] The driver kick the notify-address to notify
+        other peers of the update event of the ism region content.
+
+    \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
+    \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
+    \item[VIRTIO_ISM_EVENT_BUFFER_FREE] The buffer allocated by this driver has been free.
+\end{description}
+
+The event structures:
+\begin{lstlisting}
+struct virtio_ism_event_update {
+	le64 ev_type;
+	le64 offset;
+	le64 devid;
+};
+
+struct virtio_ism_event_attach_detach {
+	le64 ev_type;
+	le64 offset;
+	le64 devid;
+	le64 peers;
+};
+
+struct virtio_ism_event_buffer_free {
+	le64 ev_type;
+    void *buffer;
+};
+
+\end{lstlisting}
+
+\begin{description}
+\item[\field{ev_type}] The type of event, the driver can get the size of the
+    structure based on this.
+
+        If \field{ev_type} is VIRTIO_ISM_EVENT_UPDATE, the corresponding structure is struct virtio_ism_event_update.
+        If \field{ev_type} is VIRTIO_ISM_EVENT_ATTACH or VIRTIO_ISM_EVENT_DETACH, the corresponding structure struct virtio_ism_event_attach_detach.
+        If \field{ev_type} is VIRTIO_ISM_EVENT_BUFFER_FREE, the corresponding structure is struct virtio_ism_event_buffer_free.
+
+\item[\field{offset}] The offset of ism regions with the event.
+
+\item[\field{devid}] \field{devid} of the device that generated events.
+\item[\field{peers}] The number of the peers referring to this ism region. (does not include the
+    device that receiving this event)
+
+\item[\field{buffer}] The buffer passed to device when alloc/attach a new ism region.
+
+\end{description}
+
+\subsection{Permissions}\label{sec:Device Types / ISM Device / Permission}
+
+The permissions of an ism region determine whether this ism region can be
+attached and the read and write permissions after attach. The driver can set the
+default permissions for all devices, or set the dedicated permissions for a
+certain device. If there are dedicated permissions for a device, the default
+permissions are invalid for this device.
+
+For example, the driver can set the default permission of an ism region as
+read-only, and set the dedicated permissions as writable for a device with devid
+0xff. Then all devices except 0xff in such a communication domain can attach
+this ism region and only have read-only permission. And the device 0xff can
+write to the ism region after attaching.
+
+When a driver has the management permission of the ism region, then it can
+modify the permissions of this ism region. By default, only the device that
+allocated the ism region has this permission.
+
+\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
+
+\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
+
+The device MUST generate a \field{devid}. \field{devid} remains unchanged
+during reset. \field{devid} MUST NOT be 0.
+
+The device exposes memory to the driver based on Shared Memory Regions
+\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
+However, it does not need to allocate physical memory during initialization.
+
+The \field{shmid} of a region MUST be one of the following
+\begin{lstlisting}
+enum virtio_ism_shm_id {
+	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
+	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
+	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
+};
+\end{lstlisting}
+
+The device memory area with VIRTIO_ISM_SHM_ID_REGIONS is used to provide ism regions.
+
+The device MUST also provide a memory area with VIRTIO_ISM_SHM_ID_NOTIFY to the
+driver. This memory area is used to notify the device with update event, and
+each ism region MUST have a corresponding notify-address inside this area, and
+the size of the notify-address is \field{notify_size}.
+
+\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
+
+The driver MUST NOT access any chunk before it is referred by one ism region.
+
+\subsection{Control Virtqueue}\label{sec:Device Types / ISM Device / Control Virtqueue}
+
+The driver uses the control virtqueue send commands to implement operations on
+the ism region and some global configurations.
+
+All commands are of the following form:
+\begin{lstlisting}
+struct virtio_ism_ctrl {
+	u8 class;
+	u8 command;
+	u8 command_specific_data[];
+	u8 ack;
+	u8 command_specific_data_reply[];
+};
+
+/* ack values */
+#define VIRTIO_ISM_OK      0
+#define VIRTIO_ISM_ERR     255
+
+#define VIRTIO_ISM_ENOENT  2
+#define VIRTIO_ISM_E2BIG   7
+#define VIRTIO_ISM_ENOMEM  12
+#define VIRTIO_ISM_ENOSPEC 28
+
+#define VIRTIO_ISM_PERM_EATTACH 100
+#define VIRTIO_ISM_PERM_EREAD   101
+#define VIRTIO_ISM_PERM_EWRITE  102
+\end{lstlisting}
+
+The \field{class}, \field{command} and command-specific-data are set by the
+driver, and the device sets the \field{ack} byte and optionally
+\field{command-specific-data-reply}.
+
+\subsection{Device Operation}\label{sec:Device Types / ISM Driver / Device Operation}
+
+\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
+
+Based on controlq, the driver can request to allocate an ism region.
+
+The reply from the device will carry a token, which can be passed
+to other driver for attaching to this ism region.
+
+The \field{command_specific_data} has the following format:
+\begin{lstlisting}
+struct virtio_ism_ctrl_alloc {
+	le64 size;
+    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
+};
+\end{lstlisting}
+
+The \field{command_specific_data_reply} has the following format:
+\begin{lstlisting}
+struct virtio_ism_ctrl_alloc_reply {
+	le64 token;
+	le64 num;
+    le64 chunks[];
+};
+\end{lstlisting}
+
+The \field{class} and \field{command}:
+\begin{lstlisting}
+#define VIRTIO_ISM_CTRL_ALLOC  0
+	#define VIRTIO_ISM_CTRL_ALLOC_REGION 0
+\end{lstlisting}
+
+
+\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
+
+For the command VIRTIO_ISM_CTRL_ALLOC_REGION, the device MUST first look for the
+free chunks from the device memory. And the physical memory is mapped to these
+chunks in order. If VIRTIO_ISM_F_DEV_MEM is negotiated, then the device MUST
+provide physical memory by itself. Otherwise, the memory provided by the driver
+will be used. If there is no enough free chunk, the device MUST set \field{ack}
+to VIRTIO_ISM_NOSPEC. If new physical memory cannot be allocated by the device,
+the device MUST set \field{ack} to VIRTIO_ISM_NOMEM.
+
+The device sets \field{ack} to VIRTIO_ISM_OK after successfully allocating the
+ism region. A new token MUST be dynamically created for this ism region
+simultaneously. The token MUST be unique inside this communication domain. And
+the device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
+is the number of the chunks.
+
+The device MUST clear the memory of this ism region before committing to the
+driver.
+
+If \field{size} is smaller than \field{chunk_size}, the ism region also
+occupies one chunk.
+
+If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
+to VIRTIO_ISM_E2BIG.
+
+\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
+
+If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
+with \field{buffer}. The size of the buffer MUST equal to the size of the ism
+region that we want to allocate.
+
+\subsubsection{Query ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Query ISM Region}
+Based on controlq, the driver can query the information of an ism region.
+
+The \field{command_specific_data} has the following format:
+\begin{lstlisting}
+struct virtio_ism_ctrl_query {
+	le64 token;
+};
+\end{lstlisting}
+
+The \field{command_specific_data_reply} has the following format:
+\begin{lstlisting}
+struct virtio_ism_ctrl_query_reply {
+	le64 size;
+	le64 permissions;
+};
+\end{lstlisting}
+
+The \field{class} and \field{command}:
+\begin{lstlisting}
+#define VIRTIO_ISM_CTRL_QUERY  1
+	#define VIRTIO_ISM_CTRL_QUERY_INFO 0
+\end{lstlisting}
+
+\devicenormative{\subparagraph}{Query ISM Region}{Device Types / ISM Device / Device Operation / Query ISM Region}
+
+If the ism region specified by \field{token} does not exist, the device MUST set
+\field{ack} to VIRTIO_ISM_ENOENT. Otherwise, the device MUST fill the
+\field{size} with the size of the ism region. The device MUST get the
+permissions of this ism region for the device-self, then fill in the
+\field{permissions}.
+
+\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
+
+Based on controlq, the driver can request to attach an ism region with a
+specified token.
+
+The \field{command_specific_data} has the following format:
+\begin{lstlisting}
+struct virtio_ism_ctrl_attach {
+	le64 token;
+	le64 rw_perm;
+    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
+};
+\end{lstlisting}
+
+The \field{command_specific_data_reply} has the following format:
+\begin{lstlisting}
+struct virtio_ism_ctrl_attach_reply {
+	le64 size;
+	le64 num;
+    le64 chunks[];
+};
+\end{lstlisting}
+
+The \field{class} and \field{command}:
+\begin{lstlisting}
+#define VIRTIO_ISM_CTRL_ATTACH  2
+	#define VIRTIO_ISM_CTRL_ATTACH_REGION 0
+\end{lstlisting}
+
+\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
+
+If the ism region specified by \field{token} does not exist, the device MUST set
+\field{ack} to VIRTIO_ISM_ENOENT.
+
+The device MUST first look for the free chunks from the device memory. And the
+physical memory of the ism region is mapped to these chunks in order. If there is
+no enough free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
+The device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
+is the number of the chunks. \field{size} is the size of this ism region.
+
+The device MUST set \field{ack} to VIRTIO_ISM_OK after successfully finding and
+assigning the physical memory of this ism region into the device memory.
+
+If the device does not have the VIRTIO_ISM_PERM_ATTACH permission for this ism
+region, the device MUST set \field{ack} to VIRTIO_ISM_PERM_EATTACH. If
+\field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not have the
+VIRTIO_ISM_PERM_READ permission for this region, the device MUST set \field{ack}
+to VIRTIO_ISM_PERM_EREAD. If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but
+the device does not have the VIRTIO_ISM_PERM_WRITE permission for this region,
+the device MUST set \field{ack} to VIRTIO_ISM_PERM_EWRITE.
+
+The device MUST set the read and write permissions of the physical memory inside
+the device memory based on \field{rw_perm}.
+
+If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
+to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
+
+The device MUST generate a VIRTIO_ISM_EVENT_ATTACH event to other peers.
+
+\drivernormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
+
+If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
+with \field{buffer}. The size of the buffer MUST equal to the size of the ism
+region that we want to allocate.
+
+If \field {size} is less than \field{num} times \field {chunk_size}, then only
+header part of the last chunk is effective.
+
+\subsubsection{Detach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Detach ISM Region}
+Based on controlq, the device can release references to the ism region.
+
+The \field{command_specific_data} has the following format:
+\begin{lstlisting}
+struct virtio_ism_ctrl_detach {
+	le64 token;
+};
+\end{lstlisting}
+
+The \field{class} and \field{command}:
+\begin{lstlisting}
+#define VIRTIO_ISM_CTRL_DETACH  3
+	#define VIRTIO_ISM_CTRL_DETACH_REGION 0
+\end{lstlisting}
+
+\devicenormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
+
+If the ism region specified by \field{token} is not attached, the device MUST set
+\field{ack} to VIRTIO_ISM_ENOENT.
+
+The device MUST generate a VIRTIO_ISM_EVENT_DETACH event to other peers.
+
+If VIRTIO_ISM_F_DEV_MEM is not negotiated and no one is referred to this ism
+region, the device who allocated this ism region MUST generate a
+VIRTIO_ISM_EVENT_BUFFER_FREE event to the driver.
+
+\drivernormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
+
+If VIRTIO_ISM_F_DEV_MEM is not negotiated and the driver got the ism region by
+attach operation, then the buffer is free after the detach operation.
+
+\subsubsection{Grant ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Grant ISM Region}
+Based on controlq, the driver can set the permissions for each ism
+region.
+
+The \field{command_specific_data} has the following format:
+\begin{lstlisting}
+struct virtio_ism_ctrl_grant_default {
+	le64 token;
+	le64 permissions;
+};
+
+struct virtio_ism_ctrl_grant_dev {
+	le64 token;
+	le64 permissions;
+	le64 peer_devid;
+};
+\end{lstlisting}
+
+The \field{class} and \field{command}:
+\begin{lstlisting}
+#define VIRTIO_ISM_CTRL_GRANT  4
+	#define VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT    0
+	#define VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE 1
+\end{lstlisting}
+
+The permissions:
+\begin{lstlisting}
+#define VIRTIO_ISM_PERM_READ       (1 << 0)
+#define VIRTIO_ISM_PERM_WRITE      (1 << 1)
+
+#define VIRTIO_ISM_PERM_ATTACH     (1 << 2)
+
+#define VIRTIO_ISM_PERM_MANAGE     (1 << 3)
+#define VIRTIO_ISM_PERM_CLEAN_DEFAULT     (1 << 4)
+\end{lstlisting}
+
+\begin{description}
+\item[VIRTIO_ISM_PERM_READ] read permission
+\item[VIRTIO_ISM_PERM_WRITE] write permission
+\item[VIRTIO_ISM_PERM_ATTACH] attach permission
+
+\item[VIRTIO_ISM_PERM_MANAGE] Management permission, the device with this
+	permission can modify the permission of this ism region. By default, only
+	the device that allocated the region has this permission.
+
+\item[VIRTIO_ISM_PERM_DENY_DEFAULT] Clean all permissions of default, just used
+    with VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE.
+
+\end{description}
+
+VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT corresponds to virtio_ism_ctrl_grant_default;
+VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE corresponds to virtio_ism_ctrl_grant_dev;
+
+Permission control is divided into two categories, one is the dedicated
+permissions for a certain device, and the other is the default permissions for
+all device.
+
+\devicenormative{\subparagraph}{Grant ISM Region}{Device Types / ISM Device / Device Operation / Grant ISM Region}
+
+If the ism region specified by \field{token} is not attached, the device MUST set
+\field{ack} to VIRTIO_ISM_ENOENT.
+
+The default permissions of the newly allocated ism region MUST be  (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH).
+The device that allocated the ism region MUST have the permissions (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH | VIRTIO_ISM_PERM_MANAGE) for this region.
+
+For a certain device, if there is dedicated permissions, the default
+permissions are illegal to this device.
+
+\subsubsection{Inform Update Event IRQ Vector}\label{sec:Device Types / ISM Device / Device Operation / Inform Update Event IRQ Vector}
+
+For the ism region update event, the driver can bind an interrupt to the ism
+region. Then this ism region's update event will no longer be passed through
+eventq, but the interrupt will be directly triggered.
+
+The \field{command_specific_data} has the following format:
+\begin{lstlisting}
+struct virtio_ism_ctrl_irq_vector {
+	le64 token;
+	le64 vector;
+};
+\end{lstlisting}
+
+The \field{class} and \field{command}:
+\begin{lstlisting}
+#define VIRTIO_ISM_CTRL_EVENT_VECTOR  5
+	#define VIRTIO_ISM_CTRL_EVENT_VECTOR_SET 0
+\end{lstlisting}
+
+
+\devicenormative{\subparagraph}{Inform Event IRQ Vector}{Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
+
+The device MUST record the relationship between the ism region and the vector
+notified by the driver, and notify the update event of this region to driver
+by the corresponding vector. And the device MUST NOT use eventq to send the
+update event of this ism region.
+
+
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* RE: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-02-09  3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
@ 2023-02-09  3:35   ` Parav Pandit
  2023-02-09  3:36     ` Xuan Zhuo
  2023-03-07 11:15   ` [virtio-dev] " Xuan Zhuo
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Parav Pandit @ 2023-02-09  3:35 UTC (permalink / raw)
  To: Xuan Zhuo, virtio-comment
  Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
	gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic

Hi Xuan,

> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Xuan Zhuo

>  conformance.tex |  26 +++
>  content.tex     |   1 +
>  virtio-ism.tex  | 573 ++++++++++++++++++++++++++++++++++++++++++++++++

Can you please rebase your patches?
Devices are placed under their own directory and maintained in multiple smaller files.

device-types/ism/...


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: RE: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-02-09  3:35   ` [virtio-comment] " Parav Pandit
@ 2023-02-09  3:36     ` Xuan Zhuo
  0 siblings, 0 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-02-09  3:36 UTC (permalink / raw)
  To: Parav Pandit
  Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
	gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic,
	virtio-comment

On Thu, 9 Feb 2023 03:35:33 +0000, Parav Pandit <parav@nvidia.com> wrote:
> Hi Xuan,
>
> > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> > open.org> On Behalf Of Xuan Zhuo
>
> >  conformance.tex |  26 +++
> >  content.tex     |   1 +
> >  virtio-ism.tex  | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
>
> Can you please rebase your patches?
> Devices are placed under their own directory and maintained in multiple smaller files.
>
> device-types/ism/...


OK, will fix.

Thanks.



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-02-09  3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
  2023-02-09  3:35   ` [virtio-comment] " Parav Pandit
@ 2023-03-07 11:15   ` Xuan Zhuo
  2023-03-15 11:15     ` Xuan Zhuo
  2023-03-23 14:46   ` [virtio-dev] Re: [virtio-comment] " Halil Pasic
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Xuan Zhuo @ 2023-03-07 11:15 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
	gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic,
	virtio-comment


Any comments for this. ^_^

Thanks.

On Thu,  9 Feb 2023 11:30:56 +0800, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> An ISM(Internal Shared Memory) device provides the ability to access memory
> shared between multiple devices. This allows low-overhead communication in
> presence of such memory. For example, memory can be shared with guests of
> multiple virtual machines running on the same host, with each virtual machine
> including an ism device and with the guests getting the shared memory by the ism
> devices.
>
> An ism device can communicate with multiple peers simultaneously. This
> communication can be dynamically started and ended.
>
> |-------------------------------------------------------------------------------------------------------------|
> | |------------------------------------------------|       |------------------------------------------------| |
> | | Guest                                          |       | Guest                                          | |
> | |                                                |       |                                                | |
> | |   ----------------                             |       |   ----------------                             | |
> | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
> | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
> | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
> | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
> | |    |  |                -------------------     |       |    |  |                --------------------    | |
> | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
> | |    |  |                -------------------     |       |    |  |                --------------------    | |
> | |                                |               |       |                               |                | |
> | |                                |               |       |                               |                | |
> | | Qemu                           |               |       | Qemu                          |                | |
> | |--------------------------------+---------------|       |-------------------------------+----------------| |
> |                                  |                                                       |                  |
> |                                  |                                                       |                  |
> |                                  |------------------------------+------------------------|                  |
> |                                                                 |                                           |
> |                                                                 |                                           |
> |                                                   --------------------------                                |
> |                                                    | M1 |   | M2 |   | M3 |                                 |
> |                                                   --------------------------                                |
> |                                                                                                             |
> | HOST                                                                                                        |
> ---------------------------------------------------------------------------------------------------------------
>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
> Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
> Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
> Signed-off-by: Hans Zhang <hans@linux.alibaba.com>
> Signed-off-by: He Rongguang <herongguang@linux.alibaba.com>
> Suggested-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  conformance.tex |  26 +++
>  content.tex     |   1 +
>  virtio-ism.tex  | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 600 insertions(+)
>  create mode 100644 virtio-ism.tex
>
> diff --git a/conformance.tex b/conformance.tex
> index c3c1d3e..0a1456a 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
>  \end{itemize}
>
> +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> +
> +A ISM driver MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\end{itemize}
> +
>  \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
>
>  A device MUST conform to the following normative statements:
> @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
>  \end{itemize}
>
> +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> +
> +A ISM device MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> +\end{itemize}
> +
>  \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
>  A conformant implementation MUST be either transitional or
>  non-transitional, see \ref{intro:Legacy
> diff --git a/content.tex b/content.tex
> index 96f4723..fe02aec 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
>  \input{virtio-scmi.tex}
>  \input{virtio-gpio.tex}
>  \input{virtio-pmem.tex}
> +\input{virtio-ism.tex}
>
>  \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>
> diff --git a/virtio-ism.tex b/virtio-ism.tex
> new file mode 100644
> index 0000000..a1720d8
> --- /dev/null
> +++ b/virtio-ism.tex
> @@ -0,0 +1,573 @@
> +\section{ISM Device}\label{sec:Device Types / ISM Device}
> +
> +\begin{lstlisting}
> +|-------------------------------------------------------------------------------------------------------------|
> +| |------------------------------------------------|    |------------------------------------------------|    |
> +| | VM                      [M1]   [M2]   [M3]     |    | VM                             [M2]   [M3]     |    |
> +| |                          |      |      |       |    |                                 |      |       |    |
> +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> +| |   |             driver   |      |      |  |    |    |   |             driver          |      |  |    |    |
> +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> +| |    |cq|                  |map   |map   |map    |    |    |cq|                         |map   |map    |    |
> +| |    |  |                  |      |      |       |    |    |  |                         |      |       |    |
> +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> +| |----|--|----------------|  device memory  |-----|    |----|--|----------------|  device memory  |-----|    |
> +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> +| |                                |               |    |                                |               |    |
> +| |                                |               |    |                                |               |    |
> +| |                                |               |    |                                |               |    |
> +| |--------------------------------+---------------|    |--------------------------------+---------------|    |
> +|                                  |                                                       |                  |
> +|                                  |                                                       |                  |
> +|                                  |------------------------------+------------------------|                  |
> +|                                                                 |                                           |
> +|                                                                 |                                           |
> +|                                                   --------------------------                                |
> +|                                                    | M1 |   | M2 |   | M3 |                                 |
> +|                                                   --------------------------                                |
> +|                                                                                                             |
> +|                                                                                                             |
> +|-------------------------------------------------------------------------------------------------------------|
> +\end{lstlisting}
> +
> +An ISM(Internal Shared Memory) device provides the ability to access memory
> +shared between multiple devices. This allows low-overhead communication in
> +presence of such memory. For example, memory can be shared with guests of
> +multiple virtual machines running on the same host, with each virtual machine
> +including an ism device and with the guests getting the shared memory by the ism
> +devices.
> +
> +An ism device can communicate with multiple peers simultaneously. This
> +communication can be dynamically started and ended.
> +
> +All the devices with the ability to communicate with each other form a
> +communication domain. Two devices from different communication domains can't
> +communicate.
> +
> +The device memory of the ism device is divided into multiple chunks with the
> +same size. Every ism region contains multiple chunks. When communicating between
> +two devices, an ism region is used as a shared memory.
> +
> +The ism region is the basis for communication between ism devices.
> +
> +The process of communication between two drivers is that one driver allocates an
> +ism region and obtains a token. Then the peer uses this token to attach the same
> +ism region, the two drivers realize the memory(ism region) sharing. The driver
> +can also notify peer by kick notify-address the ism region has been updated.
> +
> +An ism region can be referred by its \field{token}, or the \field{offset}.
> +The \field{offset} is the offset of the first chunk inside the ism region
> +starting from the device memory head.
> +
> +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> +	44
> +
> +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> +\begin{description}
> +\item[0] controlq
> +\item[1] eventq
> +\end{description}
> +
> +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> +\begin{description}
> +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> +    don't need to provide memory for alloc/attach operation.
> +
> +\end{description}
> +
> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_ism_config {
> +	le128 cdid;
> +	le64 devid;
> +	le64 chunk_size;
> +	le64 notify_size;
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> +    \item[\field{cdid}] This is used to identify the communication domain. Only
> +        ism devices with the same \field{cdid} can communicate. A \field{cdid}
> +        is world-wide unique in a sense that there not be another communication
> +        domain with the same \field{cdid}.
> +
> +    \item[\field{devid}] This is used to identify the ism device in the single
> +        communication domain.
> +
> +    \item[\field{chunk_size}] This is the size of the ism chunk. The device
> +        memory is divided into multiple chunks. Every ism chunk has the same
> +        size.
> +
> +    \item[\field{notify_size}] This is the size of the ism notify-address. The
> +        notify-address is used to notify the device that the content of the
> +        ism region has been updated.
> +
> +\end{description}
> +
> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> +
> +The device MUST ensure that the \field{cdid} of the device on the same
> +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> +
> +In the single communication domain, the device MUST ensure that the \field{devid}
> +is unique.
> +
> +\field{chunk_size} MUST be a power of two.
> +
> +\subsection{Physical Memory}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism region is on the device memory. It is a physical memory for driver, but
> +it is not for device. The device has the ability to modify the physical memory
> +corresponding to the device memory.
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, physical memory comes from driver.
> +The driver pass its memory to the device when alloc a new ism region. The
> +device map this physical memory into an ism region. If VIRTIO_ISM_F_DEV_MEM
> +is negotiated, physical memory is provided by the device.
> +
> +For attach operation, if VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver also
> +provides memory to device. Under normal circumstances, this memory will not be
> +used. This is to limit driver to take up too much memory based on attach
> +operation.
> +
> +\subsection{Event}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism device supports event notification of the ism region. When a device
> +kick/attach/detach a region, other ism region referrers will receive related
> +events.
> +
> +A buffer received from eventq can contain multiple event structures.
> +
> +\begin{lstlisting}
> +#define   VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> +#define   VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> +#define   VIRTIO_ISM_EVENT_DETACH (1 << 2)
> +#define   VIRTIO_ISM_EVENT_BUFFER_FREE (1 << 3)  // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +\end{lstlisting}
> +
> +\begin{description}
> +    \item[VIRTIO_ISM_EVENT_UPDATE] The driver kick the notify-address to notify
> +        other peers of the update event of the ism region content.
> +
> +    \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> +    \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> +    \item[VIRTIO_ISM_EVENT_BUFFER_FREE] The buffer allocated by this driver has been free.
> +\end{description}
> +
> +The event structures:
> +\begin{lstlisting}
> +struct virtio_ism_event_update {
> +	le64 ev_type;
> +	le64 offset;
> +	le64 devid;
> +};
> +
> +struct virtio_ism_event_attach_detach {
> +	le64 ev_type;
> +	le64 offset;
> +	le64 devid;
> +	le64 peers;
> +};
> +
> +struct virtio_ism_event_buffer_free {
> +	le64 ev_type;
> +    void *buffer;
> +};
> +
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{ev_type}] The type of event, the driver can get the size of the
> +    structure based on this.
> +
> +        If \field{ev_type} is VIRTIO_ISM_EVENT_UPDATE, the corresponding structure is struct virtio_ism_event_update.
> +        If \field{ev_type} is VIRTIO_ISM_EVENT_ATTACH or VIRTIO_ISM_EVENT_DETACH, the corresponding structure struct virtio_ism_event_attach_detach.
> +        If \field{ev_type} is VIRTIO_ISM_EVENT_BUFFER_FREE, the corresponding structure is struct virtio_ism_event_buffer_free.
> +
> +\item[\field{offset}] The offset of ism regions with the event.
> +
> +\item[\field{devid}] \field{devid} of the device that generated events.
> +\item[\field{peers}] The number of the peers referring to this ism region. (does not include the
> +    device that receiving this event)
> +
> +\item[\field{buffer}] The buffer passed to device when alloc/attach a new ism region.
> +
> +\end{description}
> +
> +\subsection{Permissions}\label{sec:Device Types / ISM Device / Permission}
> +
> +The permissions of an ism region determine whether this ism region can be
> +attached and the read and write permissions after attach. The driver can set the
> +default permissions for all devices, or set the dedicated permissions for a
> +certain device. If there are dedicated permissions for a device, the default
> +permissions are invalid for this device.
> +
> +For example, the driver can set the default permission of an ism region as
> +read-only, and set the dedicated permissions as writable for a device with devid
> +0xff. Then all devices except 0xff in such a communication domain can attach
> +this ism region and only have read-only permission. And the device 0xff can
> +write to the ism region after attaching.
> +
> +When a driver has the management permission of the ism region, then it can
> +modify the permissions of this ism region. By default, only the device that
> +allocated the ism region has this permission.
> +
> +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> +
> +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> +during reset. \field{devid} MUST NOT be 0.
> +
> +The device exposes memory to the driver based on Shared Memory Regions
> +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> +However, it does not need to allocate physical memory during initialization.
> +
> +The \field{shmid} of a region MUST be one of the following
> +\begin{lstlisting}
> +enum virtio_ism_shm_id {
> +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> +};
> +\end{lstlisting}
> +
> +The device memory area with VIRTIO_ISM_SHM_ID_REGIONS is used to provide ism regions.
> +
> +The device MUST also provide a memory area with VIRTIO_ISM_SHM_ID_NOTIFY to the
> +driver. This memory area is used to notify the device with update event, and
> +each ism region MUST have a corresponding notify-address inside this area, and
> +the size of the notify-address is \field{notify_size}.
> +
> +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The driver MUST NOT access any chunk before it is referred by one ism region.
> +
> +\subsection{Control Virtqueue}\label{sec:Device Types / ISM Device / Control Virtqueue}
> +
> +The driver uses the control virtqueue send commands to implement operations on
> +the ism region and some global configurations.
> +
> +All commands are of the following form:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl {
> +	u8 class;
> +	u8 command;
> +	u8 command_specific_data[];
> +	u8 ack;
> +	u8 command_specific_data_reply[];
> +};
> +
> +/* ack values */
> +#define VIRTIO_ISM_OK      0
> +#define VIRTIO_ISM_ERR     255
> +
> +#define VIRTIO_ISM_ENOENT  2
> +#define VIRTIO_ISM_E2BIG   7
> +#define VIRTIO_ISM_ENOMEM  12
> +#define VIRTIO_ISM_ENOSPEC 28
> +
> +#define VIRTIO_ISM_PERM_EATTACH 100
> +#define VIRTIO_ISM_PERM_EREAD   101
> +#define VIRTIO_ISM_PERM_EWRITE  102
> +\end{lstlisting}
> +
> +The \field{class}, \field{command} and command-specific-data are set by the
> +driver, and the device sets the \field{ack} byte and optionally
> +\field{command-specific-data-reply}.
> +
> +\subsection{Device Operation}\label{sec:Device Types / ISM Driver / Device Operation}
> +
> +\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +Based on controlq, the driver can request to allocate an ism region.
> +
> +The reply from the device will carry a token, which can be passed
> +to other driver for attaching to this ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc {
> +	le64 size;
> +    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc_reply {
> +	le64 token;
> +	le64 num;
> +    le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ALLOC  0
> +	#define VIRTIO_ISM_CTRL_ALLOC_REGION 0
> +\end{lstlisting}
> +
> +
> +\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +For the command VIRTIO_ISM_CTRL_ALLOC_REGION, the device MUST first look for the
> +free chunks from the device memory. And the physical memory is mapped to these
> +chunks in order. If VIRTIO_ISM_F_DEV_MEM is negotiated, then the device MUST
> +provide physical memory by itself. Otherwise, the memory provided by the driver
> +will be used. If there is no enough free chunk, the device MUST set \field{ack}
> +to VIRTIO_ISM_NOSPEC. If new physical memory cannot be allocated by the device,
> +the device MUST set \field{ack} to VIRTIO_ISM_NOMEM.
> +
> +The device sets \field{ack} to VIRTIO_ISM_OK after successfully allocating the
> +ism region. A new token MUST be dynamically created for this ism region
> +simultaneously. The token MUST be unique inside this communication domain. And
> +the device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks.
> +
> +The device MUST clear the memory of this ism region before committing to the
> +driver.
> +
> +If \field{size} is smaller than \field{chunk_size}, the ism region also
> +occupies one chunk.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG.
> +
> +\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we want to allocate.
> +
> +\subsubsection{Query ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Query ISM Region}
> +Based on controlq, the driver can query the information of an ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query {
> +	le64 token;
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query_reply {
> +	le64 size;
> +	le64 permissions;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_QUERY  1
> +	#define VIRTIO_ISM_CTRL_QUERY_INFO 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Query ISM Region}{Device Types / ISM Device / Device Operation / Query ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT. Otherwise, the device MUST fill the
> +\field{size} with the size of the ism region. The device MUST get the
> +permissions of this ism region for the device-self, then fill in the
> +\field{permissions}.
> +
> +\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +Based on controlq, the driver can request to attach an ism region with a
> +specified token.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach {
> +	le64 token;
> +	le64 rw_perm;
> +    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach_reply {
> +	le64 size;
> +	le64 num;
> +    le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ATTACH  2
> +	#define VIRTIO_ISM_CTRL_ATTACH_REGION 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The device MUST first look for the free chunks from the device memory. And the
> +physical memory of the ism region is mapped to these chunks in order. If there is
> +no enough free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
> +The device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks. \field{size} is the size of this ism region.
> +
> +The device MUST set \field{ack} to VIRTIO_ISM_OK after successfully finding and
> +assigning the physical memory of this ism region into the device memory.
> +
> +If the device does not have the VIRTIO_ISM_PERM_ATTACH permission for this ism
> +region, the device MUST set \field{ack} to VIRTIO_ISM_PERM_EATTACH. If
> +\field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not have the
> +VIRTIO_ISM_PERM_READ permission for this region, the device MUST set \field{ack}
> +to VIRTIO_ISM_PERM_EREAD. If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but
> +the device does not have the VIRTIO_ISM_PERM_WRITE permission for this region,
> +the device MUST set \field{ack} to VIRTIO_ISM_PERM_EWRITE.
> +
> +The device MUST set the read and write permissions of the physical memory inside
> +the device memory based on \field{rw_perm}.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
> +
> +The device MUST generate a VIRTIO_ISM_EVENT_ATTACH event to other peers.
> +
> +\drivernormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we want to allocate.
> +
> +If \field {size} is less than \field{num} times \field {chunk_size}, then only
> +header part of the last chunk is effective.
> +
> +\subsubsection{Detach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +Based on controlq, the device can release references to the ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_detach {
> +	le64 token;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_DETACH  3
> +	#define VIRTIO_ISM_CTRL_DETACH_REGION 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> +
> +If the ism region specified by \field{token} is not attached, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The device MUST generate a VIRTIO_ISM_EVENT_DETACH event to other peers.
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated and no one is referred to this ism
> +region, the device who allocated this ism region MUST generate a
> +VIRTIO_ISM_EVENT_BUFFER_FREE event to the driver.
> +
> +\drivernormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated and the driver got the ism region by
> +attach operation, then the buffer is free after the detach operation.
> +
> +\subsubsection{Grant ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Grant ISM Region}
> +Based on controlq, the driver can set the permissions for each ism
> +region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_grant_default {
> +	le64 token;
> +	le64 permissions;
> +};
> +
> +struct virtio_ism_ctrl_grant_dev {
> +	le64 token;
> +	le64 permissions;
> +	le64 peer_devid;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_GRANT  4
> +	#define VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT    0
> +	#define VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE 1
> +\end{lstlisting}
> +
> +The permissions:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_PERM_READ       (1 << 0)
> +#define VIRTIO_ISM_PERM_WRITE      (1 << 1)
> +
> +#define VIRTIO_ISM_PERM_ATTACH     (1 << 2)
> +
> +#define VIRTIO_ISM_PERM_MANAGE     (1 << 3)
> +#define VIRTIO_ISM_PERM_CLEAN_DEFAULT     (1 << 4)
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[VIRTIO_ISM_PERM_READ] read permission
> +\item[VIRTIO_ISM_PERM_WRITE] write permission
> +\item[VIRTIO_ISM_PERM_ATTACH] attach permission
> +
> +\item[VIRTIO_ISM_PERM_MANAGE] Management permission, the device with this
> +	permission can modify the permission of this ism region. By default, only
> +	the device that allocated the region has this permission.
> +
> +\item[VIRTIO_ISM_PERM_DENY_DEFAULT] Clean all permissions of default, just used
> +    with VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE.
> +
> +\end{description}
> +
> +VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT corresponds to virtio_ism_ctrl_grant_default;
> +VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE corresponds to virtio_ism_ctrl_grant_dev;
> +
> +Permission control is divided into two categories, one is the dedicated
> +permissions for a certain device, and the other is the default permissions for
> +all device.
> +
> +\devicenormative{\subparagraph}{Grant ISM Region}{Device Types / ISM Device / Device Operation / Grant ISM Region}
> +
> +If the ism region specified by \field{token} is not attached, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The default permissions of the newly allocated ism region MUST be  (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH).
> +The device that allocated the ism region MUST have the permissions (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH | VIRTIO_ISM_PERM_MANAGE) for this region.
> +
> +For a certain device, if there is dedicated permissions, the default
> +permissions are illegal to this device.
> +
> +\subsubsection{Inform Update Event IRQ Vector}\label{sec:Device Types / ISM Device / Device Operation / Inform Update Event IRQ Vector}
> +
> +For the ism region update event, the driver can bind an interrupt to the ism
> +region. Then this ism region's update event will no longer be passed through
> +eventq, but the interrupt will be directly triggered.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_irq_vector {
> +	le64 token;
> +	le64 vector;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_EVENT_VECTOR  5
> +	#define VIRTIO_ISM_CTRL_EVENT_VECTOR_SET 0
> +\end{lstlisting}
> +
> +
> +\devicenormative{\subparagraph}{Inform Event IRQ Vector}{Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> +
> +The device MUST record the relationship between the ism region and the vector
> +notified by the driver, and notify the update event of this region to driver
> +by the corresponding vector. And the device MUST NOT use eventq to send the
> +update event of this ism region.
> +
> +
> --
> 2.32.0.3.g01195cf9f
>

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-03-07 11:15   ` [virtio-dev] " Xuan Zhuo
@ 2023-03-15 11:15     ` Xuan Zhuo
  0 siblings, 0 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-03-15 11:15 UTC (permalink / raw)
  To: pasic
  Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
	gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic,
	virtio-comment

On Tue, 7 Mar 2023 19:15:17 +0800, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> Any comments for this. ^_^

hi Halil,

Do you have any comments?

Thanks.

>
> Thanks.
>
> On Thu,  9 Feb 2023 11:30:56 +0800, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > An ISM(Internal Shared Memory) device provides the ability to access memory
> > shared between multiple devices. This allows low-overhead communication in
> > presence of such memory. For example, memory can be shared with guests of
> > multiple virtual machines running on the same host, with each virtual machine
> > including an ism device and with the guests getting the shared memory by the ism
> > devices.
> >
> > An ism device can communicate with multiple peers simultaneously. This
> > communication can be dynamically started and ended.
> >
> > |-------------------------------------------------------------------------------------------------------------|
> > | |------------------------------------------------|       |------------------------------------------------| |
> > | | Guest                                          |       | Guest                                          | |
> > | |                                                |       |                                                | |
> > | |   ----------------                             |       |   ----------------                             | |
> > | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
> > | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
> > | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
> > | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
> > | |    |  |                -------------------     |       |    |  |                --------------------    | |
> > | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
> > | |    |  |                -------------------     |       |    |  |                --------------------    | |
> > | |                                |               |       |                               |                | |
> > | |                                |               |       |                               |                | |
> > | | Qemu                           |               |       | Qemu                          |                | |
> > | |--------------------------------+---------------|       |-------------------------------+----------------| |
> > |                                  |                                                       |                  |
> > |                                  |                                                       |                  |
> > |                                  |------------------------------+------------------------|                  |
> > |                                                                 |                                           |
> > |                                                                 |                                           |
> > |                                                   --------------------------                                |
> > |                                                    | M1 |   | M2 |   | M3 |                                 |
> > |                                                   --------------------------                                |
> > |                                                                                                             |
> > | HOST                                                                                                        |
> > ---------------------------------------------------------------------------------------------------------------
> >
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
> > Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
> > Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
> > Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
> > Signed-off-by: Hans Zhang <hans@linux.alibaba.com>
> > Signed-off-by: He Rongguang <herongguang@linux.alibaba.com>
> > Suggested-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  conformance.tex |  26 +++
> >  content.tex     |   1 +
> >  virtio-ism.tex  | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 600 insertions(+)
> >  create mode 100644 virtio-ism.tex
> >
> > diff --git a/conformance.tex b/conformance.tex
> > index c3c1d3e..0a1456a 100644
> > --- a/conformance.tex
> > +++ b/conformance.tex
> > @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> >  \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
> >  \end{itemize}
> >
> > +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> > +
> > +A ISM driver MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\end{itemize}
> > +
> >  \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
> >
> >  A device MUST conform to the following normative statements:
> > @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> >  \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
> >  \end{itemize}
> >
> > +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> > +
> > +A ISM device MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> > +\end{itemize}
> > +
> >  \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
> >  A conformant implementation MUST be either transitional or
> >  non-transitional, see \ref{intro:Legacy
> > diff --git a/content.tex b/content.tex
> > index 96f4723..fe02aec 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
> >  \input{virtio-scmi.tex}
> >  \input{virtio-gpio.tex}
> >  \input{virtio-pmem.tex}
> > +\input{virtio-ism.tex}
> >
> >  \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
> >
> > diff --git a/virtio-ism.tex b/virtio-ism.tex
> > new file mode 100644
> > index 0000000..a1720d8
> > --- /dev/null
> > +++ b/virtio-ism.tex
> > @@ -0,0 +1,573 @@
> > +\section{ISM Device}\label{sec:Device Types / ISM Device}
> > +
> > +\begin{lstlisting}
> > +|-------------------------------------------------------------------------------------------------------------|
> > +| |------------------------------------------------|    |------------------------------------------------|    |
> > +| | VM                      [M1]   [M2]   [M3]     |    | VM                             [M2]   [M3]     |    |
> > +| |                          |      |      |       |    |                                 |      |       |    |
> > +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> > +| |   |             driver   |      |      |  |    |    |   |             driver          |      |  |    |    |
> > +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> > +| |    |cq|                  |map   |map   |map    |    |    |cq|                         |map   |map    |    |
> > +| |    |  |                  |      |      |       |    |    |  |                         |      |       |    |
> > +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> > +| |----|--|----------------|  device memory  |-----|    |----|--|----------------|  device memory  |-----|    |
> > +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> > +| |                                |               |    |                                |               |    |
> > +| |                                |               |    |                                |               |    |
> > +| |                                |               |    |                                |               |    |
> > +| |--------------------------------+---------------|    |--------------------------------+---------------|    |
> > +|                                  |                                                       |                  |
> > +|                                  |                                                       |                  |
> > +|                                  |------------------------------+------------------------|                  |
> > +|                                                                 |                                           |
> > +|                                                                 |                                           |
> > +|                                                   --------------------------                                |
> > +|                                                    | M1 |   | M2 |   | M3 |                                 |
> > +|                                                   --------------------------                                |
> > +|                                                                                                             |
> > +|                                                                                                             |
> > +|-------------------------------------------------------------------------------------------------------------|
> > +\end{lstlisting}
> > +
> > +An ISM(Internal Shared Memory) device provides the ability to access memory
> > +shared between multiple devices. This allows low-overhead communication in
> > +presence of such memory. For example, memory can be shared with guests of
> > +multiple virtual machines running on the same host, with each virtual machine
> > +including an ism device and with the guests getting the shared memory by the ism
> > +devices.
> > +
> > +An ism device can communicate with multiple peers simultaneously. This
> > +communication can be dynamically started and ended.
> > +
> > +All the devices with the ability to communicate with each other form a
> > +communication domain. Two devices from different communication domains can't
> > +communicate.
> > +
> > +The device memory of the ism device is divided into multiple chunks with the
> > +same size. Every ism region contains multiple chunks. When communicating between
> > +two devices, an ism region is used as a shared memory.
> > +
> > +The ism region is the basis for communication between ism devices.
> > +
> > +The process of communication between two drivers is that one driver allocates an
> > +ism region and obtains a token. Then the peer uses this token to attach the same
> > +ism region, the two drivers realize the memory(ism region) sharing. The driver
> > +can also notify peer by kick notify-address the ism region has been updated.
> > +
> > +An ism region can be referred by its \field{token}, or the \field{offset}.
> > +The \field{offset} is the offset of the first chunk inside the ism region
> > +starting from the device memory head.
> > +
> > +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> > +	44
> > +
> > +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> > +\begin{description}
> > +\item[0] controlq
> > +\item[1] eventq
> > +\end{description}
> > +
> > +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> > +\begin{description}
> > +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> > +    don't need to provide memory for alloc/attach operation.
> > +
> > +\end{description}
> > +
> > +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_ism_config {
> > +	le128 cdid;
> > +	le64 devid;
> > +	le64 chunk_size;
> > +	le64 notify_size;
> > +};
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +    \item[\field{cdid}] This is used to identify the communication domain. Only
> > +        ism devices with the same \field{cdid} can communicate. A \field{cdid}
> > +        is world-wide unique in a sense that there not be another communication
> > +        domain with the same \field{cdid}.
> > +
> > +    \item[\field{devid}] This is used to identify the ism device in the single
> > +        communication domain.
> > +
> > +    \item[\field{chunk_size}] This is the size of the ism chunk. The device
> > +        memory is divided into multiple chunks. Every ism chunk has the same
> > +        size.
> > +
> > +    \item[\field{notify_size}] This is the size of the ism notify-address. The
> > +        notify-address is used to notify the device that the content of the
> > +        ism region has been updated.
> > +
> > +\end{description}
> > +
> > +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> > +
> > +The device MUST ensure that the \field{cdid} of the device on the same
> > +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> > +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> > +
> > +In the single communication domain, the device MUST ensure that the \field{devid}
> > +is unique.
> > +
> > +\field{chunk_size} MUST be a power of two.
> > +
> > +\subsection{Physical Memory}\label{sec:Device Types / ISM Device / Event}
> > +
> > +The ism region is on the device memory. It is a physical memory for driver, but
> > +it is not for device. The device has the ability to modify the physical memory
> > +corresponding to the device memory.
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, physical memory comes from driver.
> > +The driver pass its memory to the device when alloc a new ism region. The
> > +device map this physical memory into an ism region. If VIRTIO_ISM_F_DEV_MEM
> > +is negotiated, physical memory is provided by the device.
> > +
> > +For attach operation, if VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver also
> > +provides memory to device. Under normal circumstances, this memory will not be
> > +used. This is to limit driver to take up too much memory based on attach
> > +operation.
> > +
> > +\subsection{Event}\label{sec:Device Types / ISM Device / Event}
> > +
> > +The ism device supports event notification of the ism region. When a device
> > +kick/attach/detach a region, other ism region referrers will receive related
> > +events.
> > +
> > +A buffer received from eventq can contain multiple event structures.
> > +
> > +\begin{lstlisting}
> > +#define   VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> > +#define   VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> > +#define   VIRTIO_ISM_EVENT_DETACH (1 << 2)
> > +#define   VIRTIO_ISM_EVENT_BUFFER_FREE (1 << 3)  // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +    \item[VIRTIO_ISM_EVENT_UPDATE] The driver kick the notify-address to notify
> > +        other peers of the update event of the ism region content.
> > +
> > +    \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> > +    \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> > +    \item[VIRTIO_ISM_EVENT_BUFFER_FREE] The buffer allocated by this driver has been free.
> > +\end{description}
> > +
> > +The event structures:
> > +\begin{lstlisting}
> > +struct virtio_ism_event_update {
> > +	le64 ev_type;
> > +	le64 offset;
> > +	le64 devid;
> > +};
> > +
> > +struct virtio_ism_event_attach_detach {
> > +	le64 ev_type;
> > +	le64 offset;
> > +	le64 devid;
> > +	le64 peers;
> > +};
> > +
> > +struct virtio_ism_event_buffer_free {
> > +	le64 ev_type;
> > +    void *buffer;
> > +};
> > +
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +\item[\field{ev_type}] The type of event, the driver can get the size of the
> > +    structure based on this.
> > +
> > +        If \field{ev_type} is VIRTIO_ISM_EVENT_UPDATE, the corresponding structure is struct virtio_ism_event_update.
> > +        If \field{ev_type} is VIRTIO_ISM_EVENT_ATTACH or VIRTIO_ISM_EVENT_DETACH, the corresponding structure struct virtio_ism_event_attach_detach.
> > +        If \field{ev_type} is VIRTIO_ISM_EVENT_BUFFER_FREE, the corresponding structure is struct virtio_ism_event_buffer_free.
> > +
> > +\item[\field{offset}] The offset of ism regions with the event.
> > +
> > +\item[\field{devid}] \field{devid} of the device that generated events.
> > +\item[\field{peers}] The number of the peers referring to this ism region. (does not include the
> > +    device that receiving this event)
> > +
> > +\item[\field{buffer}] The buffer passed to device when alloc/attach a new ism region.
> > +
> > +\end{description}
> > +
> > +\subsection{Permissions}\label{sec:Device Types / ISM Device / Permission}
> > +
> > +The permissions of an ism region determine whether this ism region can be
> > +attached and the read and write permissions after attach. The driver can set the
> > +default permissions for all devices, or set the dedicated permissions for a
> > +certain device. If there are dedicated permissions for a device, the default
> > +permissions are invalid for this device.
> > +
> > +For example, the driver can set the default permission of an ism region as
> > +read-only, and set the dedicated permissions as writable for a device with devid
> > +0xff. Then all devices except 0xff in such a communication domain can attach
> > +this ism region and only have read-only permission. And the device 0xff can
> > +write to the ism region after attaching.
> > +
> > +When a driver has the management permission of the ism region, then it can
> > +modify the permissions of this ism region. By default, only the device that
> > +allocated the ism region has this permission.
> > +
> > +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> > +
> > +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> > +
> > +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> > +during reset. \field{devid} MUST NOT be 0.
> > +
> > +The device exposes memory to the driver based on Shared Memory Regions
> > +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> > +However, it does not need to allocate physical memory during initialization.
> > +
> > +The \field{shmid} of a region MUST be one of the following
> > +\begin{lstlisting}
> > +enum virtio_ism_shm_id {
> > +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> > +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> > +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> > +};
> > +\end{lstlisting}
> > +
> > +The device memory area with VIRTIO_ISM_SHM_ID_REGIONS is used to provide ism regions.
> > +
> > +The device MUST also provide a memory area with VIRTIO_ISM_SHM_ID_NOTIFY to the
> > +driver. This memory area is used to notify the device with update event, and
> > +each ism region MUST have a corresponding notify-address inside this area, and
> > +the size of the notify-address is \field{notify_size}.
> > +
> > +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> > +
> > +The driver MUST NOT access any chunk before it is referred by one ism region.
> > +
> > +\subsection{Control Virtqueue}\label{sec:Device Types / ISM Device / Control Virtqueue}
> > +
> > +The driver uses the control virtqueue send commands to implement operations on
> > +the ism region and some global configurations.
> > +
> > +All commands are of the following form:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl {
> > +	u8 class;
> > +	u8 command;
> > +	u8 command_specific_data[];
> > +	u8 ack;
> > +	u8 command_specific_data_reply[];
> > +};
> > +
> > +/* ack values */
> > +#define VIRTIO_ISM_OK      0
> > +#define VIRTIO_ISM_ERR     255
> > +
> > +#define VIRTIO_ISM_ENOENT  2
> > +#define VIRTIO_ISM_E2BIG   7
> > +#define VIRTIO_ISM_ENOMEM  12
> > +#define VIRTIO_ISM_ENOSPEC 28
> > +
> > +#define VIRTIO_ISM_PERM_EATTACH 100
> > +#define VIRTIO_ISM_PERM_EREAD   101
> > +#define VIRTIO_ISM_PERM_EWRITE  102
> > +\end{lstlisting}
> > +
> > +The \field{class}, \field{command} and command-specific-data are set by the
> > +driver, and the device sets the \field{ack} byte and optionally
> > +\field{command-specific-data-reply}.
> > +
> > +\subsection{Device Operation}\label{sec:Device Types / ISM Driver / Device Operation}
> > +
> > +\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +Based on controlq, the driver can request to allocate an ism region.
> > +
> > +The reply from the device will carry a token, which can be passed
> > +to other driver for attaching to this ism region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_alloc {
> > +	le64 size;
> > +    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_alloc_reply {
> > +	le64 token;
> > +	le64 num;
> > +    le64 chunks[];
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_ALLOC  0
> > +	#define VIRTIO_ISM_CTRL_ALLOC_REGION 0
> > +\end{lstlisting}
> > +
> > +
> > +\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +For the command VIRTIO_ISM_CTRL_ALLOC_REGION, the device MUST first look for the
> > +free chunks from the device memory. And the physical memory is mapped to these
> > +chunks in order. If VIRTIO_ISM_F_DEV_MEM is negotiated, then the device MUST
> > +provide physical memory by itself. Otherwise, the memory provided by the driver
> > +will be used. If there is no enough free chunk, the device MUST set \field{ack}
> > +to VIRTIO_ISM_NOSPEC. If new physical memory cannot be allocated by the device,
> > +the device MUST set \field{ack} to VIRTIO_ISM_NOMEM.
> > +
> > +The device sets \field{ack} to VIRTIO_ISM_OK after successfully allocating the
> > +ism region. A new token MUST be dynamically created for this ism region
> > +simultaneously. The token MUST be unique inside this communication domain. And
> > +the device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> > +is the number of the chunks.
> > +
> > +The device MUST clear the memory of this ism region before committing to the
> > +driver.
> > +
> > +If \field{size} is smaller than \field{chunk_size}, the ism region also
> > +occupies one chunk.
> > +
> > +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> > +to VIRTIO_ISM_E2BIG.
> > +
> > +\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> > +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> > +region that we want to allocate.
> > +
> > +\subsubsection{Query ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Query ISM Region}
> > +Based on controlq, the driver can query the information of an ism region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_query {
> > +	le64 token;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_query_reply {
> > +	le64 size;
> > +	le64 permissions;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_QUERY  1
> > +	#define VIRTIO_ISM_CTRL_QUERY_INFO 0
> > +\end{lstlisting}
> > +
> > +\devicenormative{\subparagraph}{Query ISM Region}{Device Types / ISM Device / Device Operation / Query ISM Region}
> > +
> > +If the ism region specified by \field{token} does not exist, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT. Otherwise, the device MUST fill the
> > +\field{size} with the size of the ism region. The device MUST get the
> > +permissions of this ism region for the device-self, then fill in the
> > +\field{permissions}.
> > +
> > +\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +Based on controlq, the driver can request to attach an ism region with a
> > +specified token.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_attach {
> > +	le64 token;
> > +	le64 rw_perm;
> > +    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_attach_reply {
> > +	le64 size;
> > +	le64 num;
> > +    le64 chunks[];
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_ATTACH  2
> > +	#define VIRTIO_ISM_CTRL_ATTACH_REGION 0
> > +\end{lstlisting}
> > +
> > +\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +If the ism region specified by \field{token} does not exist, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT.
> > +
> > +The device MUST first look for the free chunks from the device memory. And the
> > +physical memory of the ism region is mapped to these chunks in order. If there is
> > +no enough free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
> > +The device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> > +is the number of the chunks. \field{size} is the size of this ism region.
> > +
> > +The device MUST set \field{ack} to VIRTIO_ISM_OK after successfully finding and
> > +assigning the physical memory of this ism region into the device memory.
> > +
> > +If the device does not have the VIRTIO_ISM_PERM_ATTACH permission for this ism
> > +region, the device MUST set \field{ack} to VIRTIO_ISM_PERM_EATTACH. If
> > +\field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not have the
> > +VIRTIO_ISM_PERM_READ permission for this region, the device MUST set \field{ack}
> > +to VIRTIO_ISM_PERM_EREAD. If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but
> > +the device does not have the VIRTIO_ISM_PERM_WRITE permission for this region,
> > +the device MUST set \field{ack} to VIRTIO_ISM_PERM_EWRITE.
> > +
> > +The device MUST set the read and write permissions of the physical memory inside
> > +the device memory based on \field{rw_perm}.
> > +
> > +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> > +to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
> > +
> > +The device MUST generate a VIRTIO_ISM_EVENT_ATTACH event to other peers.
> > +
> > +\drivernormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> > +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> > +region that we want to allocate.
> > +
> > +If \field {size} is less than \field{num} times \field {chunk_size}, then only
> > +header part of the last chunk is effective.
> > +
> > +\subsubsection{Detach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +Based on controlq, the device can release references to the ism region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_detach {
> > +	le64 token;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_DETACH  3
> > +	#define VIRTIO_ISM_CTRL_DETACH_REGION 0
> > +\end{lstlisting}
> > +
> > +\devicenormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +
> > +If the ism region specified by \field{token} is not attached, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT.
> > +
> > +The device MUST generate a VIRTIO_ISM_EVENT_DETACH event to other peers.
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated and no one is referred to this ism
> > +region, the device who allocated this ism region MUST generate a
> > +VIRTIO_ISM_EVENT_BUFFER_FREE event to the driver.
> > +
> > +\drivernormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated and the driver got the ism region by
> > +attach operation, then the buffer is free after the detach operation.
> > +
> > +\subsubsection{Grant ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Grant ISM Region}
> > +Based on controlq, the driver can set the permissions for each ism
> > +region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_grant_default {
> > +	le64 token;
> > +	le64 permissions;
> > +};
> > +
> > +struct virtio_ism_ctrl_grant_dev {
> > +	le64 token;
> > +	le64 permissions;
> > +	le64 peer_devid;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_GRANT  4
> > +	#define VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT    0
> > +	#define VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE 1
> > +\end{lstlisting}
> > +
> > +The permissions:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_PERM_READ       (1 << 0)
> > +#define VIRTIO_ISM_PERM_WRITE      (1 << 1)
> > +
> > +#define VIRTIO_ISM_PERM_ATTACH     (1 << 2)
> > +
> > +#define VIRTIO_ISM_PERM_MANAGE     (1 << 3)
> > +#define VIRTIO_ISM_PERM_CLEAN_DEFAULT     (1 << 4)
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +\item[VIRTIO_ISM_PERM_READ] read permission
> > +\item[VIRTIO_ISM_PERM_WRITE] write permission
> > +\item[VIRTIO_ISM_PERM_ATTACH] attach permission
> > +
> > +\item[VIRTIO_ISM_PERM_MANAGE] Management permission, the device with this
> > +	permission can modify the permission of this ism region. By default, only
> > +	the device that allocated the region has this permission.
> > +
> > +\item[VIRTIO_ISM_PERM_DENY_DEFAULT] Clean all permissions of default, just used
> > +    with VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE.
> > +
> > +\end{description}
> > +
> > +VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT corresponds to virtio_ism_ctrl_grant_default;
> > +VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE corresponds to virtio_ism_ctrl_grant_dev;
> > +
> > +Permission control is divided into two categories, one is the dedicated
> > +permissions for a certain device, and the other is the default permissions for
> > +all device.
> > +
> > +\devicenormative{\subparagraph}{Grant ISM Region}{Device Types / ISM Device / Device Operation / Grant ISM Region}
> > +
> > +If the ism region specified by \field{token} is not attached, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT.
> > +
> > +The default permissions of the newly allocated ism region MUST be  (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH).
> > +The device that allocated the ism region MUST have the permissions (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH | VIRTIO_ISM_PERM_MANAGE) for this region.
> > +
> > +For a certain device, if there is dedicated permissions, the default
> > +permissions are illegal to this device.
> > +
> > +\subsubsection{Inform Update Event IRQ Vector}\label{sec:Device Types / ISM Device / Device Operation / Inform Update Event IRQ Vector}
> > +
> > +For the ism region update event, the driver can bind an interrupt to the ism
> > +region. Then this ism region's update event will no longer be passed through
> > +eventq, but the interrupt will be directly triggered.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_irq_vector {
> > +	le64 token;
> > +	le64 vector;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_EVENT_VECTOR  5
> > +	#define VIRTIO_ISM_CTRL_EVENT_VECTOR_SET 0
> > +\end{lstlisting}
> > +
> > +
> > +\devicenormative{\subparagraph}{Inform Event IRQ Vector}{Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> > +
> > +The device MUST record the relationship between the ism region and the vector
> > +notified by the driver, and notify the update event of this region to driver
> > +by the corresponding vector. And the device MUST NOT use eventq to send the
> > +update event of this ism region.
> > +
> > +
> > --
> > 2.32.0.3.g01195cf9f
> >

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-02-09  3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
  2023-02-09  3:35   ` [virtio-comment] " Parav Pandit
  2023-03-07 11:15   ` [virtio-dev] " Xuan Zhuo
@ 2023-03-23 14:46   ` Halil Pasic
  2023-03-24  3:08     ` Xuan Zhuo
  2023-03-24  4:03     ` Wen Gu
  2023-03-24  4:51   ` Parav Pandit
  2023-04-26  7:41   ` [virtio-dev] " Xuan Zhuo
  4 siblings, 2 replies; 20+ messages in thread
From: Halil Pasic @ 2023-03-23 14:46 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: virtio-comment, hans, herongguang, zmlcc, dust.li, tonylu,
	zhenzao, helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka,
	wintera, kgraul, wenjia, jaka, hca, twinkler, raspl, virtio-dev,
	Halil Pasic

On Thu,  9 Feb 2023 11:30:56 +0800
Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:

> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_ism_config {
> +	le128 cdid;
> +	le64 devid;
> +	le64 chunk_size;
> +	le64 notify_size;
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> +    \item[\field{cdid}] This is used to identify the communication domain. Only
> +        ism devices with the same \field{cdid} can communicate. A \field{cdid}
> +        is world-wide unique in a sense that there not be another communication
> +        domain with the same \field{cdid}.
> +
> +    \item[\field{devid}] This is used to identify the ism device in the single
> +        communication domain.
> +
> +    \item[\field{chunk_size}] This is the size of the ism chunk. The device
> +        memory is divided into multiple chunks. Every ism chunk has the same
> +        size.
> +
> +    \item[\field{notify_size}] This is the size of the ism notify-address. The
> +        notify-address is used to notify the device that the content of the
> +        ism region has been updated.
> +
> +\end{description}
> +
> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> +
> +The device MUST ensure that the \field{cdid} of the device on the same
> +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> +
> +In the single communication domain, the device MUST ensure that the \field{devid}
> +is unique.
> +


Hi Xuan Zhou!

My understanding is the following: you goal for virtio-ism is that
it should be suitable for usage with SMC-D (much like the original ISM
device). Is that right?

If yes, then let us have a look at the following example. We have
two guests sitting on the same hypervisor: A and B. Both of the
guests have an rdma capable interface, a virtio-ism device and
traditional ISM device. So they could talk over SMC-R, SMC-D via
virtio and SMC-D via (PCI-)ISM. How would the CLC Proposal message
look like?

Where I am going with this? Either you need a novel way to
discover peers (probably before the usual way is employed)
or (probably preferably) you need to make this part of the
CLC stuff. What are your ideas with regards to this? How is
it supposed to work?

To get back to the things proposed here: the cdid is IMHO
a nice thing, and is functionally corresponding to the
(S)EID. But it is 16 byte wide, and I have no idea how
is it supposed to be used in the CLC handshake.

If this is really supposed to work with SMC and not just take
inspiration from it, I would like some insight from our
SMC experts (they are already on copy).

Regards,
Halil

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-03-23 14:46   ` [virtio-dev] Re: [virtio-comment] " Halil Pasic
@ 2023-03-24  3:08     ` Xuan Zhuo
  2023-03-24  4:03     ` Wen Gu
  1 sibling, 0 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-03-24  3:08 UTC (permalink / raw)
  To: Halil Pasic
  Cc: virtio-comment, hans, herongguang, zmlcc, dust.li, tonylu,
	zhenzao, helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka,
	wintera, kgraul, wenjia, jaka, hca, twinkler, raspl, virtio-dev,
	Halil Pasic, Wen Gu

On Thu, 23 Mar 2023 15:46:56 +0100, Halil Pasic <pasic@linux.ibm.com> wrote:
> On Thu,  9 Feb 2023 11:30:56 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> > +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_ism_config {
> > +	le128 cdid;
> > +	le64 devid;
> > +	le64 chunk_size;
> > +	le64 notify_size;
> > +};
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +    \item[\field{cdid}] This is used to identify the communication domain. Only
> > +        ism devices with the same \field{cdid} can communicate. A \field{cdid}
> > +        is world-wide unique in a sense that there not be another communication
> > +        domain with the same \field{cdid}.
> > +
> > +    \item[\field{devid}] This is used to identify the ism device in the single
> > +        communication domain.
> > +
> > +    \item[\field{chunk_size}] This is the size of the ism chunk. The device
> > +        memory is divided into multiple chunks. Every ism chunk has the same
> > +        size.
> > +
> > +    \item[\field{notify_size}] This is the size of the ism notify-address. The
> > +        notify-address is used to notify the device that the content of the
> > +        ism region has been updated.
> > +
> > +\end{description}
> > +
> > +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> > +
> > +The device MUST ensure that the \field{cdid} of the device on the same
> > +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> > +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> > +
> > +In the single communication domain, the device MUST ensure that the \field{devid}
> > +is unique.
> > +
>
>
> Hi Xuan Zhou!
>
> My understanding is the following: you goal for virtio-ism is that
> it should be suitable for usage with SMC-D (much like the original ISM
> device). Is that right?

Yes, in our interior, it is indeed working with SMC. But on the other
hand, I also hope to provide a way to share memory between VMs. You can see that
our POC provides the examples that users can use the sharing memory directly.

>
> If yes, then let us have a look at the following example. We have
> two guests sitting on the same hypervisor: A and B. Both of the
> guests have an rdma capable interface, a virtio-ism device and
> traditional ISM device. So they could talk over SMC-R, SMC-D via
> virtio and SMC-D via (PCI-)ISM. How would the CLC Proposal message
> look like?

Not only these, but also SMC-Loopback. I know Wen is promoting this aspect. I
think this is not a problem of virtio-ism. SMC may have to be renovated
together.


>
> Where I am going with this? Either you need a novel way to
> discover peers (probably before the usual way is employed)
> or (probably preferably) you need to make this part of the
> CLC stuff. What are your ideas with regards to this? How is
> it supposed to work?
>
> To get back to the things proposed here: the cdid is IMHO
> a nice thing, and is functionally corresponding to the
> (S)EID. But it is 16 byte wide, and I have no idea how
> is it supposed to be used in the CLC handshake.

This is indeed the biggest problem for Virtio-ISM + SMC, but I think Wen can
solve it. Of course, other SMC experts are discussed together, how to compatible
or support virtio-exm and smc-loopback.

>
> If this is really supposed to work with SMC and not just take
> inspiration from it,

As I said above, we are not just working with SMC. I also hope that
users can use these shared memory directly. So this is also an independent
memory sharing mechanism without smc.

> I would like some insight from our
> SMC experts (they are already on copy).

Yes. I also want to hear more reply.

Thanks.



>
> Regards,
> Halil

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-03-23 14:46   ` [virtio-dev] Re: [virtio-comment] " Halil Pasic
  2023-03-24  3:08     ` Xuan Zhuo
@ 2023-03-24  4:03     ` Wen Gu
       [not found]       ` <7a9ebec0-5e87-b80f-4f2c-c4db7ae4fe84@linux.ibm.com>
  1 sibling, 1 reply; 20+ messages in thread
From: Wen Gu @ 2023-03-24  4:03 UTC (permalink / raw)
  To: Halil Pasic, Xuan Zhuo
  Cc: virtio-comment, hans, herongguang, zmlcc, dust.li, tonylu,
	zhenzao, helinguo, gerry, mst, cohuck, jasowang, Jan Kiszka,
	wintera, kgraul, wenjia, jaka, hca, twinkler, raspl, virtio-dev,
	guwen



On 2023/3/23 22:46, Halil Pasic wrote:

> On Thu,  9 Feb 2023 11:30:56 +0800
> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> 
>> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
>> +
>> +\begin{lstlisting}
>> +struct virtio_ism_config {
>> +	le128 cdid;
>> +	le64 devid;
>> +	le64 chunk_size;
>> +	le64 notify_size;
>> +};
>> +\end{lstlisting}
>> +
>> +\begin{description}
>> +    \item[\field{cdid}] This is used to identify the communication domain. Only
>> +        ism devices with the same \field{cdid} can communicate. A \field{cdid}
>> +        is world-wide unique in a sense that there not be another communication
>> +        domain with the same \field{cdid}.
>> +
>> +    \item[\field{devid}] This is used to identify the ism device in the single
>> +        communication domain.
>> +
>> +    \item[\field{chunk_size}] This is the size of the ism chunk. The device
>> +        memory is divided into multiple chunks. Every ism chunk has the same
>> +        size.
>> +
>> +    \item[\field{notify_size}] This is the size of the ism notify-address. The
>> +        notify-address is used to notify the device that the content of the
>> +        ism region has been updated.
>> +
>> +\end{description}
>> +
>> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
>> +
>> +The device MUST ensure that the \field{cdid} of the device on the same
>> +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
>> +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
>> +
>> +In the single communication domain, the device MUST ensure that the \field{devid}
>> +is unique.
>> +
> 
> 
> Hi Xuan Zhou!
> 
> My understanding is the following: you goal for virtio-ism is that
> it should be suitable for usage with SMC-D (much like the original ISM
> device). Is that right?
> 
> If yes, then let us have a look at the following example. We have
> two guests sitting on the same hypervisor: A and B. Both of the
> guests have an rdma capable interface, a virtio-ism device and
> traditional ISM device. So they could talk over SMC-R, SMC-D via
> virtio and SMC-D via (PCI-)ISM. How would the CLC Proposal message
> look like?
> 

Hi Halil!

Now SMCv2 protocol allows CLC Proposal message to carry 1 SMC-R
GID, 1 SMC-Rv2 GID, and up to 8 SMC-D [CHID,GID] at the same time
for server to choose from.

And after receiving CLC Proposal message, server will match the local
devices with devices provided in CLC Proposal messages in this order:

SMC-Dv2 (ISMv2) device -> SMC-Dv1 (ISMv1) device -> SMC-Rv2 (RoCEv2) device -> SMC-Rv1 (RoCE) device

If there are multiple SMC-Dv2 devices provided in CLC Proposal messages,
such as virtio-ism and ISM in this example, the priority depends on their
order in the slot. (we treat virtio-ism as a kind of SMC-Dv2 device)

> Where I am going with this? Either you need a novel way to
> discover peers (probably before the usual way is employed)
> or (probably preferably) you need to make this part of the
> CLC stuff. What are your ideas with regards to this? How is
> it supposed to work?
> 

The discussion about
1) how to distinguish SMC-D device type (loopback/virtio-ism/ISM)
2) how to make sure peer virtio-ism can talk to
3) how to define priority or preference of SMC-D devices
are on-going in the mail thread of meeting note: "SMC interlock | Alibaba - IBM | 2023-03-20"

A quich summary here:

1) we propose to use reserved CHID to indicate what kind of device it is.
    such as 0xFFFF for loopback, 0xFFFE for virtio-ism.

2) whether two peer's SMC-D device can talk depends on:
    a. whether VCHID is the same;
    b. whether smc_ism_cantalk() return true;
    c. whether SEID is the same;

    In step b, smc_ism_cantalk() will simply call driver interface of
    ops->query_remote_gid() to tell. So as you can see, GID is the key
    point to check whether peers' SMC-D device can talk.

    And cdid here in virtio-ism determines whether talk is possible.
    Same cdid means two virtio-ism can talk to each other.

    So the problem may be how to covert 128-bits cdid to 64-bits GID
    used by SMC-D and avoid GID collision ? I have no elegant idea
    only a workaround right now ... The discussion on GID collision
    is now on-going in mail thread of meeting note: "SMC interlock | Alibaba - IBM | 2023-03-20".
    If you have any comments about this, very welcome to discuss there.

3) Priority or preference of SMC-D devices are decided by slot
    order now, or explicitly by CHID if we all agree to use reserved
    CHID for extension SMC-D device. (haven't been comfirmed now)

> To get back to the things proposed here: the cdid is IMHO
> a nice thing, and is functionally corresponding to the
> (S)EID. But it is 16 byte wide, and I have no idea how
> is it supposed to be used in the CLC handshake.
> 

CLC handshake carry one SEID for all the SMC-D device. Considering
coexistence with ISM, I am not sure whether we can change or increase
the SEID.. cc Alexandra

Thanks!
Wen Gu

> If this is really supposed to work with SMC and not just take
> inspiration from it, I would like some insight from our
> SMC experts (they are already on copy).
> 
> Regards,
> Halil
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-02-09  3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
                     ` (2 preceding siblings ...)
  2023-03-23 14:46   ` [virtio-dev] Re: [virtio-comment] " Halil Pasic
@ 2023-03-24  4:51   ` Parav Pandit
  2023-03-24  6:35     ` Xuan Zhuo
  2023-03-24  9:10     ` Michael S. Tsirkin
  2023-04-26  7:41   ` [virtio-dev] " Xuan Zhuo
  4 siblings, 2 replies; 20+ messages in thread
From: Parav Pandit @ 2023-03-24  4:51 UTC (permalink / raw)
  To: Xuan Zhuo, virtio-comment
  Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
	gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic



On 2/8/2023 10:30 PM, Xuan Zhuo wrote:
> An ISM(Internal Shared Memory) device provides the ability to access memory
> shared between multiple devices. This allows low-overhead communication in
> presence of such memory. For example, memory can be shared with guests of
> multiple virtual machines running on the same host, with each virtual machine

I am still learning this new memory sharing beast, so mostly cosmetics 
comments below.

instead of guest of virtual machines is confusing.
It can be just written as - can be shared between virtual machines 
running on same host ..
> including an ism device and with the guests getting the shared memory by the ism
> devices.
> 
> An ism device can communicate with multiple peers simultaneously. This
> communication can be dynamically started and ended.
> 
multiple peers mean, multiple peer ism device?
Or you meant multiple vms?

> |-------------------------------------------------------------------------------------------------------------|
> | |------------------------------------------------|       |------------------------------------------------| |
> | | Guest                                          |       | Guest                                          | |
> | |                                                |       |                                                | |
instead of guest, naming it VM makes it consistent with the spec and 
description.

> | |   ----------------                             |       |   ----------------                             | |
> | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
> | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
> | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
> | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
> | |    |  |                -------------------     |       |    |  |                --------------------    | |
> | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
> | |    |  |                -------------------     |       |    |  |                --------------------    | |
> | |                                |               |       |                               |                | |
> | |                                |               |       |                               |                | |
> | | Qemu                           |               |       | Qemu                          |                | |
> | |--------------------------------+---------------|       |-------------------------------+----------------| |
> |                                  |                                                       |                  |
> |                                  |                                                       |                  |
> |                                  |------------------------------+------------------------|                  |
> |                                                                 |                                           |
> |                                                                 |                                           |
> |                                                   --------------------------                                |
> |                                                    | M1 |   | M2 |   | M3 |                                 |
> |                                                   --------------------------                                |
> |                                                                                                             |
> | HOST                                                                                                        |
> ---------------------------------------------------------------------------------------------------------------
>
You might want to show the event q as well next to control q as its so 
fundamental for this whole operation.

> ---
>   conformance.tex |  26 +++
>   content.tex     |   1 +
>   virtio-ism.tex  | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
>   3 files changed, 600 insertions(+)
>   create mode 100644 virtio-ism.tex
> 
> diff --git a/conformance.tex b/conformance.tex
> index c3c1d3e..0a1456a 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>   \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
>   \end{itemize}
>   
> +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> +
> +A ISM driver MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\end{itemize}
> +
>   \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
>   
>   A device MUST conform to the following normative statements:
> @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>   \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
>   \end{itemize}
>   
> +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> +
> +A ISM device MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
Though using control vq to configure irq is good idea, I guess 
queue_msix_vector can be used. It needs some more text that 
queue_msix_vector will not be used etc.

if the event queue create interface exists over ctrl vq than things tied 
up well. Else I think transport specific existing scheme more aligns to 
the spec.

> +\end{itemize}
> +
>   \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
>   A conformant implementation MUST be either transitional or
>   non-transitional, see \ref{intro:Legacy
> diff --git a/content.tex b/content.tex
> index 96f4723..fe02aec 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
>   \input{virtio-scmi.tex}
>   \input{virtio-gpio.tex}
>   \input{virtio-pmem.tex}
> +\input{virtio-ism.tex}
>   
>   \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>   
> diff --git a/virtio-ism.tex b/virtio-ism.tex
> new file mode 100644
> index 0000000..a1720d8
> --- /dev/null
> +++ b/virtio-ism.tex
> @@ -0,0 +1,573 @@
> +\section{ISM Device}\label{sec:Device Types / ISM Device}
> +
> +\begin{lstlisting}
> +|-------------------------------------------------------------------------------------------------------------|
> +| |------------------------------------------------|    |------------------------------------------------|    |
> +| | VM                      [M1]   [M2]   [M3]     |    | VM                             [M2]   [M3]     |    |
> +| |                          |      |      |       |    |                                 |      |       |    |
> +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> +| |   |             driver   |      |      |  |    |    |   |             driver          |      |  |    |    |
> +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> +| |    |cq|                  |map   |map   |map    |    |    |cq|                         |map   |map    |    |
> +| |    |  |                  |      |      |       |    |    |  |                         |      |       |    |
> +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> +| |----|--|----------------|  device memory  |-----|    |----|--|----------------|  device memory  |-----|    |
> +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> +| |                                |               |    |                                |               |    |
> +| |                                |               |    |                                |               |    |
> +| |                                |               |    |                                |               |    |
> +| |--------------------------------+---------------|    |--------------------------------+---------------|    |
> +|                                  |                                                       |                  |
> +|                                  |                                                       |                  |
> +|                                  |------------------------------+------------------------|                  |
> +|                                                                 |                                           |
Above line is misaligned.

> +|                                                                 |                                           |
> +|                                                   --------------------------                                |
> +|                                                    | M1 |   | M2 |   | M3 |                                 |
> +|                                                   --------------------------                                |
> +|                                                                                                             |
> +|                                                                                                             |
> +|-------------------------------------------------------------------------------------------------------------|
> +\end{lstlisting}
> +
> +An ISM(Internal Shared Memory) device provides the ability to access memory
> +shared between multiple devices. This allows low-overhead communication in
> +presence of such memory. For example, memory can be shared with guests of
> +multiple virtual machines running on the same host, with each virtual machine
> +including an ism device and with the guests getting the shared memory by the ism
> +devices.
> +
> +An ism device can communicate with multiple peers simultaneously. This
> +communication can be dynamically started and ended.
> +
> +All the devices with the ability to communicate with each other form a
> +communication domain. Two devices from different communication domains can't
> +communicate.
> +
> +The device memory of the ism device is divided into multiple chunks with the
> +same size. Every ism region contains multiple chunks. When communicating between
> +two devices, an ism region is used as a shared memory.
> +
> +The ism region is the basis for communication between ism devices.
> +
> +The process of communication between two drivers is that one driver allocates an
> +ism region and obtains a token. Then the peer uses this token to attach the same
The peer driver will make it more clear.

> +ism region, the two drivers realize the memory(ism region) sharing. The driver
> +can also notify peer by kick notify-address the ism region has been updated.
> +
> +An ism region can be referred by its \field{token}, or the \field{offset}.
> +The \field{offset} is the offset of the first chunk inside the ism region
> +starting from the device memory head.
> +
> +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> +	44
> +
> +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> +\begin{description}
> +\item[0] controlq
> +\item[1] eventq
> +\end{description}
> +
> +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> +\begin{description}
> +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> +    don't need to provide memory for alloc/attach operation.
> +
> +\end{description}
> +
> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_ism_config {
> +	le128 cdid;
le128 data type is not present in the spec.
Better to define uuid data structure and refer here.

struct uuid {..}

struct uuid cdid;
...
> +	le64 devid;
> +	le64 chunk_size;
> +	le64 notify_size;
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> +    \item[\field{cdid}] This is used to identify the communication domain. Only
> +        ism devices with the same \field{cdid} can communicate. A \field{cdid}
> +        is world-wide unique in a sense that there not be another communication
> +        domain with the same \field{cdid}.
> +
> +    \item[\field{devid}] This is used to identify the ism device in the single
> +        communication domain.
> +
> +    \item[\field{chunk_size}] This is the size of the ism chunk. The device
> +        memory is divided into multiple chunks. Every ism chunk has the same
> +        size.
> +
> +    \item[\field{notify_size}] This is the size of the ism notify-address. The
> +        notify-address is used to notify the device that the content of the
> +        ism region has been updated.
> +
> +\end{description}
> +
> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> +
> +The device MUST ensure that the \field{cdid} of the device on the same
> +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> +
> +In the single communication domain, the device MUST ensure that the \field{devid}
> +is unique.
> +
> +\field{chunk_size} MUST be a power of two.
> +
> +\subsection{Physical Memory}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism region is on the device memory. It is a physical memory for driver, but
> +it is not for device. 
If it is not physical memory, than what is it? May be you want to write 
that.

> The device has the ability to modify the physical memory
> +corresponding to the device memory.
> +
Above sentence as continuation to physical memory implies that its RW 
for device and RO for driver.
I likely misunderstood.

> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, physical memory comes from driver.
> +The driver pass its memory to the device when alloc a new ism region. The
The driver passes its memory to the device when allocating a new ism region.

> +device map this physical memory into an ism region. If VIRTIO_ISM_F_DEV_MEM
> +is negotiated, physical memory is provided by the device.
> +
> +For attach operation, if VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver also
> +provides memory to device. Under normal circumstances, this memory will not be
> +used. This is to limit driver to take up too much memory based on attach
> +operation.
> +
> +\subsection{Event}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism device supports event notification of the ism region. When a device
> +kick/attach/detach a region, other ism region referrers will receive related
> +events.
> +
> +A buffer received from eventq can contain multiple event structures.
> +
> +\begin{lstlisting}
> +#define   VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> +#define   VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> +#define   VIRTIO_ISM_EVENT_DETACH (1 << 2)
> +#define   VIRTIO_ISM_EVENT_BUFFER_FREE (1 << 3)  // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +\end{lstlisting}
> +
> +\begin{description}
> +    \item[VIRTIO_ISM_EVENT_UPDATE] The driver kick the notify-address to notify
> +        other peers of the update event of the ism region content.
> +
> +    \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> +    \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> +    \item[VIRTIO_ISM_EVENT_BUFFER_FREE] The buffer allocated by this driver has been free.
> +\end{description}
> +
> +The event structures:
> +\begin{lstlisting}
> +struct virtio_ism_event_update {
> +	le64 ev_type;
> +	le64 offset;
> +	le64 devid;
> +};
> +
> +struct virtio_ism_event_attach_detach {
> +	le64 ev_type;
> +	le64 offset;
> +	le64 devid;
> +	le64 peers;
> +};
> +
> +struct virtio_ism_event_buffer_free {
> +	le64 ev_type;
> +    void *buffer;
Is this a virtual address being shared between driver and device and in 
the cvq command?
If so, do you need any attachment to process id or such abstract handle 
so that some day user process can also access it?
or the use case is to only consume by the driver, hence only one VA?

> +};
> + > +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{ev_type}] The type of event, the driver can get the size of the
> +    structure based on this.
> +
> +        If \field{ev_type} is VIRTIO_ISM_EVENT_UPDATE, the corresponding structure is struct virtio_ism_event_update.
> +        If \field{ev_type} is VIRTIO_ISM_EVENT_ATTACH or VIRTIO_ISM_EVENT_DETACH, the corresponding structure struct virtio_ism_event_attach_detach.
> +        If \field{ev_type} is VIRTIO_ISM_EVENT_BUFFER_FREE, the corresponding structure is struct virtio_ism_event_buffer_free.
> +
> +\item[\field{offset}] The offset of ism regions with the event.
> +
> +\item[\field{devid}] \field{devid} of the device that generated events.
> +\item[\field{peers}] The number of the peers referring to this ism region. (does not include the
> +    device that receiving this event)
> +
> +\item[\field{buffer}] The buffer passed to device when alloc/attach a new ism region.
> +
> +\end{description}
> +
> +\subsection{Permissions}\label{sec:Device Types / ISM Device / Permission}
> +
> +The permissions of an ism region determine whether this ism region can be
> +attached and the read and write permissions after attach. The driver can set the
> +default permissions for all devices, or set the dedicated permissions for a
> +certain device. If there are dedicated permissions for a device, the default
> +permissions are invalid for this device.
> +
> +For example, the driver can set the default permission of an ism region as
> +read-only, and set the dedicated permissions as writable for a device with devid
> +0xff. Then all devices except 0xff in such a communication domain can attach
> +this ism region and only have read-only permission. And the device 0xff can
> +write to the ism region after attaching.
> +
> +When a driver has the management permission of the ism region, then it can
> +modify the permissions of this ism region. By default, only the device that
> +allocated the ism region has this permission.
> +
> +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> +
> +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> +during reset. \field{devid} MUST NOT be 0.
> +
> +The device exposes memory to the driver based on Shared Memory Regions
> +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> +However, it does not need to allocate physical memory during initialization.
> +
> +The \field{shmid} of a region MUST be one of the following
> +\begin{lstlisting}
> +enum virtio_ism_shm_id {
> +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> +};
> +\end{lstlisting}
> +
> +The device memory area with VIRTIO_ISM_SHM_ID_REGIONS is used to provide ism regions.
> +
> +The device MUST also provide a memory area with VIRTIO_ISM_SHM_ID_NOTIFY to the
> +driver. This memory area is used to notify the device with update event, and
> +each ism region MUST have a corresponding notify-address inside this area, and
> +the size of the notify-address is \field{notify_size}.
> +
> +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The driver MUST NOT access any chunk before it is referred by one ism region.
> +
> +\subsection{Control Virtqueue}\label{sec:Device Types / ISM Device / Control Virtqueue}
> +
> +The driver uses the control virtqueue send commands to implement operations on
control virtqueue to send commands

> +the ism region and some global configurations.
> +
> +All commands are of the following form:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl {
> +	u8 class;
> +	u8 command;
> +	u8 command_specific_data[];
> +	u8 ack;
> +	u8 command_specific_data_reply[];
> +};
> +
> +/* ack values */
> +#define VIRTIO_ISM_OK      0
> +#define VIRTIO_ISM_ERR     255
> +
> +#define VIRTIO_ISM_ENOENT  2
> +#define VIRTIO_ISM_E2BIG   7
> +#define VIRTIO_ISM_ENOMEM  12
> +#define VIRTIO_ISM_ENOSPEC 28
> +
> +#define VIRTIO_ISM_PERM_EATTACH 100
> +#define VIRTIO_ISM_PERM_EREAD   101
> +#define VIRTIO_ISM_PERM_EWRITE  102
> +\end{lstlisting}
> +
> +The \field{class}, \field{command} and command-specific-data are set by the
> +driver, and the device sets the \field{ack} byte and optionally
> +\field{command-specific-data-reply}.
> +
> +\subsection{Device Operation}\label{sec:Device Types / ISM Driver / Device Operation}
> +
> +\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +Based on controlq, the driver can request to allocate an ism region.
> +
> +The reply from the device will carry a token, which can be passed
> +to other driver for attaching to this ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc {
> +	le64 size;
> +    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
align void* and le64 to same.

void * here and above doesnt define the exact data size.
device will not be able to understand bit width of void* on 64-bit and 
32-bit systems.

please change to u64.

Also I think spec follows /* */ comment style, instead of //.

> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc_reply {
> +	le64 token;
> +	le64 num;
> +    le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ALLOC  0
> +	#define VIRTIO_ISM_CTRL_ALLOC_REGION 0
> +\end{lstlisting}
> +
> +
> +\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +For the command VIRTIO_ISM_CTRL_ALLOC_REGION, the device MUST first look for the
> +free chunks from the device memory. And the physical memory is mapped to these
> +chunks in order. If VIRTIO_ISM_F_DEV_MEM is negotiated, then the device MUST
> +provide physical memory by itself. Otherwise, the memory provided by the driver
> +will be used. If there is no enough free chunk, the device MUST set \field{ack}
> +to VIRTIO_ISM_NOSPEC. If new physical memory cannot be allocated by the device,
> +the device MUST set \field{ack} to VIRTIO_ISM_NOMEM.
> +
> +The device sets \field{ack} to VIRTIO_ISM_OK after successfully allocating the
> +ism region. A new token MUST be dynamically created for this ism region
> +simultaneously. The token MUST be unique inside this communication domain. And
> +the device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks.
> +
> +The device MUST clear the memory of this ism region before committing to the
> +driver.
> +
> +If \field{size} is smaller than \field{chunk_size}, the ism region also
> +occupies one chunk.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG.
> +
> +\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we want to allocate.
> +
> +\subsubsection{Query ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Query ISM Region}
> +Based on controlq, the driver can query the information of an ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query {
> +	le64 token;
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query_reply {
> +	le64 size;
> +	le64 permissions;
Better to define the decoding of the permissions here.
Also below you have field rw_perm. May be rename both to sama name.

> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_QUERY  1
> +	#define VIRTIO_ISM_CTRL_QUERY_INFO 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Query ISM Region}{Device Types / ISM Device / Device Operation / Query ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT. Otherwise, the device MUST fill the
> +\field{size} with the size of the ism region. The device MUST get the
> +permissions of this ism region for the device-self, then fill in the
> +\field{permissions}.
> +
> +\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +Based on controlq, the driver can request to attach an ism region with a
> +specified token.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach {
> +	le64 token;
> +	le64 rw_perm;
> +    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach_reply {
> +	le64 size;
> +	le64 num;
> +    le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ATTACH  2
> +	#define VIRTIO_ISM_CTRL_ATTACH_REGION 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The device MUST first look for the free chunks from the device memory. And the
> +physical memory of the ism region is mapped to these chunks in order. If there is
> +no enough free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
> +The device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks. \field{size} is the size of this ism region.
> +
> +The device MUST set \field{ack} to VIRTIO_ISM_OK after successfully finding and
> +assigning the physical memory of this ism region into the device memory.
> +
> +If the device does not have the VIRTIO_ISM_PERM_ATTACH permission for this ism
> +region, the device MUST set \field{ack} to VIRTIO_ISM_PERM_EATTACH. If
> +\field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not have the
> +VIRTIO_ISM_PERM_READ permission for this region, the device MUST set \field{ack}
> +to VIRTIO_ISM_PERM_EREAD. If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but
> +the device does not have the VIRTIO_ISM_PERM_WRITE permission for this region,
> +the device MUST set \field{ack} to VIRTIO_ISM_PERM_EWRITE.
> +
> +The device MUST set the read and write permissions of the physical memory inside
> +the device memory based on \field{rw_perm}.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
> +
> +The device MUST generate a VIRTIO_ISM_EVENT_ATTACH event to other peers.
> +
> +\drivernormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we w

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-03-24  4:51   ` Parav Pandit
@ 2023-03-24  6:35     ` Xuan Zhuo
  2023-03-24  9:10     ` Michael S. Tsirkin
  1 sibling, 0 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-03-24  6:35 UTC (permalink / raw)
  To: Parav Pandit
  Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
	gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic,
	virtio-comment

On Fri, 24 Mar 2023 00:51:13 -0400, Parav Pandit <parav@nvidia.com> wrote:
>
>
> On 2/8/2023 10:30 PM, Xuan Zhuo wrote:
> > An ISM(Internal Shared Memory) device provides the ability to access memory
> > shared between multiple devices. This allows low-overhead communication in
> > presence of such memory. For example, memory can be shared with guests of
> > multiple virtual machines running on the same host, with each virtual machine
>
> I am still learning this new memory sharing beast, so mostly cosmetics
> comments below.
>
> instead of guest of virtual machines is confusing.
> It can be just written as - can be shared between virtual machines
> running on same host ..
> > including an ism device and with the guests getting the shared memory by the ism
> > devices.
> >
> > An ism device can communicate with multiple peers simultaneously. This
> > communication can be dynamically started and ended.
> >
> multiple peers mean, multiple peer ism device?

Yes. multiple peer ism device.

Yes, we should use device, not VM or Guest.


> Or you meant multiple vms?
>
> > |-------------------------------------------------------------------------------------------------------------|
> > | |------------------------------------------------|       |------------------------------------------------| |
> > | | Guest                                          |       | Guest                                          | |
> > | |                                                |       |                                                | |
> instead of guest, naming it VM makes it consistent with the spec and
> description.

Yes

>
> > | |   ----------------                             |       |   ----------------                             | |
> > | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
> > | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
> > | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
> > | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
> > | |    |  |                -------------------     |       |    |  |                --------------------    | |
> > | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
> > | |    |  |                -------------------     |       |    |  |                --------------------    | |
> > | |                                |               |       |                               |                | |
> > | |                                |               |       |                               |                | |
> > | | Qemu                           |               |       | Qemu                          |                | |
> > | |--------------------------------+---------------|       |-------------------------------+----------------| |
> > |                                  |                                                       |                  |
> > |                                  |                                                       |                  |
> > |                                  |------------------------------+------------------------|                  |
> > |                                                                 |                                           |
> > |                                                                 |                                           |
> > |                                                   --------------------------                                |
> > |                                                    | M1 |   | M2 |   | M3 |                                 |
> > |                                                   --------------------------                                |
> > |                                                                                                             |
> > | HOST                                                                                                        |
> > ---------------------------------------------------------------------------------------------------------------
> >
> You might want to show the event q as well next to control q as its so
> fundamental for this whole operation.


Yes. I will try.


>
> > ---
> >   conformance.tex |  26 +++
> >   content.tex     |   1 +
> >   virtio-ism.tex  | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
> >   3 files changed, 600 insertions(+)
> >   create mode 100644 virtio-ism.tex
> >
> > diff --git a/conformance.tex b/conformance.tex
> > index c3c1d3e..0a1456a 100644
> > --- a/conformance.tex
> > +++ b/conformance.tex
> > @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> >   \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
> >   \end{itemize}
> >
> > +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> > +
> > +A ISM driver MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\end{itemize}
> > +
> >   \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
> >
> >   A device MUST conform to the following normative statements:
> > @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> >   \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
> >   \end{itemize}
> >
> > +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> > +
> > +A ISM device MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> Though using control vq to configure irq is good idea, I guess
> queue_msix_vector can be used. It needs some more text that
> queue_msix_vector will not be used etc.


The irq is bind to virtio ism region not queue. So we use the control vq to
configure that.


>
> if the event queue create interface exists over ctrl vq than things tied
> up well. Else I think transport specific existing scheme more aligns to
> the spec.

The situation here is not within the scope of standardization.

Virtio ISM Region is a relatively special concept to virtio spec. It is a piece
of memory, not Queue. So there is no solution provided in the current
specification here. We share this Virtio Ism Region between multiple Device. The
interruption of the binding here is to inform the content of this memory is
updated, and it is used to notify  other devices


>
> > +\end{itemize}
> > +
> >   \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
> >   A conformant implementation MUST be either transitional or
> >   non-transitional, see \ref{intro:Legacy
> > diff --git a/content.tex b/content.tex
> > index 96f4723..fe02aec 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
> >   \input{virtio-scmi.tex}
> >   \input{virtio-gpio.tex}
> >   \input{virtio-pmem.tex}
> > +\input{virtio-ism.tex}
> >
> >   \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
> >
> > diff --git a/virtio-ism.tex b/virtio-ism.tex
> > new file mode 100644
> > index 0000000..a1720d8
> > --- /dev/null
> > +++ b/virtio-ism.tex
> > @@ -0,0 +1,573 @@
> > +\section{ISM Device}\label{sec:Device Types / ISM Device}
> > +
> > +\begin{lstlisting}
> > +|-------------------------------------------------------------------------------------------------------------|
> > +| |------------------------------------------------|    |------------------------------------------------|    |
> > +| | VM                      [M1]   [M2]   [M3]     |    | VM                             [M2]   [M3]     |    |
> > +| |                          |      |      |       |    |                                 |      |       |    |
> > +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> > +| |   |             driver   |      |      |  |    |    |   |             driver          |      |  |    |    |
> > +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> > +| |    |cq|                  |map   |map   |map    |    |    |cq|                         |map   |map    |    |
> > +| |    |  |                  |      |      |       |    |    |  |                         |      |       |    |
> > +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> > +| |----|--|----------------|  device memory  |-----|    |----|--|----------------|  device memory  |-----|    |
> > +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> > +| |                                |               |    |                                |               |    |
> > +| |                                |               |    |                                |               |    |
> > +| |                                |               |    |                                |               |    |
> > +| |--------------------------------+---------------|    |--------------------------------+---------------|    |
> > +|                                  |                                                       |                  |
> > +|                                  |                                                       |                  |
> > +|                                  |------------------------------+------------------------|                  |
> > +|                                                                 |                                           |
> Above line is misaligned.
>
> > +|                                                                 |                                           |
> > +|                                                   --------------------------                                |
> > +|                                                    | M1 |   | M2 |   | M3 |                                 |
> > +|                                                   --------------------------                                |
> > +|                                                                                                             |
> > +|                                                                                                             |
> > +|-------------------------------------------------------------------------------------------------------------|
> > +\end{lstlisting}
> > +
> > +An ISM(Internal Shared Memory) device provides the ability to access memory
> > +shared between multiple devices. This allows low-overhead communication in
> > +presence of such memory. For example, memory can be shared with guests of
> > +multiple virtual machines running on the same host, with each virtual machine
> > +including an ism device and with the guests getting the shared memory by the ism
> > +devices.
> > +
> > +An ism device can communicate with multiple peers simultaneously. This
> > +communication can be dynamically started and ended.
> > +
> > +All the devices with the ability to communicate with each other form a
> > +communication domain. Two devices from different communication domains can't
> > +communicate.
> > +
> > +The device memory of the ism device is divided into multiple chunks with the
> > +same size. Every ism region contains multiple chunks. When communicating between
> > +two devices, an ism region is used as a shared memory.
> > +
> > +The ism region is the basis for communication between ism devices.
> > +
> > +The process of communication between two drivers is that one driver allocates an
> > +ism region and obtains a token. Then the peer uses this token to attach the same
> The peer driver will make it more clear.

Yes

>
> > +ism region, the two drivers realize the memory(ism region) sharing. The driver
> > +can also notify peer by kick notify-address the ism region has been updated.
> > +
> > +An ism region can be referred by its \field{token}, or the \field{offset}.
> > +The \field{offset} is the offset of the first chunk inside the ism region
> > +starting from the device memory head.
> > +
> > +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> > +	44
> > +
> > +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> > +\begin{description}
> > +\item[0] controlq
> > +\item[1] eventq
> > +\end{description}
> > +
> > +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> > +\begin{description}
> > +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> > +    don't need to provide memory for alloc/attach operation.
> > +
> > +\end{description}
> > +
> > +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_ism_config {
> > +	le128 cdid;
> le128 data type is not present in the spec.
> Better to define uuid data structure and refer here.
>
> struct uuid {..}
>
> struct uuid cdid;
> ...


will fix

> > +	le64 devid;
> > +	le64 chunk_size;
> > +	le64 notify_size;
> > +};
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +    \item[\field{cdid}] This is used to identify the communication domain. Only
> > +        ism devices with the same \field{cdid} can communicate. A \field{cdid}
> > +        is world-wide unique in a sense that there not be another communication
> > +        domain with the same \field{cdid}.
> > +
> > +    \item[\field{devid}] This is used to identify the ism device in the single
> > +        communication domain.
> > +
> > +    \item[\field{chunk_size}] This is the size of the ism chunk. The device
> > +        memory is divided into multiple chunks. Every ism chunk has the same
> > +        size.
> > +
> > +    \item[\field{notify_size}] This is the size of the ism notify-address. The
> > +        notify-address is used to notify the device that the content of the
> > +        ism region has been updated.
> > +
> > +\end{description}
> > +
> > +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> > +
> > +The device MUST ensure that the \field{cdid} of the device on the same
> > +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> > +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> > +
> > +In the single communication domain, the device MUST ensure that the \field{devid}
> > +is unique.
> > +
> > +\field{chunk_size} MUST be a power of two.
> > +
> > +\subsection{Physical Memory}\label{sec:Device Types / ISM Device / Event}
> > +
> > +The ism region is on the device memory. It is a physical memory for driver, but
> > +it is not for device.
> If it is not physical memory, than what is it? May be you want to write
> that.


I originally wanted to express that this ISM Region is a physical memory for
the drive, but it is virtual memory for the device. But now I realize that this
sentence is meaningless. I will remove it in the next version.


>
> > The device has the ability to modify the physical memory
> > +corresponding to the device memory.
> > +
> Above sentence as continuation to physical memory implies that its RW
> for device and RO for driver.
> I likely misunderstood.
>
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, physical memory comes from driver.
> > +The driver pass its memory to the device when alloc a new ism region. The
> The driver passes its memory to the device when allocating a new ism region.
>
> > +device map this physical memory into an ism region. If VIRTIO_ISM_F_DEV_MEM
> > +is negotiated, physical memory is provided by the device.
> > +
> > +For attach operation, if VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver also
> > +provides memory to device. Under normal circumstances, this memory will not be
> > +used. This is to limit driver to take up too much memory based on attach
> > +operation.
> > +
> > +\subsection{Event}\label{sec:Device Types / ISM Device / Event}
> > +
> > +The ism device supports event notification of the ism region. When a device
> > +kick/attach/detach a region, other ism region referrers will receive related
> > +events.
> > +
> > +A buffer received from eventq can contain multiple event structures.
> > +
> > +\begin{lstlisting}
> > +#define   VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> > +#define   VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> > +#define   VIRTIO_ISM_EVENT_DETACH (1 << 2)
> > +#define   VIRTIO_ISM_EVENT_BUFFER_FREE (1 << 3)  // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +    \item[VIRTIO_ISM_EVENT_UPDATE] The driver kick the notify-address to notify
> > +        other peers of the update event of the ism region content.
> > +
> > +    \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> > +    \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> > +    \item[VIRTIO_ISM_EVENT_BUFFER_FREE] The buffer allocated by this driver has been free.
> > +\end{description}
> > +
> > +The event structures:
> > +\begin{lstlisting}
> > +struct virtio_ism_event_update {
> > +	le64 ev_type;
> > +	le64 offset;
> > +	le64 devid;
> > +};
> > +
> > +struct virtio_ism_event_attach_detach {
> > +	le64 ev_type;
> > +	le64 offset;
> > +	le64 devid;
> > +	le64 peers;
> > +};
> > +
> > +struct virtio_ism_event_buffer_free {
> > +	le64 ev_type;
> > +    void *buffer;
> Is this a virtual address being shared between driver and device and in
> the cvq command?
> If so, do you need any attachment to process id or such abstract handle
> so that some day user process can also access it?
> or the use case is to only consume by the driver, hence only one VA?

Yes, I miss something. Maybe I should add an offset.

>
> > +};
> > + > +\end{lstlisting}
> > +
> > +\begin{description}
> > +\item[\field{ev_type}] The type of event, the driver can get the size of the
> > +    structure based on this.
> > +
> > +        If \field{ev_type} is VIRTIO_ISM_EVENT_UPDATE, the corresponding structure is struct virtio_ism_event_update.
> > +        If \field{ev_type} is VIRTIO_ISM_EVENT_ATTACH or VIRTIO_ISM_EVENT_DETACH, the corresponding structure struct virtio_ism_event_attach_detach.
> > +        If \field{ev_type} is VIRTIO_ISM_EVENT_BUFFER_FREE, the corresponding structure is struct virtio_ism_event_buffer_free.
> > +
> > +\item[\field{offset}] The offset of ism regions with the event.
> > +
> > +\item[\field{devid}] \field{devid} of the device that generated events.
> > +\item[\field{peers}] The number of the peers referring to this ism region. (does not include the
> > +    device that receiving this event)
> > +
> > +\item[\field{buffer}] The buffer passed to device when alloc/attach a new ism region.
> > +
> > +\end{description}
> > +
> > +\subsection{Permissions}\label{sec:Device Types / ISM Device / Permission}
> > +
> > +The permissions of an ism region determine whether this ism region can be
> > +attached and the read and write permissions after attach. The driver can set the
> > +default permissions for all devices, or set the dedicated permissions for a
> > +certain device. If there are dedicated permissions for a device, the default
> > +permissions are invalid for this device.
> > +
> > +For example, the driver can set the default permission of an ism region as
> > +read-only, and set the dedicated permissions as writable for a device with devid
> > +0xff. Then all devices except 0xff in such a communication domain can attach
> > +this ism region and only have read-only permission. And the device 0xff can
> > +write to the ism region after attaching.
> > +
> > +When a driver has the management permission of the ism region, then it can
> > +modify the permissions of this ism region. By default, only the device that
> > +allocated the ism region has this permission.
> > +
> > +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> > +
> > +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> > +
> > +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> > +during reset. \field{devid} MUST NOT be 0.
> > +
> > +The device exposes memory to the driver based on Shared Memory Regions
> > +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> > +However, it does not need to allocate physical memory during initialization.
> > +
> > +The \field{shmid} of a region MUST be one of the following
> > +\begin{lstlisting}
> > +enum virtio_ism_shm_id {
> > +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> > +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> > +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> > +};
> > +\end{lstlisting}
> > +
> > +The device memory area with VIRTIO_ISM_SHM_ID_REGIONS is used to provide ism regions.
> > +
> > +The device MUST also provide a memory area with VIRTIO_ISM_SHM_ID_NOTIFY to the
> > +driver. This memory area is used to notify the device with update event, and
> > +each ism region MUST have a corresponding notify-address inside this area, and
> > +the size of the notify-address is \field{notify_size}.
> > +
> > +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> > +
> > +The driver MUST NOT access any chunk before it is referred by one ism region.
> > +
> > +\subsection{Control Virtqueue}\label{sec:Device Types / ISM Device / Control Virtqueue}
> > +
> > +The driver uses the control virtqueue send commands to implement operations on
> control virtqueue to send commands
>
> > +the ism region and some global configurations.
> > +
> > +All commands are of the following form:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl {
> > +	u8 class;
> > +	u8 command;
> > +	u8 command_specific_data[];
> > +	u8 ack;
> > +	u8 command_specific_data_reply[];
> > +};
> > +
> > +/* ack values */
> > +#define VIRTIO_ISM_OK      0
> > +#define VIRTIO_ISM_ERR     255
> > +
> > +#define VIRTIO_ISM_ENOENT  2
> > +#define VIRTIO_ISM_E2BIG   7
> > +#define VIRTIO_ISM_ENOMEM  12
> > +#define VIRTIO_ISM_ENOSPEC 28
> > +
> > +#define VIRTIO_ISM_PERM_EATTACH 100
> > +#define VIRTIO_ISM_PERM_EREAD   101
> > +#define VIRTIO_ISM_PERM_EWRITE  102
> > +\end{lstlisting}
> > +
> > +The \field{class}, \field{command} and command-specific-data are set by the
> > +driver, and the device sets the \field{ack} byte and optionally
> > +\field{command-specific-data-reply}.
> > +
> > +\subsection{Device Operation}\label{sec:Device Types / ISM Driver / Device Operation}
> > +
> > +\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +Based on controlq, the driver can request to allocate an ism region.
> > +
> > +The reply from the device will carry a token, which can be passed
> > +to other driver for attaching to this ism region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_alloc {
> > +	le64 size;
> > +    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> align void* and le64 to same.
>
> void * here and above doesnt define the exact data size.
> device will not be able to understand bit width of void* on 64-bit and
> 32-bit systems.
>
> please change to u64.
>
> Also I think spec follows /* */ comment style, instead of //.
>
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_alloc_reply {
> > +	le64 token;
> > +	le64 num;
> > +    le64 chunks[];
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_ALLOC  0
> > +	#define VIRTIO_ISM_CTRL_ALLOC_REGION 0
> > +\end{lstlisting}
> > +
> > +
> > +\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +For the command VIRTIO_ISM_CTRL_ALLOC_REGION, the device MUST first look for the
> > +free chunks from the device memory. And the physical memory is mapped to these
> > +chunks in order. If VIRTIO_ISM_F_DEV_MEM is negotiated, then the device MUST
> > +provide physical memory by itself. Otherwise, the memory provided by the driver
> > +will be used. If there is no enough free chunk, the device MUST set \field{ack}
> > +to VIRTIO_ISM_NOSPEC. If new physical memory cannot be allocated by the device,
> > +the device MUST set \field{ack} to VIRTIO_ISM_NOMEM.
> > +
> > +The device sets \field{ack} to VIRTIO_ISM_OK after successfully allocating the
> > +ism region. A new token MUST be dynamically created for this ism region
> > +simultaneously. The token MUST be unique inside this communication domain. And
> > +the device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> > +is the number of the chunks.
> > +
> > +The device MUST clear the memory of this ism region before committing to the
> > +driver.
> > +
> > +If \field{size} is smaller than \field{chunk_size}, the ism region also
> > +occupies one chunk.
> > +
> > +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> > +to VIRTIO_ISM_E2BIG.
> > +
> > +\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> > +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> > +region that we want to allocate.
> > +
> > +\subsubsection{Query ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Query ISM Region}
> > +Based on controlq, the driver can query the information of an ism region.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_query {
> > +	le64 token;
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_query_reply {
> > +	le64 size;
> > +	le64 permissions;
> Better to define the decoding of the permissions here.
> Also below you have field rw_perm. May be rename both to sama name.


Will fix.

Thanks.


>
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_QUERY  1
> > +	#define VIRTIO_ISM_CTRL_QUERY_INFO 0
> > +\end{lstlisting}
> > +
> > +\devicenormative{\subparagraph}{Query ISM Region}{Device Types / ISM Device / Device Operation / Query ISM Region}
> > +
> > +If the ism region specified by \field{token} does not exist, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT. Otherwise, the device MUST fill the
> > +\field{size} with the size of the ism region. The device MUST get the
> > +permissions of this ism region for the device-self, then fill in the
> > +\field{permissions}.
> > +
> > +\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +Based on controlq, the driver can request to attach an ism region with a
> > +specified token.
> > +
> > +The \field{command_specific_data} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_attach {
> > +	le64 token;
> > +	le64 rw_perm;
> > +    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{command_specific_data_reply} has the following format:
> > +\begin{lstlisting}
> > +struct virtio_ism_ctrl_attach_reply {
> > +	le64 size;
> > +	le64 num;
> > +    le64 chunks[];
> > +};
> > +\end{lstlisting}
> > +
> > +The \field{class} and \field{command}:
> > +\begin{lstlisting}
> > +#define VIRTIO_ISM_CTRL_ATTACH  2
> > +	#define VIRTIO_ISM_CTRL_ATTACH_REGION 0
> > +\end{lstlisting}
> > +
> > +\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +If the ism region specified by \field{token} does not exist, the device MUST set
> > +\field{ack} to VIRTIO_ISM_ENOENT.
> > +
> > +The device MUST first look for the free chunks from the device memory. And the
> > +physical memory of the ism region is mapped to these chunks in order. If there is
> > +no enough free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
> > +The device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> > +is the number of the chunks. \field{size} is the size of this ism region.
> > +
> > +The device MUST set \field{ack} to VIRTIO_ISM_OK after successfully finding and
> > +assigning the physical memory of this ism region into the device memory.
> > +
> > +If the device does not have the VIRTIO_ISM_PERM_ATTACH permission for this ism
> > +region, the device MUST set \field{ack} to VIRTIO_ISM_PERM_EATTACH. If
> > +\field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not have the
> > +VIRTIO_ISM_PERM_READ permission for this region, the device MUST set \field{ack}
> > +to VIRTIO_ISM_PERM_EREAD. If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but
> > +the device does not have the VIRTIO_ISM_PERM_WRITE permission for this region,
> > +the device MUST set \field{ack} to VIRTIO_ISM_PERM_EWRITE.
> > +
> > +The device MUST set the read and write permissions of the physical memory inside
> > +the device memory based on \field{rw_perm}.
> > +
> > +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> > +to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
> > +
> > +The device MUST generate a VIRTIO_ISM_EVENT_ATTACH event to other peers.
> > +
> > +\drivernormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +
> > +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> > +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> > +region that we w

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-03-24  4:51   ` Parav Pandit
  2023-03-24  6:35     ` Xuan Zhuo
@ 2023-03-24  9:10     ` Michael S. Tsirkin
  1 sibling, 0 replies; 20+ messages in thread
From: Michael S. Tsirkin @ 2023-03-24  9:10 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Xuan Zhuo, virtio-comment, hans, herongguang, zmlcc, dust.li,
	tonylu, zhenzao, helinguo, gerry, cohuck, jasowang, Jan Kiszka,
	wintera, kgraul, wenjia, jaka, hca, twinkler, raspl, virtio-dev,
	pasic

On Fri, Mar 24, 2023 at 12:51:13AM -0400, Parav Pandit wrote:
> 
> 
> On 2/8/2023 10:30 PM, Xuan Zhuo wrote:
> > An ISM(Internal Shared Memory) device provides the ability to access memory
> > shared between multiple devices. This allows low-overhead communication in
> > presence of such memory. For example, memory can be shared with guests of
> > multiple virtual machines running on the same host, with each virtual machine
> 
> I am still learning this new memory sharing beast, so mostly cosmetics
> comments below.
> 
> instead of guest of virtual machines is confusing.
> It can be just written as - can be shared between virtual machines running
> on same host ..
> > including an ism device and with the guests getting the shared memory by the ism
> > devices.
> > 
> > An ism device can communicate with multiple peers simultaneously. This
> > communication can be dynamically started and ended.
> > 
> multiple peers mean, multiple peer ism device?
> Or you meant multiple vms?
> 
> > |-------------------------------------------------------------------------------------------------------------|
> > | |------------------------------------------------|       |------------------------------------------------| |
> > | | Guest                                          |       | Guest                                          | |
> > | |                                                |       |                                                | |
> instead of guest, naming it VM makes it consistent with the spec and
> description.

Ideally we just talk about drivers. But it's just a commit log,
it does not matter.

If you are talking about software that runs then calling it VM
is wrong - VM is the interface between guest software and hypervisor.


> > | |   ----------------                             |       |   ----------------                             | |
> > | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
> > | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
> > | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
> > | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
> > | |    |  |                -------------------     |       |    |  |                --------------------    | |
> > | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
> > | |    |  |                -------------------     |       |    |  |                --------------------    | |
> > | |                                |               |       |                               |                | |
> > | |                                |               |       |                               |                | |
> > | | Qemu                           |               |       | Qemu                          |                | |
> > | |--------------------------------+---------------|       |-------------------------------+----------------| |
> > |                                  |                                                       |                  |
> > |                                  |                                                       |                  |
> > |                                  |------------------------------+------------------------|                  |
> > |                                                                 |                                           |
> > |                                                                 |                                           |
> > |                                                   --------------------------                                |
> > |                                                    | M1 |   | M2 |   | M3 |                                 |
> > |                                                   --------------------------                                |
> > |                                                                                                             |
> > | HOST                                                                                                        |
> > ---------------------------------------------------------------------------------------------------------------
> > 
> You might want to show the event q as well next to control q as its so
> fundamental for this whole operation.
> 
> > ---
> >   conformance.tex |  26 +++
> >   content.tex     |   1 +
> >   virtio-ism.tex  | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
> >   3 files changed, 600 insertions(+)
> >   create mode 100644 virtio-ism.tex
> > 
> > diff --git a/conformance.tex b/conformance.tex
> > index c3c1d3e..0a1456a 100644
> > --- a/conformance.tex
> > +++ b/conformance.tex
> > @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> >   \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
> >   \end{itemize}
> > +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> > +
> > +A ISM driver MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\end{itemize}
> > +
> >   \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
> >   A device MUST conform to the following normative statements:
> > @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> >   \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
> >   \end{itemize}
> > +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> > +
> > +A ISM device MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> > +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> Though using control vq to configure irq is good idea, I guess
> queue_msix_vector can be used. It needs some more text that
> queue_msix_vector will not be used etc.
> 
> if the event queue create interface exists over ctrl vq than things tied up
> well. Else I think transport specific existing scheme more aligns to the
> spec.
> 
> > +\end{itemize}
> > +
> >   \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
> >   A conformant implementation MUST be either transitional or
> >   non-transitional, see \ref{intro:Legacy
> > diff --git a/content.tex b/content.tex
> > index 96f4723..fe02aec 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
> >   \input{virtio-scmi.tex}
> >   \input{virtio-gpio.tex}
> >   \input{virtio-pmem.tex}
> > +\input{virtio-ism.tex}
> >   \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
> > diff --git a/virtio-ism.tex b/virtio-ism.tex
> > new file mode 100644
> > index 0000000..a1720d8
> > --- /dev/null
> > +++ b/virtio-ism.tex
> > @@ -0,0 +1,573 @@
> > +\section{ISM Device}\label{sec:Device Types / ISM Device}
> > +
> > +\begin{lstlisting}
> > +|-------------------------------------------------------------------------------------------------------------|
> > +| |------------------------------------------------|    |------------------------------------------------|    |
> > +| | VM                      [M1]   [M2]   [M3]     |    | VM                             [M2]   [M3]     |    |
> > +| |                          |      |      |       |    |                                 |      |       |    |
> > +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> > +| |   |             driver   |      |      |  |    |    |   |             driver          |      |  |    |    |
> > +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> > +| |    |cq|                  |map   |map   |map    |    |    |cq|                         |map   |map    |    |
> > +| |    |  |                  |      |      |       |    |    |  |                         |      |       |    |
> > +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> > +| |----|--|----------------|  device memory  |-----|    |----|--|----------------|  device memory  |-----|    |
> > +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> > +| |                                |               |    |                                |               |    |
> > +| |                                |               |    |                                |               |    |
> > +| |                                |               |    |                                |               |    |
> > +| |--------------------------------+---------------|    |--------------------------------+---------------|    |
> > +|                                  |                                                       |                  |
> > +|                                  |                                                       |                  |
> > +|                                  |------------------------------+------------------------|                  |
> > +|                                                                 |                                           |
> Above line is misaligned.
> 
> > +|                                                                 |                                           |
> > +|                                                   --------------------------                                |
> > +|                                                    | M1 |   | M2 |   | M3 |                                 |
> > +|                                                   --------------------------                                |
> > +|                                                                                                             |
> > +|                                                                                                             |
> > +|-------------------------------------------------------------------------------------------------------------|


And I think this is a bad way to include drawings.
There are lots of latex packages for diagrams, please
pick one.


> > +\end{lstlisting}
> > +
> > +An ISM(Internal Shared Memory) device provides the ability to access memory
> > +shared between multiple devices. This allows low-overhead communication in
> > +presence of such memory. For example, memory can be shared with guests of
> > +multiple virtual machines running on the same host, with each virtual machine
> > +including an ism device and with the guests getting the shared memory by the ism
> > +devices.
> > +
> > +An ism device can communicate with multiple peers simultaneously. This
> > +communication can be dynamically started and ended.
> > +
> > +All the devices with the ability to communicate with each other form a
> > +communication domain. Two devices from different communication domains can't
> > +communicate.
> > +
> > +The device memory of the ism device is divided into multiple chunks with the
> > +same size. Every ism region contains multiple chunks. When communicating between
> > +two devices, an ism region is used as a shared memory.
> > +
> > +The ism region is the basis for communication between ism devices.
> > +
> > +The process of communication between two drivers is that one driver allocates an
> > +ism region and obtains a token. Then the peer uses this token to attach the same
> The peer driver will make it more clear.
> 
> > +ism region, the two drivers realize the memory(ism region) sharing. The driver
> > +can also notify peer by kick notify-address the ism region has been updated.
> > +
> > +An ism region can be referred by its \field{token}, or the \field{offset}.
> > +The \field{offset} is the offset of the first chunk inside the ism region
> > +starting from the device memory head.
> > +
> > +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> > +	44
> > +
> > +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> > +\begin{description}
> > +\item[0] controlq
> > +\item[1] eventq
> > +\end{description}
> > +
> > +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> > +\begin{description}
> > +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> > +    don't need to provide memory for alloc/attach operation.
> > +
> > +\end{description}
> > +
> > +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_ism_config {
> > +	le128 cdid;
> le128 data type is not present in the spec.
> Better to define uuid data structure and refer here.
> 
> struct uuid {..}
> 
> struct uuid cdid;
> ...

Agree. And is it really little endian?

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
       [not found]       ` <7a9ebec0-5e87-b80f-4f2c-c4db7ae4fe84@linux.ibm.com>
@ 2023-04-05 12:52         ` Michael S. Tsirkin
  2023-04-07  3:22           ` Xuan Zhuo
  0 siblings, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2023-04-05 12:52 UTC (permalink / raw)
  To: Alexandra Winter
  Cc: Wen Gu, Halil Pasic, Xuan Zhuo, virtio-comment, hans,
	herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo, gerry,
	cohuck, jasowang, Jan Kiszka, kgraul, wenjia, jaka, hca,
	twinkler, raspl, virtio-dev

On Wed, Apr 05, 2023 at 02:39:53PM +0200, Alexandra Winter wrote:
> 
> 
> On 24.03.23 05:03, Wen Gu wrote:
> > 
> > 
> > On 2023/3/23 22:46, Halil Pasic wrote:
> > 
> >> On Thu,  9 Feb 2023 11:30:56 +0800
> >> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >>
> 
> ...
> 
> > 
> >> To get back to the things proposed here: the cdid is IMHO
> >> a nice thing, and is functionally corresponding to the
> >> (S)EID. But it is 16 byte wide, and I have no idea how
> >> is it supposed to be used in the CLC handshake.
> >>
> > 
> > CLC handshake carry one SEID for all the SMC-D device. Considering
> > coexistence with ISM, I am not sure whether we can change or increase
> > the SEID.. cc Alexandra
> > 
> > Thanks!
> > Wen Gu
> 
> As mentioned by others, discussions are ongoing.
> It would be great, if we can agree on a way to use the existing CLC handshake
> for SMC-D via virtio-ism and ism-loopback.
> In that case SEID needs to be unique per hardware instance, cannot be increased and
> can only be changed for x86 in a non-colliding way.
> 
> An alternative would be to define new a SMC-D(?) protocol variant/version, where we
> are free to define new fields (e.g. UUIDs).
> 
> Alexandra

Problem with tying to hardware is that it is blocking
migration (which is a challenge with ism anyway, but still).


> > 
> >> If this is really supposed to work with SMC and not just take
> >> inspiration from it, I would like some insight from our
> >> SMC experts (they are already on copy).
> >>
> >> Regards,
> >> Halil
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> >> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-04-05 12:52         ` Michael S. Tsirkin
@ 2023-04-07  3:22           ` Xuan Zhuo
  2023-04-07 11:13             ` [virtio-dev] Re: [virtio-comment] " Michael S. Tsirkin
  2023-04-10  1:23             ` Jason Wang
  0 siblings, 2 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-04-07  3:22 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Wen Gu, Halil Pasic, virtio-comment, hans, herongguang, zmlcc,
	dust.li, tonylu, zhenzao, helinguo, gerry, cohuck, jasowang,
	Jan Kiszka, kgraul, wenjia, jaka, hca, twinkler, raspl,
	virtio-dev, Alexandra Winter

On Wed, 5 Apr 2023 08:52:14 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Wed, Apr 05, 2023 at 02:39:53PM +0200, Alexandra Winter wrote:
> >
> >
> > On 24.03.23 05:03, Wen Gu wrote:
> > >
> > >
> > > On 2023/3/23 22:46, Halil Pasic wrote:
> > >
> > >> On Thu,  9 Feb 2023 11:30:56 +0800
> > >> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >>
> >
> > ...
> >
> > >
> > >> To get back to the things proposed here: the cdid is IMHO
> > >> a nice thing, and is functionally corresponding to the
> > >> (S)EID. But it is 16 byte wide, and I have no idea how
> > >> is it supposed to be used in the CLC handshake.
> > >>
> > >
> > > CLC handshake carry one SEID for all the SMC-D device. Considering
> > > coexistence with ISM, I am not sure whether we can change or increase
> > > the SEID.. cc Alexandra
> > >
> > > Thanks!
> > > Wen Gu
> >
> > As mentioned by others, discussions are ongoing.
> > It would be great, if we can agree on a way to use the existing CLC handshake
> > for SMC-D via virtio-ism and ism-loopback.
> > In that case SEID needs to be unique per hardware instance, cannot be increased and
> > can only be changed for x86 in a non-colliding way.
> >
> > An alternative would be to define new a SMC-D(?) protocol variant/version, where we
> > are free to define new fields (e.g. UUIDs).
> >
> > Alexandra
>
> Problem with tying to hardware is that it is blocking
> migration (which is a challenge with ism anyway, but still).


We don't want to support migration. At least we don't want to support it for the
time being. Because there are indeed many problems. I think Migration is not
necessary for a new Virtio device.

Thanks.


>
>
> > >
> > >> If this is really supposed to work with SMC and not just take
> > >> inspiration from it, I would like some insight from our
> > >> SMC experts (they are already on copy).
> > >>
> > >> Regards,
> > >> Halil
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > >> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-04-07  3:22           ` Xuan Zhuo
@ 2023-04-07 11:13             ` Michael S. Tsirkin
  2023-04-10  1:47               ` Xuan Zhuo
  2023-04-10  1:23             ` Jason Wang
  1 sibling, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2023-04-07 11:13 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Wen Gu, Halil Pasic, virtio-comment, hans, herongguang, zmlcc,
	dust.li, tonylu, zhenzao, helinguo, gerry, cohuck, jasowang,
	Jan Kiszka, kgraul, wenjia, jaka, hca, twinkler, raspl,
	virtio-dev, Alexandra Winter

On Fri, Apr 07, 2023 at 11:22:15AM +0800, Xuan Zhuo wrote:
> On Wed, 5 Apr 2023 08:52:14 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Wed, Apr 05, 2023 at 02:39:53PM +0200, Alexandra Winter wrote:
> > >
> > >
> > > On 24.03.23 05:03, Wen Gu wrote:
> > > >
> > > >
> > > > On 2023/3/23 22:46, Halil Pasic wrote:
> > > >
> > > >> On Thu,  9 Feb 2023 11:30:56 +0800
> > > >> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > >>
> > >
> > > ...
> > >
> > > >
> > > >> To get back to the things proposed here: the cdid is IMHO
> > > >> a nice thing, and is functionally corresponding to the
> > > >> (S)EID. But it is 16 byte wide, and I have no idea how
> > > >> is it supposed to be used in the CLC handshake.
> > > >>
> > > >
> > > > CLC handshake carry one SEID for all the SMC-D device. Considering
> > > > coexistence with ISM, I am not sure whether we can change or increase
> > > > the SEID.. cc Alexandra
> > > >
> > > > Thanks!
> > > > Wen Gu
> > >
> > > As mentioned by others, discussions are ongoing.
> > > It would be great, if we can agree on a way to use the existing CLC handshake
> > > for SMC-D via virtio-ism and ism-loopback.
> > > In that case SEID needs to be unique per hardware instance, cannot be increased and
> > > can only be changed for x86 in a non-colliding way.
> > >
> > > An alternative would be to define new a SMC-D(?) protocol variant/version, where we
> > > are free to define new fields (e.g. UUIDs).
> > >
> > > Alexandra
> >
> > Problem with tying to hardware is that it is blocking
> > migration (which is a challenge with ism anyway, but still).
> 
> 
> We don't want to support migration. At least we don't want to support it for the
> time being. Because there are indeed many problems. I think Migration is not
> necessary for a new Virtio device.
> 
> Thanks.

The specific implementation does not matter much. At the spec level we
strive to make interfaces generic so they can be reused down the road,
rather than having to invent new ones for each use-case.
Maybe we can come up with a way that let devices choose either
an existing one with a SEID or a new one with a UUID?


-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-04-07  3:22           ` Xuan Zhuo
  2023-04-07 11:13             ` [virtio-dev] Re: [virtio-comment] " Michael S. Tsirkin
@ 2023-04-10  1:23             ` Jason Wang
  2023-04-10  1:53               ` Xuan Zhuo
  1 sibling, 1 reply; 20+ messages in thread
From: Jason Wang @ 2023-04-10  1:23 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Michael S. Tsirkin, Wen Gu, Halil Pasic, virtio-comment, hans,
	herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo, gerry,
	cohuck, Jan Kiszka, kgraul, wenjia, jaka, hca, twinkler, raspl,
	virtio-dev, Alexandra Winter

On Fri, Apr 7, 2023 at 11:24 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Wed, 5 Apr 2023 08:52:14 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Wed, Apr 05, 2023 at 02:39:53PM +0200, Alexandra Winter wrote:
> > >
> > >
> > > On 24.03.23 05:03, Wen Gu wrote:
> > > >
> > > >
> > > > On 2023/3/23 22:46, Halil Pasic wrote:
> > > >
> > > >> On Thu,  9 Feb 2023 11:30:56 +0800
> > > >> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > >>
> > >
> > > ...
> > >
> > > >
> > > >> To get back to the things proposed here: the cdid is IMHO
> > > >> a nice thing, and is functionally corresponding to the
> > > >> (S)EID. But it is 16 byte wide, and I have no idea how
> > > >> is it supposed to be used in the CLC handshake.
> > > >>
> > > >
> > > > CLC handshake carry one SEID for all the SMC-D device. Considering
> > > > coexistence with ISM, I am not sure whether we can change or increase
> > > > the SEID.. cc Alexandra
> > > >
> > > > Thanks!
> > > > Wen Gu
> > >
> > > As mentioned by others, discussions are ongoing.
> > > It would be great, if we can agree on a way to use the existing CLC handshake
> > > for SMC-D via virtio-ism and ism-loopback.
> > > In that case SEID needs to be unique per hardware instance, cannot be increased and
> > > can only be changed for x86 in a non-colliding way.
> > >
> > > An alternative would be to define new a SMC-D(?) protocol variant/version, where we
> > > are free to define new fields (e.g. UUIDs).
> > >
> > > Alexandra
> >
> > Problem with tying to hardware is that it is blocking
> > migration (which is a challenge with ism anyway, but still).
>
>
> We don't want to support migration. At least we don't want to support it for the
> time being. Because there are indeed many problems. I think Migration is not
> necessary for a new Virtio device.

We'd really need to take migration into consideration when developing
new devices. It is not guaranteed that this device is backed by local
RAMs. At least we should avoid designs that may complicate future
migration support.

Thanks

>
> Thanks.
>
>
> >
> >
> > > >
> > > >> If this is really supposed to work with SMC and not just take
> > > >> inspiration from it, I would like some insight from our
> > > >> SMC experts (they are already on copy).
> > > >>
> > > >> Regards,
> > > >> Halil
> > > >>
> > > >> ---------------------------------------------------------------------
> > > >> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > >> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> >
>
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
>
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
>
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-04-07 11:13             ` [virtio-dev] Re: [virtio-comment] " Michael S. Tsirkin
@ 2023-04-10  1:47               ` Xuan Zhuo
  0 siblings, 0 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-04-10  1:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Wen Gu, Halil Pasic, virtio-comment, hans, herongguang, zmlcc,
	dust.li, tonylu, zhenzao, helinguo, gerry, cohuck, jasowang,
	Jan Kiszka, kgraul, wenjia, jaka, hca, twinkler, raspl,
	virtio-dev, Alexandra Winter

On Fri, 7 Apr 2023 07:13:13 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Fri, Apr 07, 2023 at 11:22:15AM +0800, Xuan Zhuo wrote:
> > On Wed, 5 Apr 2023 08:52:14 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Wed, Apr 05, 2023 at 02:39:53PM +0200, Alexandra Winter wrote:
> > > >
> > > >
> > > > On 24.03.23 05:03, Wen Gu wrote:
> > > > >
> > > > >
> > > > > On 2023/3/23 22:46, Halil Pasic wrote:
> > > > >
> > > > >> On Thu,  9 Feb 2023 11:30:56 +0800
> > > > >> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > >>
> > > >
> > > > ...
> > > >
> > > > >
> > > > >> To get back to the things proposed here: the cdid is IMHO
> > > > >> a nice thing, and is functionally corresponding to the
> > > > >> (S)EID. But it is 16 byte wide, and I have no idea how
> > > > >> is it supposed to be used in the CLC handshake.
> > > > >>
> > > > >
> > > > > CLC handshake carry one SEID for all the SMC-D device. Considering
> > > > > coexistence with ISM, I am not sure whether we can change or increase
> > > > > the SEID.. cc Alexandra
> > > > >
> > > > > Thanks!
> > > > > Wen Gu
> > > >
> > > > As mentioned by others, discussions are ongoing.
> > > > It would be great, if we can agree on a way to use the existing CLC handshake
> > > > for SMC-D via virtio-ism and ism-loopback.
> > > > In that case SEID needs to be unique per hardware instance, cannot be increased and
> > > > can only be changed for x86 in a non-colliding way.
> > > >
> > > > An alternative would be to define new a SMC-D(?) protocol variant/version, where we
> > > > are free to define new fields (e.g. UUIDs).
> > > >
> > > > Alexandra
> > >
> > > Problem with tying to hardware is that it is blocking
> > > migration (which is a challenge with ism anyway, but still).
> >
> >
> > We don't want to support migration. At least we don't want to support it for the
> > time being. Because there are indeed many problems. I think Migration is not
> > necessary for a new Virtio device.
> >
> > Thanks.
>
> The specific implementation does not matter much. At the spec level we
> strive to make interfaces generic so they can be reused down the road,
> rather than having to invent new ones for each use-case.
> Maybe we can come up with a way that let devices choose either
> an existing one with a SEID or a new one with a UUID?

Yes. I agree this.

From the perspective of Virtio SPEC, we can provide some information for SMC. As
long as these requirements are reasonable. At present, @Wen is working in this.

Thanks.


>
>
> --
> MST
>

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-04-10  1:23             ` Jason Wang
@ 2023-04-10  1:53               ` Xuan Zhuo
  2023-04-10  2:04                 ` Jason Wang
  0 siblings, 1 reply; 20+ messages in thread
From: Xuan Zhuo @ 2023-04-10  1:53 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, Wen Gu, Halil Pasic, virtio-comment, hans,
	herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo, gerry,
	cohuck, Jan Kiszka, kgraul, wenjia, jaka, hca, twinkler, raspl,
	virtio-dev, Alexandra Winter

On Mon, 10 Apr 2023 09:23:04 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Fri, Apr 7, 2023 at 11:24 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Wed, 5 Apr 2023 08:52:14 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Wed, Apr 05, 2023 at 02:39:53PM +0200, Alexandra Winter wrote:
> > > >
> > > >
> > > > On 24.03.23 05:03, Wen Gu wrote:
> > > > >
> > > > >
> > > > > On 2023/3/23 22:46, Halil Pasic wrote:
> > > > >
> > > > >> On Thu,  9 Feb 2023 11:30:56 +0800
> > > > >> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > >>
> > > >
> > > > ...
> > > >
> > > > >
> > > > >> To get back to the things proposed here: the cdid is IMHO
> > > > >> a nice thing, and is functionally corresponding to the
> > > > >> (S)EID. But it is 16 byte wide, and I have no idea how
> > > > >> is it supposed to be used in the CLC handshake.
> > > > >>
> > > > >
> > > > > CLC handshake carry one SEID for all the SMC-D device. Considering
> > > > > coexistence with ISM, I am not sure whether we can change or increase
> > > > > the SEID.. cc Alexandra
> > > > >
> > > > > Thanks!
> > > > > Wen Gu
> > > >
> > > > As mentioned by others, discussions are ongoing.
> > > > It would be great, if we can agree on a way to use the existing CLC handshake
> > > > for SMC-D via virtio-ism and ism-loopback.
> > > > In that case SEID needs to be unique per hardware instance, cannot be increased and
> > > > can only be changed for x86 in a non-colliding way.
> > > >
> > > > An alternative would be to define new a SMC-D(?) protocol variant/version, where we
> > > > are free to define new fields (e.g. UUIDs).
> > > >
> > > > Alexandra
> > >
> > > Problem with tying to hardware is that it is blocking
> > > migration (which is a challenge with ism anyway, but still).
> >
> >
> > We don't want to support migration. At least we don't want to support it for the
> > time being. Because there are indeed many problems. I think Migration is not
> > necessary for a new Virtio device.
>
> We'd really need to take migration into consideration when developing
> new devices. It is not guaranteed that this device is backed by local
> RAMs.

Yes, we have some considerations and researches on the aspects also, such as
CXL.

We have not planned introducing these in SPEC directly, because we dont even
have a POC. It's just that there are researches in this area and we haven't
actually operated it.

> At least we should avoid designs that may complicate future
> migration support.

Yes, we try to avoid it. If you find that we have this problem, you are welcome
to point out.

Thanks.


>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > >
> > > > >
> > > > >> If this is really supposed to work with SMC and not just take
> > > > >> inspiration from it, I would like some insight from our
> > > > >> SMC experts (they are already on copy).
> > > > >>
> > > > >> Regards,
> > > > >> Halil
> > > > >>
> > > > >> ---------------------------------------------------------------------
> > > > >> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > > >> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > >
> >
> > This publicly archived list offers a means to provide input to the
> > OASIS Virtual I/O Device (VIRTIO) TC.
> >
> > In order to verify user consent to the Feedback License terms and
> > to minimize spam in the list archive, subscription is required
> > before posting.
> >
> > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > List help: virtio-comment-help@lists.oasis-open.org
> > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > Committee: https://www.oasis-open.org/committees/virtio/
> > Join OASIS: https://www.oasis-open.org/join/
> >
>

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-04-10  1:53               ` Xuan Zhuo
@ 2023-04-10  2:04                 ` Jason Wang
  0 siblings, 0 replies; 20+ messages in thread
From: Jason Wang @ 2023-04-10  2:04 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Michael S. Tsirkin, Wen Gu, Halil Pasic, virtio-comment, hans,
	herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo, gerry,
	cohuck, Jan Kiszka, kgraul, wenjia, jaka, hca, twinkler, raspl,
	virtio-dev, Alexandra Winter

On Mon, Apr 10, 2023 at 10:01 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Mon, 10 Apr 2023 09:23:04 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > On Fri, Apr 7, 2023 at 11:24 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > On Wed, 5 Apr 2023 08:52:14 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Wed, Apr 05, 2023 at 02:39:53PM +0200, Alexandra Winter wrote:
> > > > >
> > > > >
> > > > > On 24.03.23 05:03, Wen Gu wrote:
> > > > > >
> > > > > >
> > > > > > On 2023/3/23 22:46, Halil Pasic wrote:
> > > > > >
> > > > > >> On Thu,  9 Feb 2023 11:30:56 +0800
> > > > > >> Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > >>
> > > > >
> > > > > ...
> > > > >
> > > > > >
> > > > > >> To get back to the things proposed here: the cdid is IMHO
> > > > > >> a nice thing, and is functionally corresponding to the
> > > > > >> (S)EID. But it is 16 byte wide, and I have no idea how
> > > > > >> is it supposed to be used in the CLC handshake.
> > > > > >>
> > > > > >
> > > > > > CLC handshake carry one SEID for all the SMC-D device. Considering
> > > > > > coexistence with ISM, I am not sure whether we can change or increase
> > > > > > the SEID.. cc Alexandra
> > > > > >
> > > > > > Thanks!
> > > > > > Wen Gu
> > > > >
> > > > > As mentioned by others, discussions are ongoing.
> > > > > It would be great, if we can agree on a way to use the existing CLC handshake
> > > > > for SMC-D via virtio-ism and ism-loopback.
> > > > > In that case SEID needs to be unique per hardware instance, cannot be increased and
> > > > > can only be changed for x86 in a non-colliding way.
> > > > >
> > > > > An alternative would be to define new a SMC-D(?) protocol variant/version, where we
> > > > > are free to define new fields (e.g. UUIDs).
> > > > >
> > > > > Alexandra
> > > >
> > > > Problem with tying to hardware is that it is blocking
> > > > migration (which is a challenge with ism anyway, but still).
> > >
> > >
> > > We don't want to support migration. At least we don't want to support it for the
> > > time being. Because there are indeed many problems. I think Migration is not
> > > necessary for a new Virtio device.
> >
> > We'd really need to take migration into consideration when developing
> > new devices. It is not guaranteed that this device is backed by local
> > RAMs.
>
> Yes, we have some considerations and researches on the aspects also, such as
> CXL.
>
> We have not planned introducing these in SPEC directly, because we dont even
> have a POC. It's just that there are researches in this area and we haven't
> actually operated it.

Good to know that.

>
> > At least we should avoid designs that may complicate future
> > migration support.
>
> Yes, we try to avoid it. If you find that we have this problem, you are welcome
> to point out.

That's fine, this device is really interesting (but I haven't had time
in going though this again).

Thanks

>
> Thanks.
>
>
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > >
> > > > > >
> > > > > >> If this is really supposed to work with SMC and not just take
> > > > > >> inspiration from it, I would like some insight from our
> > > > > >> SMC experts (they are already on copy).
> > > > > >>
> > > > > >> Regards,
> > > > > >> Halil
> > > > > >>
> > > > > >> ---------------------------------------------------------------------
> > > > > >> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > > > >> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > > >
> > >
> > > This publicly archived list offers a means to provide input to the
> > > OASIS Virtual I/O Device (VIRTIO) TC.
> > >
> > > In order to verify user consent to the Feedback License terms and
> > > to minimize spam in the list archive, subscription is required
> > > before posting.
> > >
> > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > List help: virtio-comment-help@lists.oasis-open.org
> > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > Committee: https://www.oasis-open.org/committees/virtio/
> > > Join OASIS: https://www.oasis-open.org/join/
> > >
> >
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism
  2023-02-09  3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
                     ` (3 preceding siblings ...)
  2023-03-24  4:51   ` Parav Pandit
@ 2023-04-26  7:41   ` Xuan Zhuo
  4 siblings, 0 replies; 20+ messages in thread
From: Xuan Zhuo @ 2023-04-26  7:41 UTC (permalink / raw)
  To: virtio-comment
  Cc: hans, herongguang, zmlcc, dust.li, tonylu, zhenzao, helinguo,
	gerry, mst, cohuck, jasowang, Jan Kiszka, wintera, kgraul,
	wenjia, jaka, hca, twinkler, raspl, virtio-dev, pasic

Any commnets for this.

Thanks.

On Thu,  9 Feb 2023 11:30:56 +0800, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> An ISM(Internal Shared Memory) device provides the ability to access memory
> shared between multiple devices. This allows low-overhead communication in
> presence of such memory. For example, memory can be shared with guests of
> multiple virtual machines running on the same host, with each virtual machine
> including an ism device and with the guests getting the shared memory by the ism
> devices.
>
> An ism device can communicate with multiple peers simultaneously. This
> communication can be dynamically started and ended.
>
> |-------------------------------------------------------------------------------------------------------------|
> | |------------------------------------------------|       |------------------------------------------------| |
> | | Guest                                          |       | Guest                                          | |
> | |                                                |       |                                                | |
> | |   ----------------                             |       |   ----------------                             | |
> | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
> | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
> | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
> | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
> | |    |  |                -------------------     |       |    |  |                --------------------    | |
> | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
> | |    |  |                -------------------     |       |    |  |                --------------------    | |
> | |                                |               |       |                               |                | |
> | |                                |               |       |                               |                | |
> | | Qemu                           |               |       | Qemu                          |                | |
> | |--------------------------------+---------------|       |-------------------------------+----------------| |
> |                                  |                                                       |                  |
> |                                  |                                                       |                  |
> |                                  |------------------------------+------------------------|                  |
> |                                                                 |                                           |
> |                                                                 |                                           |
> |                                                   --------------------------                                |
> |                                                    | M1 |   | M2 |   | M3 |                                 |
> |                                                   --------------------------                                |
> |                                                                                                             |
> | HOST                                                                                                        |
> ---------------------------------------------------------------------------------------------------------------
>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
> Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
> Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
> Signed-off-by: Hans Zhang <hans@linux.alibaba.com>
> Signed-off-by: He Rongguang <herongguang@linux.alibaba.com>
> Suggested-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  conformance.tex |  26 +++
>  content.tex     |   1 +
>  virtio-ism.tex  | 573 ++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 600 insertions(+)
>  create mode 100644 virtio-ism.tex
>
> diff --git a/conformance.tex b/conformance.tex
> index c3c1d3e..0a1456a 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -335,6 +335,17 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{drivernormative:Device Types / PMEM Device / Device Initialization}
>  \end{itemize}
>
> +\conformance{\subsection}{ISM Driver Conformance}\label{sec:Conformance / Driver Conformance / ISM Driver Conformance}
> +
> +A ISM driver MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{drivernormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\end{itemize}
> +
>  \conformance{\section}{Device Conformance}\label{sec:Conformance / Device Conformance}
>
>  A device MUST conform to the following normative statements:
> @@ -621,6 +632,21 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{devicenormative:Device Types / PMEM Device / Device Operation / Virtqueue return}
>  \end{itemize}
>
> +\conformance{\subsection}{ISM Device Conformance}\label{sec:Conformance / Device Conformance / ISM Device Conformance}
> +
> +A ISM device MUST conform to the following normative statements:
> +
> +\begin{itemize}
> +\item \ref{devicenormative:Device Types / ISM Device / Device configuration layout}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Initialization}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Query ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Grant ISM Region}
> +\item \ref{devicenormative:Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> +\end{itemize}
> +
>  \conformance{\section}{Legacy Interface: Transitional Device and Transitional Driver Conformance}\label{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}
>  A conformant implementation MUST be either transitional or
>  non-transitional, see \ref{intro:Legacy
> diff --git a/content.tex b/content.tex
> index 96f4723..fe02aec 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -7545,6 +7545,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
>  \input{virtio-scmi.tex}
>  \input{virtio-gpio.tex}
>  \input{virtio-pmem.tex}
> +\input{virtio-ism.tex}
>
>  \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>
> diff --git a/virtio-ism.tex b/virtio-ism.tex
> new file mode 100644
> index 0000000..a1720d8
> --- /dev/null
> +++ b/virtio-ism.tex
> @@ -0,0 +1,573 @@
> +\section{ISM Device}\label{sec:Device Types / ISM Device}
> +
> +\begin{lstlisting}
> +|-------------------------------------------------------------------------------------------------------------|
> +| |------------------------------------------------|    |------------------------------------------------|    |
> +| | VM                      [M1]   [M2]   [M3]     |    | VM                             [M2]   [M3]     |    |
> +| |                          |      |      |       |    |                                 |      |       |    |
> +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> +| |   |             driver   |      |      |  |    |    |   |             driver          |      |  |    |    |
> +| |   -----------------------|------|------|---    |    |   ------------------------------|------|---    |    |
> +| |    |cq|                  |map   |map   |map    |    |    |cq|                         |map   |map    |    |
> +| |    |  |                  |      |      |       |    |    |  |                         |      |       |    |
> +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> +| |----|--|----------------|  device memory  |-----|    |----|--|----------------|  device memory  |-----|    |
> +| |    |  |                -------------------     |    |    |  |                -------------------     |    |
> +| |                                |               |    |                                |               |    |
> +| |                                |               |    |                                |               |    |
> +| |                                |               |    |                                |               |    |
> +| |--------------------------------+---------------|    |--------------------------------+---------------|    |
> +|                                  |                                                       |                  |
> +|                                  |                                                       |                  |
> +|                                  |------------------------------+------------------------|                  |
> +|                                                                 |                                           |
> +|                                                                 |                                           |
> +|                                                   --------------------------                                |
> +|                                                    | M1 |   | M2 |   | M3 |                                 |
> +|                                                   --------------------------                                |
> +|                                                                                                             |
> +|                                                                                                             |
> +|-------------------------------------------------------------------------------------------------------------|
> +\end{lstlisting}
> +
> +An ISM(Internal Shared Memory) device provides the ability to access memory
> +shared between multiple devices. This allows low-overhead communication in
> +presence of such memory. For example, memory can be shared with guests of
> +multiple virtual machines running on the same host, with each virtual machine
> +including an ism device and with the guests getting the shared memory by the ism
> +devices.
> +
> +An ism device can communicate with multiple peers simultaneously. This
> +communication can be dynamically started and ended.
> +
> +All the devices with the ability to communicate with each other form a
> +communication domain. Two devices from different communication domains can't
> +communicate.
> +
> +The device memory of the ism device is divided into multiple chunks with the
> +same size. Every ism region contains multiple chunks. When communicating between
> +two devices, an ism region is used as a shared memory.
> +
> +The ism region is the basis for communication between ism devices.
> +
> +The process of communication between two drivers is that one driver allocates an
> +ism region and obtains a token. Then the peer uses this token to attach the same
> +ism region, the two drivers realize the memory(ism region) sharing. The driver
> +can also notify peer by kick notify-address the ism region has been updated.
> +
> +An ism region can be referred by its \field{token}, or the \field{offset}.
> +The \field{offset} is the offset of the first chunk inside the ism region
> +starting from the device memory head.
> +
> +\subsection{Device ID}\label{sec:Device Types / ISM Device / Device ID}
> +	44
> +
> +\subsection{Virtqueues}\label{sec:Device Types / ISM Device / Virtqueues}
> +\begin{description}
> +\item[0] controlq
> +\item[1] eventq
> +\end{description}
> +
> +\subsection{Feature bits}\label{sec:Device Types / ISM Device / Feature bits}
> +\begin{description}
> +\item[VIRTIO_ISM_F_DEV_MEM(0)] Device provide memory for ism region, driver
> +    don't need to provide memory for alloc/attach operation.
> +
> +\end{description}
> +
> +\subsection{Device configuration layout}\label{sec:Device Types / ISM Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_ism_config {
> +	le128 cdid;
> +	le64 devid;
> +	le64 chunk_size;
> +	le64 notify_size;
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> +    \item[\field{cdid}] This is used to identify the communication domain. Only
> +        ism devices with the same \field{cdid} can communicate. A \field{cdid}
> +        is world-wide unique in a sense that there not be another communication
> +        domain with the same \field{cdid}.
> +
> +    \item[\field{devid}] This is used to identify the ism device in the single
> +        communication domain.
> +
> +    \item[\field{chunk_size}] This is the size of the ism chunk. The device
> +        memory is divided into multiple chunks. Every ism chunk has the same
> +        size.
> +
> +    \item[\field{notify_size}] This is the size of the ism notify-address. The
> +        notify-address is used to notify the device that the content of the
> +        ism region has been updated.
> +
> +\end{description}
> +
> +\devicenormative{\subsubsection}{Device configuration layout}{Device Types / ISM Device / Device configuration layout}
> +
> +The device MUST ensure that the \field{cdid} of the device on the same
> +communication domain is same. The \field{cdid} MUST be a version 4 UUID as
> +specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> +
> +In the single communication domain, the device MUST ensure that the \field{devid}
> +is unique.
> +
> +\field{chunk_size} MUST be a power of two.
> +
> +\subsection{Physical Memory}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism region is on the device memory. It is a physical memory for driver, but
> +it is not for device. The device has the ability to modify the physical memory
> +corresponding to the device memory.
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, physical memory comes from driver.
> +The driver pass its memory to the device when alloc a new ism region. The
> +device map this physical memory into an ism region. If VIRTIO_ISM_F_DEV_MEM
> +is negotiated, physical memory is provided by the device.
> +
> +For attach operation, if VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver also
> +provides memory to device. Under normal circumstances, this memory will not be
> +used. This is to limit driver to take up too much memory based on attach
> +operation.
> +
> +\subsection{Event}\label{sec:Device Types / ISM Device / Event}
> +
> +The ism device supports event notification of the ism region. When a device
> +kick/attach/detach a region, other ism region referrers will receive related
> +events.
> +
> +A buffer received from eventq can contain multiple event structures.
> +
> +\begin{lstlisting}
> +#define   VIRTIO_ISM_EVENT_UPDATE (1 << 0)
> +#define   VIRTIO_ISM_EVENT_ATTACH (1 << 1)
> +#define   VIRTIO_ISM_EVENT_DETACH (1 << 2)
> +#define   VIRTIO_ISM_EVENT_BUFFER_FREE (1 << 3)  // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +\end{lstlisting}
> +
> +\begin{description}
> +    \item[VIRTIO_ISM_EVENT_UPDATE] The driver kick the notify-address to notify
> +        other peers of the update event of the ism region content.
> +
> +    \item[VIRTIO_ISM_EVENT_ATTACH] A new device attaches the ism region.
> +    \item[VIRTIO_ISM_EVENT_DETACH] A device detaches the ism region.
> +    \item[VIRTIO_ISM_EVENT_BUFFER_FREE] The buffer allocated by this driver has been free.
> +\end{description}
> +
> +The event structures:
> +\begin{lstlisting}
> +struct virtio_ism_event_update {
> +	le64 ev_type;
> +	le64 offset;
> +	le64 devid;
> +};
> +
> +struct virtio_ism_event_attach_detach {
> +	le64 ev_type;
> +	le64 offset;
> +	le64 devid;
> +	le64 peers;
> +};
> +
> +struct virtio_ism_event_buffer_free {
> +	le64 ev_type;
> +    void *buffer;
> +};
> +
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{ev_type}] The type of event, the driver can get the size of the
> +    structure based on this.
> +
> +        If \field{ev_type} is VIRTIO_ISM_EVENT_UPDATE, the corresponding structure is struct virtio_ism_event_update.
> +        If \field{ev_type} is VIRTIO_ISM_EVENT_ATTACH or VIRTIO_ISM_EVENT_DETACH, the corresponding structure struct virtio_ism_event_attach_detach.
> +        If \field{ev_type} is VIRTIO_ISM_EVENT_BUFFER_FREE, the corresponding structure is struct virtio_ism_event_buffer_free.
> +
> +\item[\field{offset}] The offset of ism regions with the event.
> +
> +\item[\field{devid}] \field{devid} of the device that generated events.
> +\item[\field{peers}] The number of the peers referring to this ism region. (does not include the
> +    device that receiving this event)
> +
> +\item[\field{buffer}] The buffer passed to device when alloc/attach a new ism region.
> +
> +\end{description}
> +
> +\subsection{Permissions}\label{sec:Device Types / ISM Device / Permission}
> +
> +The permissions of an ism region determine whether this ism region can be
> +attached and the read and write permissions after attach. The driver can set the
> +default permissions for all devices, or set the dedicated permissions for a
> +certain device. If there are dedicated permissions for a device, the default
> +permissions are invalid for this device.
> +
> +For example, the driver can set the default permission of an ism region as
> +read-only, and set the dedicated permissions as writable for a device with devid
> +0xff. Then all devices except 0xff in such a communication domain can attach
> +this ism region and only have read-only permission. And the device 0xff can
> +write to the ism region after attaching.
> +
> +When a driver has the management permission of the ism region, then it can
> +modify the permissions of this ism region. By default, only the device that
> +allocated the ism region has this permission.
> +
> +\subsection{Device Initialization}\label{sec:Device Types / ISM Device / Device Initialization}
> +
> +\devicenormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The device MUST generate a \field{devid}. \field{devid} remains unchanged
> +during reset. \field{devid} MUST NOT be 0.
> +
> +The device exposes memory to the driver based on Shared Memory Regions
> +\ref{sec:Basic Facilities of a Virtio Device / Shared Memory Regions}.
> +However, it does not need to allocate physical memory during initialization.
> +
> +The \field{shmid} of a region MUST be one of the following
> +\begin{lstlisting}
> +enum virtio_ism_shm_id {
> +	VIRTIO_ISM_SHM_ID_UNDEFINED = 0,
> +	VIRTIO_ISM_SHM_ID_REGIONS   = 1,
> +	VIRTIO_ISM_SHM_ID_NOTIFY    = 2,
> +};
> +\end{lstlisting}
> +
> +The device memory area with VIRTIO_ISM_SHM_ID_REGIONS is used to provide ism regions.
> +
> +The device MUST also provide a memory area with VIRTIO_ISM_SHM_ID_NOTIFY to the
> +driver. This memory area is used to notify the device with update event, and
> +each ism region MUST have a corresponding notify-address inside this area, and
> +the size of the notify-address is \field{notify_size}.
> +
> +\drivernormative{\subsubsection}{Device Initialization}{Device Types / ISM Device / Device Initialization}
> +
> +The driver MUST NOT access any chunk before it is referred by one ism region.
> +
> +\subsection{Control Virtqueue}\label{sec:Device Types / ISM Device / Control Virtqueue}
> +
> +The driver uses the control virtqueue send commands to implement operations on
> +the ism region and some global configurations.
> +
> +All commands are of the following form:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl {
> +	u8 class;
> +	u8 command;
> +	u8 command_specific_data[];
> +	u8 ack;
> +	u8 command_specific_data_reply[];
> +};
> +
> +/* ack values */
> +#define VIRTIO_ISM_OK      0
> +#define VIRTIO_ISM_ERR     255
> +
> +#define VIRTIO_ISM_ENOENT  2
> +#define VIRTIO_ISM_E2BIG   7
> +#define VIRTIO_ISM_ENOMEM  12
> +#define VIRTIO_ISM_ENOSPEC 28
> +
> +#define VIRTIO_ISM_PERM_EATTACH 100
> +#define VIRTIO_ISM_PERM_EREAD   101
> +#define VIRTIO_ISM_PERM_EWRITE  102
> +\end{lstlisting}
> +
> +The \field{class}, \field{command} and command-specific-data are set by the
> +driver, and the device sets the \field{ack} byte and optionally
> +\field{command-specific-data-reply}.
> +
> +\subsection{Device Operation}\label{sec:Device Types / ISM Driver / Device Operation}
> +
> +\subsubsection{Alloc ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +Based on controlq, the driver can request to allocate an ism region.
> +
> +The reply from the device will carry a token, which can be passed
> +to other driver for attaching to this ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc {
> +	le64 size;
> +    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_alloc_reply {
> +	le64 token;
> +	le64 num;
> +    le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ALLOC  0
> +	#define VIRTIO_ISM_CTRL_ALLOC_REGION 0
> +\end{lstlisting}
> +
> +
> +\devicenormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +For the command VIRTIO_ISM_CTRL_ALLOC_REGION, the device MUST first look for the
> +free chunks from the device memory. And the physical memory is mapped to these
> +chunks in order. If VIRTIO_ISM_F_DEV_MEM is negotiated, then the device MUST
> +provide physical memory by itself. Otherwise, the memory provided by the driver
> +will be used. If there is no enough free chunk, the device MUST set \field{ack}
> +to VIRTIO_ISM_NOSPEC. If new physical memory cannot be allocated by the device,
> +the device MUST set \field{ack} to VIRTIO_ISM_NOMEM.
> +
> +The device sets \field{ack} to VIRTIO_ISM_OK after successfully allocating the
> +ism region. A new token MUST be dynamically created for this ism region
> +simultaneously. The token MUST be unique inside this communication domain. And
> +the device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks.
> +
> +The device MUST clear the memory of this ism region before committing to the
> +driver.
> +
> +If \field{size} is smaller than \field{chunk_size}, the ism region also
> +occupies one chunk.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG.
> +
> +\drivernormative{\subparagraph}{Alloc ISM Region}{Device Types / ISM Device / Device Operation / Alloc ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we want to allocate.
> +
> +\subsubsection{Query ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Query ISM Region}
> +Based on controlq, the driver can query the information of an ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query {
> +	le64 token;
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_query_reply {
> +	le64 size;
> +	le64 permissions;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_QUERY  1
> +	#define VIRTIO_ISM_CTRL_QUERY_INFO 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Query ISM Region}{Device Types / ISM Device / Device Operation / Query ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT. Otherwise, the device MUST fill the
> +\field{size} with the size of the ism region. The device MUST get the
> +permissions of this ism region for the device-self, then fill in the
> +\field{permissions}.
> +
> +\subsubsection{Attach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +Based on controlq, the driver can request to attach an ism region with a
> +specified token.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach {
> +	le64 token;
> +	le64 rw_perm;
> +    void *buffer; // Only if VIRTIO_ISM_F_DEV_MEM is not negotiated
> +};
> +\end{lstlisting}
> +
> +The \field{command_specific_data_reply} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_attach_reply {
> +	le64 size;
> +	le64 num;
> +    le64 chunks[];
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_ATTACH  2
> +	#define VIRTIO_ISM_CTRL_ATTACH_REGION 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If the ism region specified by \field{token} does not exist, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The device MUST first look for the free chunks from the device memory. And the
> +physical memory of the ism region is mapped to these chunks in order. If there is
> +no enough free chunk, the device MUST set \field{ack} to VIRTIO_ISM_NOSPEC.
> +The device MUST fill \filed{chunks} with the offset of each chunk. \field{num}
> +is the number of the chunks. \field{size} is the size of this ism region.
> +
> +The device MUST set \field{ack} to VIRTIO_ISM_OK after successfully finding and
> +assigning the physical memory of this ism region into the device memory.
> +
> +If the device does not have the VIRTIO_ISM_PERM_ATTACH permission for this ism
> +region, the device MUST set \field{ack} to VIRTIO_ISM_PERM_EATTACH. If
> +\field{rw_perm} include VIRTIO_ISM_PERM_READ, but the device does not have the
> +VIRTIO_ISM_PERM_READ permission for this region, the device MUST set \field{ack}
> +to VIRTIO_ISM_PERM_EREAD. If \field{rw_perm} include VIRTIO_ISM_PERM_WRITE, but
> +the device does not have the VIRTIO_ISM_PERM_WRITE permission for this region,
> +the device MUST set \field{ack} to VIRTIO_ISM_PERM_EWRITE.
> +
> +The device MUST set the read and write permissions of the physical memory inside
> +the device memory based on \field{rw_perm}.
> +
> +If virtio_ism_ctrl_alloc_reply is not enough, the device MUST set the \field{ack}
> +to VIRTIO_ISM_E2BIG and set the \field{num} for next request.
> +
> +The device MUST generate a VIRTIO_ISM_EVENT_ATTACH event to other peers.
> +
> +\drivernormative{\subparagraph}{Attach ISM Region}{Device Types / ISM Device / Device Operation / Attach ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated, the driver MUST provide the buffer
> +with \field{buffer}. The size of the buffer MUST equal to the size of the ism
> +region that we want to allocate.
> +
> +If \field {size} is less than \field{num} times \field {chunk_size}, then only
> +header part of the last chunk is effective.
> +
> +\subsubsection{Detach ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Detach ISM Region}
> +Based on controlq, the device can release references to the ism region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_detach {
> +	le64 token;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_DETACH  3
> +	#define VIRTIO_ISM_CTRL_DETACH_REGION 0
> +\end{lstlisting}
> +
> +\devicenormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> +
> +If the ism region specified by \field{token} is not attached, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The device MUST generate a VIRTIO_ISM_EVENT_DETACH event to other peers.
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated and no one is referred to this ism
> +region, the device who allocated this ism region MUST generate a
> +VIRTIO_ISM_EVENT_BUFFER_FREE event to the driver.
> +
> +\drivernormative{\subparagraph}{Detach ISM Region}{Device Types / ISM Device / Device Operation / Detach ISM Region}
> +
> +If VIRTIO_ISM_F_DEV_MEM is not negotiated and the driver got the ism region by
> +attach operation, then the buffer is free after the detach operation.
> +
> +\subsubsection{Grant ISM Region}\label{sec:Device Types / ISM Device / Device Operation / Grant ISM Region}
> +Based on controlq, the driver can set the permissions for each ism
> +region.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_grant_default {
> +	le64 token;
> +	le64 permissions;
> +};
> +
> +struct virtio_ism_ctrl_grant_dev {
> +	le64 token;
> +	le64 permissions;
> +	le64 peer_devid;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_GRANT  4
> +	#define VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT    0
> +	#define VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE 1
> +\end{lstlisting}
> +
> +The permissions:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_PERM_READ       (1 << 0)
> +#define VIRTIO_ISM_PERM_WRITE      (1 << 1)
> +
> +#define VIRTIO_ISM_PERM_ATTACH     (1 << 2)
> +
> +#define VIRTIO_ISM_PERM_MANAGE     (1 << 3)
> +#define VIRTIO_ISM_PERM_CLEAN_DEFAULT     (1 << 4)
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[VIRTIO_ISM_PERM_READ] read permission
> +\item[VIRTIO_ISM_PERM_WRITE] write permission
> +\item[VIRTIO_ISM_PERM_ATTACH] attach permission
> +
> +\item[VIRTIO_ISM_PERM_MANAGE] Management permission, the device with this
> +	permission can modify the permission of this ism region. By default, only
> +	the device that allocated the region has this permission.
> +
> +\item[VIRTIO_ISM_PERM_DENY_DEFAULT] Clean all permissions of default, just used
> +    with VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE.
> +
> +\end{description}
> +
> +VIRTIO_ISM_CTRL_GRANT_SET_DEFAULT corresponds to virtio_ism_ctrl_grant_default;
> +VIRTIO_ISM_CTRL_GRANT_SET_FOR_DEVICE corresponds to virtio_ism_ctrl_grant_dev;
> +
> +Permission control is divided into two categories, one is the dedicated
> +permissions for a certain device, and the other is the default permissions for
> +all device.
> +
> +\devicenormative{\subparagraph}{Grant ISM Region}{Device Types / ISM Device / Device Operation / Grant ISM Region}
> +
> +If the ism region specified by \field{token} is not attached, the device MUST set
> +\field{ack} to VIRTIO_ISM_ENOENT.
> +
> +The default permissions of the newly allocated ism region MUST be  (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH).
> +The device that allocated the ism region MUST have the permissions (VIRTIO_ISM_PERM_READ | VIRTIO_ISM_PERM_WRITE | VIRTIO_ISM_PERM_ATTACH | VIRTIO_ISM_PERM_MANAGE) for this region.
> +
> +For a certain device, if there is dedicated permissions, the default
> +permissions are illegal to this device.
> +
> +\subsubsection{Inform Update Event IRQ Vector}\label{sec:Device Types / ISM Device / Device Operation / Inform Update Event IRQ Vector}
> +
> +For the ism region update event, the driver can bind an interrupt to the ism
> +region. Then this ism region's update event will no longer be passed through
> +eventq, but the interrupt will be directly triggered.
> +
> +The \field{command_specific_data} has the following format:
> +\begin{lstlisting}
> +struct virtio_ism_ctrl_irq_vector {
> +	le64 token;
> +	le64 vector;
> +};
> +\end{lstlisting}
> +
> +The \field{class} and \field{command}:
> +\begin{lstlisting}
> +#define VIRTIO_ISM_CTRL_EVENT_VECTOR  5
> +	#define VIRTIO_ISM_CTRL_EVENT_VECTOR_SET 0
> +\end{lstlisting}
> +
> +
> +\devicenormative{\subparagraph}{Inform Event IRQ Vector}{Device Types / ISM Device / Device Operation / Inform Event IRQ Vector}
> +
> +The device MUST record the relationship between the ism region and the vector
> +notified by the driver, and notify the update event of this region to driver
> +by the corresponding vector. And the device MUST NOT use eventq to send the
> +update event of this ism region.
> +
> +
> --
> 2.32.0.3.g01195cf9f
>

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-04-26  7:42 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-09  3:30 [PATCH v3 0/1] introduce virtio-ism: internal shared memory device Xuan Zhuo
2023-02-09  3:30 ` [PATCH v3 1/1] virtio-ism: introduce new device virtio-ism Xuan Zhuo
2023-02-09  3:35   ` [virtio-comment] " Parav Pandit
2023-02-09  3:36     ` Xuan Zhuo
2023-03-07 11:15   ` [virtio-dev] " Xuan Zhuo
2023-03-15 11:15     ` Xuan Zhuo
2023-03-23 14:46   ` [virtio-dev] Re: [virtio-comment] " Halil Pasic
2023-03-24  3:08     ` Xuan Zhuo
2023-03-24  4:03     ` Wen Gu
     [not found]       ` <7a9ebec0-5e87-b80f-4f2c-c4db7ae4fe84@linux.ibm.com>
2023-04-05 12:52         ` Michael S. Tsirkin
2023-04-07  3:22           ` Xuan Zhuo
2023-04-07 11:13             ` [virtio-dev] Re: [virtio-comment] " Michael S. Tsirkin
2023-04-10  1:47               ` Xuan Zhuo
2023-04-10  1:23             ` Jason Wang
2023-04-10  1:53               ` Xuan Zhuo
2023-04-10  2:04                 ` Jason Wang
2023-03-24  4:51   ` Parav Pandit
2023-03-24  6:35     ` Xuan Zhuo
2023-03-24  9:10     ` Michael S. Tsirkin
2023-04-26  7:41   ` [virtio-dev] " Xuan Zhuo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).