From: marcandre.lureau@redhat.com
To: peter.maydell@linaro.org
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>,
	qemu-devel@nongnu.org,
	"David Marchand" <david.marchand@6wind.com>
Subject: [Qemu-devel] [PULL 35/48] docs: update ivshmem device spec
Date: Tue,  6 Oct 2015 21:19:31 +0200
Message-ID: <1444159184-18153-36-git-send-email-marcandre.lureau@redhat.com>
In-Reply-To: <1444159184-18153-1-git-send-email-marcandre.lureau@redhat.com>

From: David Marchand <david.marchand@6wind.com>

Add some notes on the parts needed to use ivshmem devices: more specifically,
explain the purpose of an ivshmem server and the basic concept to use the
ivshmem devices in guests.
Move some parts of the documentation and re-organise it.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 docs/specs/ivshmem_device_spec.txt | 124 +++++++++++++++++++++++++++----------
 1 file changed, 93 insertions(+), 31 deletions(-)

diff --git a/docs/specs/ivshmem_device_spec.txt b/docs/specs/ivshmem_device_spec.txt
index 667a862..12f338e 100644
--- a/docs/specs/ivshmem_device_spec.txt
+++ b/docs/specs/ivshmem_device_spec.txt
@@ -2,30 +2,103 @@
 Device Specification for Inter-VM shared memory device
 ------------------------------------------------------
 
-The Inter-VM shared memory device is designed to share a region of memory to
-userspace in multiple virtual guests.  The memory region does not belong to any
-guest, but is a POSIX memory object on the host.  Optionally, the device may
-support sending interrupts to other guests sharing the same memory region.
+The Inter-VM shared memory device is designed to share a memory region (created
+on the host via the POSIX shared memory API) between multiple QEMU processes
+running different guests. In order for all guests to be able to pick up the
+shared memory area, it is modeled by QEMU as a PCI device exposing said memory
+to the guest as a PCI BAR.
+The memory region does not belong to any guest, but is a POSIX memory object on
+the host. The host can access this shared memory if needed.
+
+The device also provides an optional communication mechanism between guests
+sharing the same memory object. For more details, see the 'Guest to guest
+communication' section.
 
 
 The Inter-VM PCI device
 -----------------------
 
-*BARs*
+From the VM point of view, the ivshmem PCI device supports three BARs.
+
+- BAR0 is a 1 Kbyte MMIO region to support registers and interrupts when MSI is
+  not used.
+- BAR1 is used for MSI-X when it is enabled in the device.
+- BAR2 is used to access the shared memory object.
+
+You can choose how to use the device, but you must pick one of two
+behaviors:
+
+- basically, if you only need the shared memory part, you will map BAR2.
+  This way, you have access to the shared memory in guest and can use it as you
+  see fit (memnic, for example, uses it in userland
+  http://dpdk.org/browse/memnic).
+
+- BAR0 and BAR1 are used to implement an optional communication mechanism
+  through interrupts in the guests. If you need an event mechanism between the
+  guests accessing the shared memory, you will most likely want to write a
+  kernel driver that will handle interrupts. For details, see the 'Guest to
+  guest communication' section.
+
+The behavior is chosen when starting your QEMU processes:
+- no communication mechanism needed: the first QEMU to start creates the shared
+  memory on the host, and subsequent QEMU processes use it.
+
+- communication mechanism needed: an ivshmem server must be started before any
+  QEMU process, then each QEMU process connects to the server's unix socket.
+
+For more details on the QEMU ivshmem parameters, see qemu-doc documentation.
+
+
+Guest to guest communication
+----------------------------
+
+This section details the communication mechanism between the guests accessing
+the ivshmem shared memory.
 
-The device supports three BARs.  BAR0 is a 1 Kbyte MMIO region to support
-registers.  BAR1 is used for MSI-X when it is enabled in the device.  BAR2 is
-used to map the shared memory object from the host.  The size of BAR2 is
-specified when the guest is started and must be a power of 2 in size.
+*ivshmem server*
 
-*Registers*
+This server code is available in qemu.git/contrib/ivshmem-server.
 
-The device currently supports 4 registers of 32-bits each.  Registers
-are used for synchronization between guests sharing the same memory object when
-interrupts are supported (this requires using the shared memory server).
+The server must be started on the host before any guest.
+It creates a shared memory object then waits for clients to connect on a unix
+socket.
 
-The server assigns each VM an ID number and sends this ID number to the QEMU
-process when the guest starts.
+For each client (QEMU process) that connects to the server:
+- the server assigns an ID to this client and sends this ID to it as the first
+  message,
+- the server sends an fd to the shared memory object to this client,
+- the server creates a new set of host eventfds associated to the new client and
+  sends this set to all already connected clients,
+- finally, the server sends all the eventfds sets for all clients to the new
+  client.
+
+The server signals all clients when one of them disconnects.
+
+The client IDs are limited to 16 bits because of the current implementation (see
+Doorbell register in 'PCI device registers' subsection). Hence only 65536
+clients are supported.
+
+All the file descriptors (fd to the shared memory, eventfds for each client)
+are passed to clients using SCM_RIGHTS over the server unix socket.
+
+Apart from the current ivshmem implementation in QEMU, an ivshmem client is
+provided in qemu.git/contrib/ivshmem-client for debugging.
+
+*QEMU as an ivshmem client*
+
+At initialisation, when creating the ivshmem device, QEMU gets its ID from the
+server then makes it available through BAR0 IVPosition register for the VM to
+use (see 'PCI device registers' subsection).
+QEMU then uses the fd to the shared memory to map it to BAR2.
+eventfds for all other clients received from the server are stored to implement
+BAR0 Doorbell register (see 'PCI device registers' subsection).
+Finally, eventfds assigned to this QEMU process are used to send interrupts in
+this VM.
+
+*PCI device registers*
+
+From the VM's point of view, the ivshmem PCI device supports four 32-bit
+registers.
 
 enum ivshmem_registers {
     IntrMask = 0,
@@ -49,8 +122,8 @@ bit to 0 and unmasked by setting the first bit to 1.
 IVPosition Register: The IVPosition register is read-only and reports the
 guest's ID number.  The guest IDs are non-negative integers.  When using the
 server, since the server is a separate process, the VM ID will only be set when
-the device is ready (shared memory is received from the server and accessible via
-the device).  If the device is not ready, the IVPosition will return -1.
+the device is ready (shared memory is received from the server and accessible
+via the device).  If the device is not ready, the IVPosition will return -1.
 Applications should ensure that they have a valid VM ID before accessing the
 shared memory.
 
@@ -59,8 +132,8 @@ Doorbell register.  The doorbell register is 32-bits, logically divided into
 two 16-bit fields.  The high 16-bits are the guest ID to interrupt and the low
 16-bits are the interrupt vector to trigger.  The semantics of the value
 written to the doorbell depends on whether the device is using MSI or a regular
-pin-based interrupt.  In short, MSI uses vectors while regular interrupts set the
-status register.
+pin-based interrupt.  In short, MSI uses vectors while regular interrupts set
+the status register.
 
 Regular Interrupts
 
@@ -71,7 +144,7 @@ interrupt in the destination guest.
 
 Message Signalled Interrupts
 
-A ivshmem device may support multiple MSI vectors.  If so, the lower 16-bits
+An ivshmem device may support multiple MSI vectors.  If so, the lower 16-bits
 written to the Doorbell register must be between 0 and the maximum number of
 vectors the guest supports.  The lower 16 bits written to the doorbell is the
 MSI vector that will be raised in the destination guest.  The number of MSI
@@ -83,14 +156,3 @@ interrupt itself should be communicated via the shared memory region.  Devices
 supporting multiple MSI vectors can use different vectors to indicate different
 events have occurred.  The semantics of interrupt vectors are left to the
 user's discretion.
-
-
-Usage in the Guest
-------------------
-
-The shared memory device is intended to be used with the provided UIO driver.
-Very little configuration is needed.  The guest should map BAR0 to access the
-registers (an array of 32-bit ints allows simple writing) and map BAR2 to
-access the shared memory region itself.  The size of the shared memory region
-is specified when the guest (or shared memory server) is started.  A guest may
-map the whole shared memory region or only part of it.
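
A note on the Doorbell and IVPosition descriptions in the spec text above: the
32-bit doorbell value (high 16 bits = ID of the guest to interrupt, low 16 bits
= interrupt vector) can be packed and unpacked as in the minimal C sketch
below. The helper names are hypothetical and are not part of QEMU or of any
guest driver. Register offsets are deliberately left out, so the sketch only
builds and decodes values. The validity check assumes that the "-1" the spec
mentions for a device that is not ready shows up as 0xFFFFFFFF in a 32-bit
read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the value a guest writes to the Doorbell
 * register. High 16 bits: ID of the guest to interrupt; low 16 bits:
 * interrupt vector to trigger. */
static inline uint32_t ivshmem_doorbell_value(uint16_t peer_id, uint16_t vector)
{
    return ((uint32_t)peer_id << 16) | vector;
}

/* Hypothetical helpers: split a doorbell value back into its two fields. */
static inline uint16_t ivshmem_doorbell_peer(uint32_t value)
{
    return (uint16_t)(value >> 16);
}

static inline uint16_t ivshmem_doorbell_vector(uint32_t value)
{
    return (uint16_t)(value & 0xffffu);
}

/* Assumption: a not-ready device returns -1 from IVPosition, i.e. all
 * bits set when read as an unsigned 32-bit value. */
static inline int ivshmem_position_valid(uint32_t ivposition)
{
    return ivposition != UINT32_MAX;
}

/* Usage example: ring vector 1 of the guest whose ID is 3. */
static inline void ivshmem_doorbell_selfcheck(void)
{
    uint32_t value = ivshmem_doorbell_value(3, 1);

    assert(value == 0x00030001u);
    assert(ivshmem_doorbell_peer(value) == 3);
    assert(ivshmem_doorbell_vector(value) == 1);
}
```

Because both fields are 16 bits wide, the packing itself cannot overflow; a
real driver would still need to check the vector against the number of vectors
the destination guest supports, as the MSI section requires.
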
-- 
2.4.3
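
The spec text in this patch states that all file descriptors (the fd to the
shared memory and the eventfds) are passed to clients using SCM_RIGHTS over the
server unix socket. A minimal sketch of that mechanism, using only standard
POSIX socket calls, is given below. The function names and the one-byte dummy
payload are assumptions made for illustration; this does not reproduce the
actual ivshmem server message format.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Hypothetical helper: send 'fd' over unix socket 'sock' as SCM_RIGHTS
 * ancillary data, with a one-byte placeholder payload.
 * Returns 0 on success, -1 on error. */
static int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one message from 'sock' and return the fd
 * attached to it, or -1 if none was attached or on error. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET &&
            cmsg->cmsg_type == SCM_RIGHTS &&
            cmsg->cmsg_len == CMSG_LEN(sizeof(int))) {
            memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
            break;
        }
    }
    return fd;
}

/* Usage example: pass the read end of a pipe across a socketpair, then
 * check that the received descriptor refers to the same pipe. */
static void fd_passing_selfcheck(void)
{
    int sv[2], pp[2], got, rc;
    char c = 0;

    rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    rc = pipe(pp);
    assert(rc == 0);
    rc = send_fd(sv[0], pp[0]);
    assert(rc == 0);
    got = recv_fd(sv[1]);
    assert(got >= 0);
    rc = (write(pp[1], "x", 1) == 1 && read(got, &c, 1) == 1);
    assert(rc && c == 'x');
    close(got); close(pp[0]); close(pp[1]); close(sv[0]); close(sv[1]);
}
```

The received descriptor is a new number in the receiving process but refers to
the same underlying object, which is what lets each QEMU process map the same
shared memory and signal the same eventfds.
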
