From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: hans@linux.alibaba.com, herongguang@linux.alibaba.com,
	zmlcc@linux.alibaba.com, dust.li@linux.alibaba.com,
	tonylu@linux.alibaba.com, zhenzao@linux.alibaba.com,
	helinguo@linux.alibaba.com, gerry@linux.alibaba.com,
	mst@redhat.com, cohuck@redhat.com, jasowang@redhat.com,
	virtio-dev@lists.oasis-open.org
Subject: Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device
Date: Thu, 24 Nov 2022 10:32:42 +0800
Message-ID: <1669257162.6154113-1-xuanzhuo@linux.alibaba.com>
In-Reply-To: <412f1bdf-724e-8b9d-4f28-213d82654f83@siemens.com>

On Wed, 23 Nov 2022 16:27:00 +0100, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 16.11.22 03:13, Xuan Zhuo wrote:
> > On Mon, 14 Nov 2022 22:30:53 +0100, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >> On 18.10.22 09:32, Jan Kiszka wrote:
> >>> On 17.10.22 09:47, Xuan Zhuo wrote:
> >>>> Hello everyone,
> >>>>
> >>>> # Background
> >>>>
> >>>> Nowadays, a common scenario is to accelerate communication between
> >>>> different VMs and containers, including lightweight virtual-machine-based
> >>>> containers, by colocating them on the same host. However, the performance of
> >>>> inter-VM communication through the network stack is not optimal and may also
> >>>> waste extra CPU cycles. This scenario has been discussed many times, but
> >>>> there is still no generic solution available [1] [2] [3].
> >>>>
> >>>> With a PoC [5] based on pci-ivshmem + SMC (Shared Memory Communications [4]),
> >>>> we found that by changing the communication channel between VMs from TCP to
> >>>> SMC with shared memory, we can achieve superior performance for a common
> >>>> socket-based application [5]:
> >>>>   - latency reduced by about 50%
> >>>>   - throughput increased by about 300%
> >>>>   - CPU consumption reduced by about 50%
> >>>>
> >>>> Since there is no particularly suitable shared memory management solution
> >>>> that matches the needs of SMC (see ## Comparison with existing technology),
> >>>> and virtio is the standard for communication in the virtualization world, we
> >>>> want to implement a virtio-ism device based on virtio, which can support
> >>>> on-demand memory sharing across VMs, containers, or between a VM and a
> >>>> container. To match the needs of SMC, the virtio-ism device needs to support:
> >>>>
> >>>> 1. Dynamic provision: shared memory regions are dynamically allocated and
> >>>>    provisioned.
> >>>> 2. Multi-region management: the shared memory is divided into regions,
> >>>>    and a peer may allocate one or more regions from the same shared memory
> >>>>    device.
> >>>> 3. Permission control: the permissions of each region can be set separately.
> >>>>
> >>>> # Virtio-ism device
> >>>>
> >>>> ISM devices provide the ability to share memory between different guests on
> >>>> a host. Memory that a guest obtains from the ISM device can be shared with
> >>>> multiple peers at the same time, and these sharing relationships can be
> >>>> dynamically created and released.
> >>>>
> >>>> The shared memory obtained from the device is divided into multiple ISM
> >>>> regions for sharing. The ISM device provides a mechanism to notify other
> >>>> referrers of an ISM region about content update events.
> >>>>
> >>>> # Usage (SMC as an example)
> >>>>
> >>>> Here is one possible use case:
> >>>>
> >>>> 1. SMC calls the ISM driver interface ism_alloc_region(), which returns the
> >>>>    location of a memory region in the PCI space and a token.
> >>>> 2. The ISM driver mmaps the memory region and returns it to SMC together
> >>>>    with the token.
> >>>> 3. SMC passes the token to the connected peer.
> >>>> 4. The peer calls the ISM driver interface ism_attach_region(token) to get
> >>>>    the location of the shared memory in its PCI space.
> >>>>
> >>>>
> >>>> # About hot plugging of the ISM device
> >>>>
> >>>>    Hot plugging of devices is a heavyweight, failure-prone, time-consuming,
> >>>>    and less scalable operation, so we don't plan to support it for now.
> >>>>
> >>>> # Comparison with existing technology
> >>>>
> >>>> ## ivshmem or ivshmem 2.0 of Qemu
> >>>>
> >>>>    1. ivshmem 1.0 exposes one large piece of memory that can be seen by all
> >>>>    VMs that use this device, so the security is insufficient.
> >>>>
> >>>>    2. ivshmem 2.0 provides shared memory belonging to one VM that is
> >>>>    read-only for all other VMs using the same ivshmem 2.0 device, which also
> >>>>    does not meet our needs in terms of security.
> >>>
> >>> This is addressed by establishing separate links between VMs (modeled
> >>> with separate devices). That is a trade-off between simplicity of the
> >>> model and convenience, for sure.
> >>
> >> BTW, simplicity can also bring security because it reduces the trusted
> >> code base.
> >>
> >> Another feature of ivshmem-v2 is permitting direct access to essential
> >> resources of the device from /unprivileged/ userspace, including the
> >> event-triggering registers. Is your model designed for that as well? This
> >> not only permits VM-to-VM communication, it actually makes app-to-app
> >> (across VMs) communication very cheap.
> >
> > Yes, there are two actual application scenarios or design goals:
> >
> > * As mentioned above, integrating with SMC inside the Linux kernel to
> >   achieve high-speed communication.
> > * virtio-ism also exposes an interface under /dev, so ordinary users can
> >   directly obtain SHM region resources to share memory with applications
> >   on other VMs.
> >
> > https://github.com/fengidri/linux-kernel-virtio-ism/commit/55a8ed21344e26f574dd81b0213b0d61d80e2ecb
> > https://github.com/fengidri/linux-kernel-virtio-ism/commit/6518739f9a9a36f25d5709da940b7a7938f8e0ee
> >
>
> An example of the missing detach notification in ISM.

Yes, I agree, this is a good point. We should introduce life-cycle management
in the next version, for example notifications for attach, detach, and
permission changes.

I think using a virtqueue to receive these messages may be the more
appropriate way.

The current ISM spec defines a vq that can be used to receive SHM update
notifications. We can add some new event types:

struct event {
	u32 ev_type;	/* update event, attach event or detach event */
	......
};
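
As a rough sketch (the virtio_ism_* names below are hypothetical, not taken
from the current spec), the driver could dispatch such events like this:

enum virtio_ism_ev_type {
	VIRTIO_ISM_EV_UPDATE,	/* region content was updated */
	VIRTIO_ISM_EV_ATTACH,	/* a peer attached to the region */
	VIRTIO_ISM_EV_DETACH,	/* a peer detached from the region */
};

static void virtio_ism_handle_event(struct event *ev)
{
	switch (ev->ev_type) {
	case VIRTIO_ISM_EV_UPDATE:
		/* wake the region user (e.g. SMC) to consume new data */
		break;
	case VIRTIO_ISM_EV_ATTACH:
	case VIRTIO_ISM_EV_DETACH:
		/* life-cycle management: update the peer/reference state */
		break;
	}
}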


>
> And the model of ivshmem-v2 permits syscall-free notification (sending,
> not IRQ-based receiving, obviously). On reception, it avoids one vmexit
> to throttle incoming events if you have a continuously firing sender in
> combination with a non-reacting unprivileged receiver task.

I guess you are talking about the SHM update event. We did not introduce a
similar model in the ISM framework, because we expect the user, such as SMC,
to implement such a mechanism within the shared memory itself.

If the user uses the shared memory from user space, I think it is also
convenient to use a part of the shared memory as a notification area, as
sketched below.
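
For illustration only (none of these names come from the spec), the first
word of a region could serve as a sequence counter that the writer bumps
and readers poll, with no device involvement at all:

#include <stdatomic.h>
#include <stdint.h>

/* Reserve the start of the mmap'ed region as a notification word. */
struct ism_notify_area {
	_Atomic uint32_t seq;	/* bumped by the writer after each update */
};

static void notify_update(struct ism_notify_area *na)
{
	atomic_fetch_add_explicit(&na->seq, 1, memory_order_release);
}

static int check_update(struct ism_notify_area *na, uint32_t *last_seq)
{
	uint32_t seq = atomic_load_explicit(&na->seq, memory_order_acquire);

	if (seq == *last_seq)
		return 0;	/* nothing new */
	*last_seq = seq;
	return 1;		/* region content changed */
}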

>
> Would be great to combine the best of all worlds here. But specifically
> the missing life-cycle management on the detach side makes ISM no better
> than legacy ivshmem IMHO.

I will add such a mechanism in the next version.

Thanks.


>
> Jan
>
> --
> Siemens AG, Technology
> Competence Center Embedded Linux
>

