* Inter-VM device emulation (call on Mon 20th July 2020) [not found] <86d42090-f042-06a1-efba-d46d449df280@arrikto.com> @ 2020-07-15 11:23 ` Stefan Hajnoczi 2020-07-15 11:28 ` Jan Kiszka ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Stefan Hajnoczi @ 2020-07-15 11:23 UTC (permalink / raw) To: Nikos Dragazis, Jan Kiszka Cc: Michael S. Tsirkin, Thanos Makatos, John G. Johnson, Andra-Irina Paraschiv, Alexander Graf, qemu-devel, kvm, Maxime Coquelin, Alex Bennée [-- Attachment #1: Type: text/plain, Size: 1758 bytes --] Hi, Several projects are underway to create an inter-VM device emulation interface: * ivshmem v2 https://www.mail-archive.com/qemu-devel@nongnu.org/msg706465.html A PCI device that provides shared-memory communication between VMs. This device already exists but is limited in its current form. The "v2" project updates IVSHMEM's capabilities and makes it suitable as a VIRTIO transport. Jan Kiszka is working on this and has posted specs for review. * virtio-vhost-user https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg06429.html A VIRTIO device that transports the vhost-user protocol. Allows vhost-user device emulation to be implemented by another VM. Nikos Dragazis is working on this with QEMU, DPDK, and VIRTIO patches posted. * VFIO-over-socket https://github.com/tmakatos/qemu/blob/master/docs/devel/vfio-over-socket.rst Similar to the vhost-user protocol in spirit but for any PCI device. Uses the Linux VFIO ioctl API as the protocol instead of vhost. It doesn't have a virtio-vhost-user equivalent yet, but the same approach could be applied to VFIO-over-socket too. Thanos Makatos and John G. Johnson are working on this. The draft spec is available. Let's have a call to figure out: 1. What is unique about these approaches and how do they overlap? 2. Can we focus development and code review efforts to get something merged sooner? Jan and Nikos: do you have time to join on Monday, 20th of July at 15:00 UTC? 
https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200720T1500 Video call URL: https://bluejeans.com/240406010 It would be nice if Thanos and/or JJ could join the call too. Others welcome too (feel free to forward this email)! Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-15 11:23 ` Inter-VM device emulation (call on Mon 20th July 2020) Stefan Hajnoczi @ 2020-07-15 11:28 ` Jan Kiszka 2020-07-15 15:38 ` Stefan Hajnoczi 2020-07-15 16:20 ` Thanos Makatos 2020-07-20 17:11 ` Stefan Hajnoczi 2 siblings, 1 reply; 15+ messages in thread From: Jan Kiszka @ 2020-07-15 11:28 UTC (permalink / raw) To: Stefan Hajnoczi, Nikos Dragazis Cc: Michael S. Tsirkin, Thanos Makatos, John G. Johnson, Andra-Irina Paraschiv, Alexander Graf, qemu-devel, kvm, Maxime Coquelin, Alex Bennée [-- Attachment #1: Type: text/plain, Size: 2152 bytes --] On 15.07.20 13:23, Stefan Hajnoczi wrote: > Hi, > Several projects are underway to create an inter-VM device emulation > interface: > > * ivshmem v2 > https://www.mail-archive.com/qemu-devel@nongnu.org/msg706465.html > > A PCI device that provides shared-memory communication between VMs. > This device already exists but is limited in its current form. The > "v2" project updates IVSHMEM's capabilities and makes it suitable as > a VIRTIO transport. > > Jan Kiszka is working on this and has posted specs for review. > > * virtio-vhost-user > https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg06429.html > > A VIRTIO device that transports the vhost-user protocol. Allows > vhost-user device emulation to be implemented by another VM. > > Nikos Dragazis is working on this with QEMU, DPDK, and VIRTIO patches > posted. > > * VFIO-over-socket > https://github.com/tmakatos/qemu/blob/master/docs/devel/vfio-over-socket.rst > > Similar to the vhost-user protocol in spirit but for any PCI device. > Uses the Linux VFIO ioctl API as the protocol instead of vhost. > > It doesn't have a virtio-vhost-user equivalent yet, but the same > approach could be applied to VFIO-over-socket too. > > Thanos Makatos and John G. Johnson are working on this. The draft > spec is available. > > Let's have a call to figure out: > > 1. 
What is unique about these approaches and how do they overlap? > 2. Can we focus development and code review efforts to get something > merged sooner? > > Jan and Nikos: do you have time to join on Monday, 20th of July at 15:00 > UTC? > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200720T1500 > Not at that slot, but one hour earlier or later would work for me (so far). Jan > Video call URL: https://bluejeans.com/240406010 > > It would be nice if Thanos and/or JJ could join the call too. Others > welcome too (feel free to forward this email)! > > Stefan > -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 8492 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-15 11:28 ` Jan Kiszka @ 2020-07-15 15:38 ` Stefan Hajnoczi 2020-07-15 16:44 ` Alex Bennée 0 siblings, 1 reply; 15+ messages in thread From: Stefan Hajnoczi @ 2020-07-15 15:38 UTC (permalink / raw) To: Nikos Dragazis Cc: Michael S. Tsirkin, Thanos Makatos, John G. Johnson, Andra-Irina Paraschiv, Alexander Graf, qemu-devel, kvm, Maxime Coquelin, Alex Bennée, Jan Kiszka On Wed, Jul 15, 2020 at 01:28:07PM +0200, Jan Kiszka wrote: > On 15.07.20 13:23, Stefan Hajnoczi wrote: > > Let's have a call to figure out: > > > > 1. What is unique about these approaches and how do they overlap? > > 2. Can we focus development and code review efforts to get something > > merged sooner? > > > > Jan and Nikos: do you have time to join on Monday, 20th of July at 15:00 > > UTC? > > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200720T1500 > > > > Not at that slot, but one hour earlier or later would work for me (so far). Nikos: Please let us know which of Jan's timeslots works best for you. Thanks, Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-15 15:38 ` Stefan Hajnoczi @ 2020-07-15 16:44 ` Alex Bennée 2020-07-17 8:58 ` Nikos Dragazis 0 siblings, 1 reply; 15+ messages in thread From: Alex Bennée @ 2020-07-15 16:44 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Nikos Dragazis, Michael S. Tsirkin, Thanos Makatos, John G. Johnson, Andra-Irina Paraschiv, Alexander Graf, qemu-devel, kvm, Maxime Coquelin, Jan Kiszka Stefan Hajnoczi <stefanha@redhat.com> writes: > On Wed, Jul 15, 2020 at 01:28:07PM +0200, Jan Kiszka wrote: >> On 15.07.20 13:23, Stefan Hajnoczi wrote: >> > Let's have a call to figure out: >> > >> > 1. What is unique about these approaches and how do they overlap? >> > 2. Can we focus development and code review efforts to get something >> > merged sooner? >> > >> > Jan and Nikos: do you have time to join on Monday, 20th of July at 15:00 >> > UTC? >> > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200720T1500 >> > >> >> Not at that slot, but one hour earlier or later would work for me (so far). > > Nikos: Please let us know which of Jan's timeslots works best for you. I'm in - the earlier slot would be preferential for me to avoid clashing with family time. -- Alex Bennée ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-15 16:44 ` Alex Bennée @ 2020-07-17 8:58 ` Nikos Dragazis 2020-07-17 17:10 ` Stefan Hajnoczi 0 siblings, 1 reply; 15+ messages in thread From: Nikos Dragazis @ 2020-07-17 8:58 UTC (permalink / raw) To: Alex Bennée, Stefan Hajnoczi Cc: John G. Johnson, Andra-Irina Paraschiv, kvm, Michael S. Tsirkin, Jan Kiszka, qemu-devel, Maxime Coquelin, Alexander Graf, Thanos Makatos On 15/7/20 7:44 μ.μ., Alex Bennée wrote: > Stefan Hajnoczi <stefanha@redhat.com> writes: > >> On Wed, Jul 15, 2020 at 01:28:07PM +0200, Jan Kiszka wrote: >>> On 15.07.20 13:23, Stefan Hajnoczi wrote: >>>> Let's have a call to figure out: >>>> >>>> 1. What is unique about these approaches and how do they overlap? >>>> 2. Can we focus development and code review efforts to get something >>>> merged sooner? >>>> >>>> Jan and Nikos: do you have time to join on Monday, 20th of July at 15:00 >>>> UTC? >>>> https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200720T1500 >>>> >>> Not at that slot, but one hour earlier or later would work for me (so far). >> Nikos: Please let us know which of Jan's timeslots works best for you. > I'm in - the earlier slot would be preferential for me to avoid clashing with > family time. > I'm OK with all timeslots. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-17 8:58 ` Nikos Dragazis @ 2020-07-17 17:10 ` Stefan Hajnoczi 0 siblings, 0 replies; 15+ messages in thread From: Stefan Hajnoczi @ 2020-07-17 17:10 UTC (permalink / raw) To: Nikos Dragazis Cc: Alex Bennée, John G. Johnson, Andra-Irina Paraschiv, kvm, Michael S. Tsirkin, Jan Kiszka, qemu-devel, Maxime Coquelin, Alexander Graf, Thanos Makatos [-- Attachment #1: Type: text/plain, Size: 1305 bytes --] On Fri, Jul 17, 2020 at 11:58:40AM +0300, Nikos Dragazis wrote: > On 15/7/20 7:44 μ.μ., Alex Bennée wrote: > > > Stefan Hajnoczi <stefanha@redhat.com> writes: > > > > > On Wed, Jul 15, 2020 at 01:28:07PM +0200, Jan Kiszka wrote: > > > > On 15.07.20 13:23, Stefan Hajnoczi wrote: > > > > > Let's have a call to figure out: > > > > > > > > > > 1. What is unique about these approaches and how do they overlap? > > > > > 2. Can we focus development and code review efforts to get something > > > > > merged sooner? > > > > > > > > > > Jan and Nikos: do you have time to join on Monday, 20th of July at 15:00 > > > > > UTC? > > > > > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200720T1500 > > > > > > > > > Not at that slot, but one hour earlier or later would work for me (so far). > > > Nikos: Please let us know which of Jan's timeslots works best for you. > > I'm in - the earlier slot would be preferential for me to avoid clashing with > > family time. > > > > I'm OK with all timeslots. Great, let's do 16:00 UTC. I have a meeting at 14:00 UTC so I can't make the earlier slot and it sounds like Andra-Irina and Alexander Graf do too. Sorry, Alex (Bennée), not optimal but it's hard to find a slot that is perfect for everyone. Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-15 11:23 ` Inter-VM device emulation (call on Mon 20th July 2020) Stefan Hajnoczi 2020-07-15 11:28 ` Jan Kiszka @ 2020-07-15 16:20 ` Thanos Makatos 2020-07-20 17:11 ` Stefan Hajnoczi 2 siblings, 0 replies; 15+ messages in thread From: Thanos Makatos @ 2020-07-15 16:20 UTC (permalink / raw) To: Stefan Hajnoczi, Nikos Dragazis, Jan Kiszka Cc: Michael S. Tsirkin, John G. Johnson, Andra-Irina Paraschiv, Alexander Graf, qemu-devel, kvm, Maxime Coquelin, Alex Bennée, Felipe Franciosi, Swapnil Ingle > -----Original Message----- > From: kvm-owner@vger.kernel.org <kvm-owner@vger.kernel.org> On > Behalf Of Stefan Hajnoczi > Sent: 15 July 2020 12:24 > To: Nikos Dragazis <ndragazis@arrikto.com>; Jan Kiszka > <jan.kiszka@siemens.com> > Cc: Michael S. Tsirkin <mst@redhat.com>; Thanos Makatos > <thanos.makatos@nutanix.com>; John G. Johnson > <john.g.johnson@oracle.com>; Andra-Irina Paraschiv > <andraprs@amazon.com>; Alexander Graf <graf@amazon.com>; qemu- > devel@nongnu.org; kvm@vger.kernel.org; Maxime Coquelin > <maxime.coquelin@redhat.com>; Alex Bennée <alex.bennee@linaro.org> > Subject: Inter-VM device emulation (call on Mon 20th July 2020) > > Hi, > Several projects are underway to create an inter-VM device emulation > interface: > > * ivshmem v2 > https://www.mail-archive.com/qemu-devel@nongnu.org/msg706465.html > > A PCI device that provides shared-memory communication between VMs. > This device already exists but is limited in its current form. The > "v2" project updates IVSHMEM's capabilities and makes it suitable as > a VIRTIO transport. > > Jan Kiszka is working on this and has posted specs for review. > > * virtio-vhost-user > https://www.mail-archive.com/virtio-dev@lists.oasis- > open.org/msg06429.html > > A VIRTIO device that transports the vhost-user protocol. Allows > vhost-user device emulation to be implemented by another VM. 
> > Nikos Dragazis is working on this with QEMU, DPDK, and VIRTIO patches > posted. > > * VFIO-over-socket > https://github.com/tmakatos/qemu/blob/master/docs/devel/vfio-over- > socket.rst > > Similar to the vhost-user protocol in spirit but for any PCI device. > Uses the Linux VFIO ioctl API as the protocol instead of vhost. > > It doesn't have a virtio-vhost-user equivalent yet, but the same > approach could be applied to VFIO-over-socket too. > > Thanos Makatos and John G. Johnson are working on this. The draft > spec is available. > > Let's have a call to figure out: > > 1. What is unique about these approaches and how do they overlap? > 2. Can we focus development and code review efforts to get something > merged sooner? > > Jan and Nikos: do you have time to join on Monday, 20th of July at 15:00 > UTC? > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200720T1 > 500 > > Video call URL: https://bluejeans.com/240406010 > > It would be nice if Thanos and/or JJ could join the call too. Others > welcome too (feel free to forward this email)! Sure! > > Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-15 11:23 ` Inter-VM device emulation (call on Mon 20th July 2020) Stefan Hajnoczi 2020-07-15 11:28 ` Jan Kiszka 2020-07-15 16:20 ` Thanos Makatos @ 2020-07-20 17:11 ` Stefan Hajnoczi 2020-07-21 10:49 ` Alex Bennée 2 siblings, 1 reply; 15+ messages in thread From: Stefan Hajnoczi @ 2020-07-20 17:11 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Nikos Dragazis, Jan Kiszka, John G. Johnson, Andra-Irina Paraschiv, kvm, Michael S. Tsirkin, qemu-devel, Maxime Coquelin, Alexander Graf, Thanos Makatos, Alex Bennée, Jag Raman, Philippe Mathieu-Daudé Thank you everyone who joined! I didn't take notes but two things stood out: 1. The ivshmem v2 and virtio-vhost-user use cases are quite different so combining them does not seem realistic. ivshmem v2 needs to be as simple for the hypervisor to implement as possible even if this involves some sacrifices (e.g. not transparent to the Driver VM that is accessing the device, performance). virtio-vhost-user is more aimed at general-purpose device emulation although support for arbitrary devices (e.g. PCI) would be important to serve all use cases. 2. Alexander Graf's idea for a new Linux driver that provides an enforcing software IOMMU. This would be a character device driver that is mmapped by the device emulation process (either vhost-user-style on the host or another VMM for inter-VM device emulation). The Driver VMM can program mappings into the device and the page tables in the device emulation process will be updated. This way the Driver VMM can share specific regions of guest RAM with the device emulation process and revoke those mappings later. Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-20 17:11 ` Stefan Hajnoczi @ 2020-07-21 10:49 ` Alex Bennée 2020-07-21 19:08 ` Jan Kiszka 2020-07-27 10:14 ` Stefan Hajnoczi 0 siblings, 2 replies; 15+ messages in thread From: Alex Bennée @ 2020-07-21 10:49 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Stefan Hajnoczi, Nikos Dragazis, Jan Kiszka, John G. Johnson, Andra-Irina Paraschiv, kvm, Michael S. Tsirkin, qemu-devel, Maxime Coquelin, Alexander Graf, Thanos Makatos, Jag Raman, Philippe Mathieu-Daudé Stefan Hajnoczi <stefanha@gmail.com> writes: > Thank you everyone who joined! > > I didn't take notes but two things stood out: > > 1. The ivshmem v2 and virtio-vhost-user use cases are quite different > so combining them does not seem realistic. ivshmem v2 needs to be as > simple for the hypervisor to implement as possible even if this > involves some sacrifices (e.g. not transparent to the Driver VM that > is accessing the device, performance). virtio-vhost-user is more aimed > at general-purpose device emulation although support for arbitrary > devices (e.g. PCI) would be important to serve all use cases. I believe my phone gave up on the last few minutes of the call so I'll just say we are interested in being able to implement arbitrary devices in the inter-VM silos. Devices we are looking at:

virtio-audio
virtio-video

these are performance-sensitive devices which provide a HAL abstraction to a common software core.

virtio-rpmb

this is a secure device where the backend may need to reside in a secure virtualised world.

virtio-scmi

this is a more complex device which allows the guest to make power and clock demands from the firmware. Needless to say this starts to become complex with multiple moving parts.

The flexibility of vhost-user seems to match up quite well with wanting to have a reasonably portable backend that just needs to be fed signals and a memory mapping.
However we don't want daemons to automatically have a full view of the whole of the guest's system memory. > 2. Alexander Graf's idea for a new Linux driver that provides an > enforcing software IOMMU. This would be a character device driver that > is mmapped by the device emulation process (either vhost-user-style on > the host or another VMM for inter-VM device emulation). The Driver VMM > can program mappings into the device and the page tables in the device > emulation process will be updated. This way the Driver VMM can share > specific regions of guest RAM with the device emulation process > and revoke those mappings later. I'm wondering if there is enough plumbing on the guest side so a guest can use the virtio-iommu to mark out exactly which bits of memory the virtual device can have access to? At a minimum the virtqueues need to be accessible and for larger transfers maybe a bounce buffer. However for speed you want as wide a mapping as possible, but no more. It would be nice for example if a block device could load data directly into the guest's block cache (zero-copy) but without getting a view of the kernel's internal data structures. Another thing that came across in the call was quite a lot of assumptions about QEMU and Linux w.r.t. virtio. While our project will likely have Linux as a guest OS, we are looking specifically at enabling virtio for Type-1 hypervisors like Xen and the various safety-certified proprietary ones. It is unlikely that QEMU would be used as the VMM for these deployments. We want to work out what sort of common facilities hypervisors need to support to enable virtio so the daemons can be re-usable and maybe set up with a minimal shim for the particular hypervisor in question. -- Alex Bennée ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-21 10:49 ` Alex Bennée @ 2020-07-21 19:08 ` Jan Kiszka 2020-07-27 10:14 ` Stefan Hajnoczi 1 sibling, 0 replies; 15+ messages in thread From: Jan Kiszka @ 2020-07-21 19:08 UTC (permalink / raw) To: Alex Bennée, Stefan Hajnoczi Cc: Stefan Hajnoczi, Nikos Dragazis, John G. Johnson, Andra-Irina Paraschiv, kvm, Michael S. Tsirkin, qemu-devel, Maxime Coquelin, Alexander Graf, Thanos Makatos, Jag Raman, Philippe Mathieu-Daudé On 21.07.20 12:49, Alex Bennée wrote: > > Stefan Hajnoczi <stefanha@gmail.com> writes: > >> Thank you everyone who joined! >> >> I didn't take notes but two things stood out: >> >> 1. The ivshmem v2 and virtio-vhost-user use cases are quite different >> so combining them does not seem realistic. ivshmem v2 needs to be as >> simple for the hypervisor to implement as possible even if this >> involves some sacrifices (e.g. not transparent to the Driver VM that >> is accessing the device, performance). virtio-vhost-user is more aimed >> at general-purpose device emulation although support for arbitrary >> devices (e.g. PCI) would be important to serve all use cases. > > I believe my phone gave up on the last few minutes of the call so I'll > just say we are interested in being able to implement arbitrary devices > in the inter-VM silos. Devices we are looking at: > > virtio-audio > virtio-video > > these are performance sensitive devices which provide a HAL abstraction > to a common software core. > > virtio-rpmb > > this is a secure device where the backend may need to reside in a secure > virtualised world. > > virtio-scmi > > this is a more complex device which allows the guest to make power and > clock demands from the firmware. Needless to say this starts to become > complex with multiple moving parts. > > The flexibility of vhost-user seems to match up quite well with wanting > to have a reasonably portable backend that just needs to be fed signals > and a memory mapping. 
However we don't want daemons to automatically > have a full view of the whole of the guests system memory. > >> 2. Alexander Graf's idea for a new Linux driver that provides an >> enforcing software IOMMU. This would be a character device driver that >> is mmapped by the device emulation process (either vhost-user-style on >> the host or another VMM for inter-VM device emulation). The Driver VMM >> can program mappings into the device and the page tables in the device >> emulation process will be updated. This way the Driver VMM can share >> memory specific regions of guest RAM with the device emulation process >> and revoke those mappings later. > > I'm wondering if there is enough plumbing on the guest side so a guest > can use the virtio-iommu to mark out exactly which bits of memory the > virtual device can have access to? At a minimum the virtqueues need to > be accessible and for larger transfers maybe a bounce buffer. However > for speed you want as wide as possible mapping but no more. It would be > nice for example if a block device could load data directly into the > guests block cache (zero-copy) but without getting a view of the kernels > internal data structures. Welcome to a classic optimization triangle: - speed -> direct mappings - security -> restricted mapping - simplicity -> static mapping Pick two, you can't have them all. Well, you could try a little bit more of one, at the price of losing on another. But that's it. We chose the last two, ending up with probably the simplest but not fastest solution for type-1 hypervisors like Jailhouse. Specifically for non-Linux use cases, legacy RTOSes, often with limited driver stacks, having not only virtio but also even simpler channels over application-defined shared memory layouts is a requirement. > > Another thing that came across in the call was quite a lot of > assumptions about QEMU and Linux w.r.t virtio. 
While our project will > likely have Linux as a guest OS we are looking specifically at enabling > virtio for Type-1 hypervisors like Xen and the various safety certified > proprietary ones. It is unlikely that QEMU would be used as the VMM for > these deployments. We want to work out what sort of common facilities > hypervisors need to support to enable virtio so the daemons can be > re-usable and maybe setup with a minimal shim for the particular > hypervisor in question. > I'm with you regarding stacks that are usable not only on QEMU/Linux, and also ones that do not let the certification costs sky-rocket because of their mandated implementation complexity. I'm no longer sure there will be only one device model. Maybe we should eventually think about a backend layer that can sit on something like virtio-vhost-user as well as on ivshmem-virtio, allowing the same device backend code to be plumbed into both transports. Why shouldn't the split that already works well under Linux, the same frontend device drivers running over different virtio transports, work for the backends as well? Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-21 10:49 ` Alex Bennée 2020-07-21 19:08 ` Jan Kiszka @ 2020-07-27 10:14 ` Stefan Hajnoczi 2020-07-27 10:30 ` Alex Bennée 2020-07-27 11:52 ` Jean-Philippe Brucker 1 sibling, 2 replies; 15+ messages in thread From: Stefan Hajnoczi @ 2020-07-27 10:14 UTC (permalink / raw) To: Alex Bennée Cc: Stefan Hajnoczi, Nikos Dragazis, Jan Kiszka, John G. Johnson, Andra-Irina Paraschiv, kvm, Michael S. Tsirkin, qemu-devel, Maxime Coquelin, Alexander Graf, Thanos Makatos, Jag Raman, Philippe Mathieu-Daudé, Jean-Philippe Brucker, Eric Auger [-- Attachment #1: Type: text/plain, Size: 2348 bytes --] On Tue, Jul 21, 2020 at 11:49:04AM +0100, Alex Bennée wrote: > Stefan Hajnoczi <stefanha@gmail.com> writes: > > 2. Alexander Graf's idea for a new Linux driver that provides an > > enforcing software IOMMU. This would be a character device driver that > > is mmapped by the device emulation process (either vhost-user-style on > > the host or another VMM for inter-VM device emulation). The Driver VMM > > can program mappings into the device and the page tables in the device > > emulation process will be updated. This way the Driver VMM can share > > memory specific regions of guest RAM with the device emulation process > > and revoke those mappings later. > > I'm wondering if there is enough plumbing on the guest side so a guest > can use the virtio-iommu to mark out exactly which bits of memory the > virtual device can have access to? At a minimum the virtqueues need to > be accessible and for larger transfers maybe a bounce buffer. However > for speed you want as wide as possible mapping but no more. It would be > nice for example if a block device could load data directly into the > guests block cache (zero-copy) but without getting a view of the kernels > internal data structures. Maybe Jean-Philippe or Eric can answer that? 
> Another thing that came across in the call was quite a lot of > assumptions about QEMU and Linux w.r.t virtio. While our project will > likely have Linux as a guest OS we are looking specifically at enabling > virtio for Type-1 hypervisors like Xen and the various safety certified > proprietary ones. It is unlikely that QEMU would be used as the VMM for > these deployments. We want to work out what sort of common facilities > hypervisors need to support to enable virtio so the daemons can be > re-usable and maybe setup with a minimal shim for the particular > hypervisor in question. The vhost-user protocol together with the backend program conventions define the wire protocol and command-line interface (see docs/interop/vhost-user.rst). vhost-user is already used by other VMMs today. For example, cloud-hypervisor implements vhost-user. I'm sure there is room for improvement, but it seems like an incremental step given that vhost-user already tries to cater for this scenario. Are there any specific gaps you have identified? Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-27 10:14 ` Stefan Hajnoczi @ 2020-07-27 10:30 ` Alex Bennée 2020-07-27 11:37 ` Michael S. Tsirkin 2020-07-27 11:52 ` Jean-Philippe Brucker 1 sibling, 1 reply; 15+ messages in thread From: Alex Bennée @ 2020-07-27 10:30 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Stefan Hajnoczi, Nikos Dragazis, Jan Kiszka, John G. Johnson, Andra-Irina Paraschiv, kvm, Michael S. Tsirkin, qemu-devel, Maxime Coquelin, Alexander Graf, Thanos Makatos, Jag Raman, Philippe Mathieu-Daudé, Jean-Philippe Brucker, Eric Auger Stefan Hajnoczi <stefanha@redhat.com> writes: > On Tue, Jul 21, 2020 at 11:49:04AM +0100, Alex Bennée wrote: >> Stefan Hajnoczi <stefanha@gmail.com> writes: >> > 2. Alexander Graf's idea for a new Linux driver that provides an >> > enforcing software IOMMU. This would be a character device driver that >> > is mmapped by the device emulation process (either vhost-user-style on >> > the host or another VMM for inter-VM device emulation). The Driver VMM >> > can program mappings into the device and the page tables in the device >> > emulation process will be updated. This way the Driver VMM can share >> > memory specific regions of guest RAM with the device emulation process >> > and revoke those mappings later. >> >> I'm wondering if there is enough plumbing on the guest side so a guest >> can use the virtio-iommu to mark out exactly which bits of memory the >> virtual device can have access to? At a minimum the virtqueues need to >> be accessible and for larger transfers maybe a bounce buffer. However >> for speed you want as wide as possible mapping but no more. It would be >> nice for example if a block device could load data directly into the >> guests block cache (zero-copy) but without getting a view of the kernels >> internal data structures. > > Maybe Jean-Philippe or Eric can answer that? 
> >> Another thing that came across in the call was quite a lot of >> assumptions about QEMU and Linux w.r.t virtio. While our project will >> likely have Linux as a guest OS we are looking specifically at enabling >> virtio for Type-1 hypervisors like Xen and the various safety certified >> proprietary ones. It is unlikely that QEMU would be used as the VMM for >> these deployments. We want to work out what sort of common facilities >> hypervisors need to support to enable virtio so the daemons can be >> re-usable and maybe setup with a minimal shim for the particular >> hypervisor in question. > > The vhost-user protocol together with the backend program conventions > define the wire protocol and command-line interface (see > docs/interop/vhost-user.rst). > > vhost-user is already used by other VMMs today. For example, > cloud-hypervisor implements vhost-user. Ohh that's a new one for me. I see it is a KVM-only project but it's nice to see another VMM using the common rust-vmm backend. There is interest in using rust-vmm to implement VMMs for Type-1 hypervisors but we need to work out if there are too many Type-2 concepts baked into the lower-level rust crates. > I'm sure there is room for improvement, but it seems like an incremental > step given that vhost-user already tries to cater for this scenario. > > Are there any specific gaps you have identified? Aside from the desire to limit the shared memory footprint between the backend daemon and a guest, not yet. I suspect the eventfd mechanism might just end up being simulated by the VMM as a result of whatever comes from the Type-1 interface indicating a doorbell has been rung. It is after all just a FD you consume numbers over, right? Not all setups will have an equivalent of a Dom0 "master" guest to do orchestration. Highly embedded setups are likely to have fixed domains created as the firmware/hypervisor starts up. > > Stefan -- Alex Bennée ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020) 2020-07-27 10:30 ` Alex Bennée @ 2020-07-27 11:37 ` Michael S. Tsirkin 2020-07-27 12:22 ` Alex Bennée 0 siblings, 1 reply; 15+ messages in thread From: Michael S. Tsirkin @ 2020-07-27 11:37 UTC (permalink / raw) To: Alex Bennée Cc: Stefan Hajnoczi, Stefan Hajnoczi, Nikos Dragazis, Jan Kiszka, John G. Johnson, Andra-Irina Paraschiv, kvm, qemu-devel, Maxime Coquelin, Alexander Graf, Thanos Makatos, Jag Raman, Philippe Mathieu-Daudé, Jean-Philippe Brucker, Eric Auger On Mon, Jul 27, 2020 at 11:30:24AM +0100, Alex Bennée wrote: > > Stefan Hajnoczi <stefanha@redhat.com> writes: > > > On Tue, Jul 21, 2020 at 11:49:04AM +0100, Alex Bennée wrote: > >> Stefan Hajnoczi <stefanha@gmail.com> writes: > >> > 2. Alexander Graf's idea for a new Linux driver that provides an > >> > enforcing software IOMMU. This would be a character device driver that > >> > is mmapped by the device emulation process (either vhost-user-style on > >> > the host or another VMM for inter-VM device emulation). The Driver VMM > >> > can program mappings into the device and the page tables in the device > >> > emulation process will be updated. This way the Driver VMM can share > >> > memory specific regions of guest RAM with the device emulation process > >> > and revoke those mappings later. > >> > >> I'm wondering if there is enough plumbing on the guest side so a guest > >> can use the virtio-iommu to mark out exactly which bits of memory the > >> virtual device can have access to? At a minimum the virtqueues need to > >> be accessible and for larger transfers maybe a bounce buffer. However > >> for speed you want as wide as possible mapping but no more. It would be > >> nice for example if a block device could load data directly into the > >> guests block cache (zero-copy) but without getting a view of the kernels > >> internal data structures. > > > > Maybe Jean-Philippe or Eric can answer that? 
> >
> >> Another thing that came across in the call was quite a lot of
> >> assumptions about QEMU and Linux w.r.t. virtio. While our project will
> >> likely have Linux as a guest OS we are looking specifically at enabling
> >> virtio for Type-1 hypervisors like Xen and the various safety-certified
> >> proprietary ones. It is unlikely that QEMU would be used as the VMM for
> >> these deployments. We want to work out what sort of common facilities
> >> hypervisors need to support to enable virtio so the daemons can be
> >> re-usable and maybe set up with a minimal shim for the particular
> >> hypervisor in question.
> >
> > The vhost-user protocol together with the backend program conventions
> > defines the wire protocol and command-line interface (see
> > docs/interop/vhost-user.rst).
> >
> > vhost-user is already used by other VMMs today. For example,
> > cloud-hypervisor implements vhost-user.
>
> Ohh, that's a new one for me. I see it is a KVM-only project but it's
> nice to see another VMM using the common rust-vmm backend. There is
> interest in using rust-vmm to implement VMMs for type-1 hypervisors but
> we need to work out if there are too many type-2 concepts baked into
> the lower-level rust crates.
>
> > I'm sure there is room for improvement, but it seems like an incremental
> > step given that vhost-user already tries to cater for this scenario.
> >
> > Are there any specific gaps you have identified?
>
> Aside from the desire to limit the shared memory footprint between the
> backend daemon and a guest, not yet.

So it's certainly nice for security but not really a requirement for a
type-1 HV, right?

> I suspect the eventfd mechanism might just end up being simulated by the
> VMM as a result of whatever comes from the type-1 interface indicating a
> doorbell has been rung. It is, after all, just an FD you consume numbers
> over, right?

Does not even have to be numbers.
We need a way to be woken up, a way
to stop/start listening for wakeups, and a way to detect that there was
a wakeup while we were not listening.

Though there are special tricks for offloads where we poke through
layers in order to map things directly to hardware.

> Not all setups will have an equivalent of a Dom0 "master" guest to do
> orchestration. Highly embedded systems are likely to have fixed domains
> created as the firmware/hypervisor starts up.
>
> >
> > Stefan
>
>
> --
> Alex Bennée

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020)
  2020-07-27 11:37 ` Michael S. Tsirkin
@ 2020-07-27 12:22 ` Alex Bennée
  0 siblings, 0 replies; 15+ messages in thread
From: Alex Bennée @ 2020-07-27 12:22 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Stefan Hajnoczi, Stefan Hajnoczi, Nikos Dragazis, Jan Kiszka,
	John G. Johnson, Andra-Irina Paraschiv, kvm, qemu-devel,
	Maxime Coquelin, Alexander Graf, Thanos Makatos, Jag Raman,
	Philippe Mathieu-Daudé, Jean-Philippe Brucker, Eric Auger

Michael S. Tsirkin <mst@redhat.com> writes:

> On Mon, Jul 27, 2020 at 11:30:24AM +0100, Alex Bennée wrote:
>>
>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>>
>> > On Tue, Jul 21, 2020 at 11:49:04AM +0100, Alex Bennée wrote:
>> >> Stefan Hajnoczi <stefanha@gmail.com> writes:

<snip>

>> >> Another thing that came across in the call was quite a lot of
>> >> assumptions about QEMU and Linux w.r.t. virtio. While our project will
>> >> likely have Linux as a guest OS we are looking specifically at enabling
>> >> virtio for Type-1 hypervisors like Xen and the various safety-certified
>> >> proprietary ones. It is unlikely that QEMU would be used as the VMM for
>> >> these deployments. We want to work out what sort of common facilities
>> >> hypervisors need to support to enable virtio so the daemons can be
>> >> re-usable and maybe set up with a minimal shim for the particular
>> >> hypervisor in question.
>> >
>> > The vhost-user protocol together with the backend program conventions
>> > defines the wire protocol and command-line interface (see
>> > docs/interop/vhost-user.rst).
>> >
>> > vhost-user is already used by other VMMs today. For example,
>> > cloud-hypervisor implements vhost-user.
>>
>> Ohh, that's a new one for me. I see it is a KVM-only project but it's
>> nice to see another VMM using the common rust-vmm backend.
>> There is
>> interest in using rust-vmm to implement VMMs for type-1 hypervisors but
>> we need to work out if there are too many type-2 concepts baked into
>> the lower-level rust crates.
>>
>> > I'm sure there is room for improvement, but it seems like an incremental
>> > step given that vhost-user already tries to cater for this scenario.
>> >
>> > Are there any specific gaps you have identified?
>>
>> Aside from the desire to limit the shared memory footprint between the
>> backend daemon and a guest, not yet.
>
> So it's certainly nice for security but not really a requirement for a
> type-1 HV, right?

Not a requirement per se, but type-1 setups don't assume a "one
userspace to rule them all" approach.

>> I suspect the eventfd mechanism might just end up being simulated by the
>> VMM as a result of whatever comes from the type-1 interface indicating a
>> doorbell has been rung. It is, after all, just an FD you consume numbers
>> over, right?
>
> Does not even have to be numbers. We need a way to be woken up, a way to
> stop/start listening for wakeups, and a way to detect that there was a
> wakeup while we were not listening.
>
> Though there are special tricks for offloads where we poke through
> layers in order to map things directly to hardware.
>
>> Not all setups will have an equivalent of a Dom0 "master" guest to do
>> orchestration. Highly embedded systems are likely to have fixed domains
>> created as the firmware/hypervisor starts up.
>>
>> >
>> > Stefan
>>
>>
>> --
>> Alex Bennée

--
Alex Bennée

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: Inter-VM device emulation (call on Mon 20th July 2020)
  2020-07-27 10:14 ` Stefan Hajnoczi
  2020-07-27 10:30 ` Alex Bennée
@ 2020-07-27 11:52 ` Jean-Philippe Brucker
  1 sibling, 0 replies; 15+ messages in thread
From: Jean-Philippe Brucker @ 2020-07-27 11:52 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Alex Bennée, Stefan Hajnoczi, Nikos Dragazis, Jan Kiszka,
	John G. Johnson, Andra-Irina Paraschiv, kvm, Michael S. Tsirkin,
	qemu-devel, Maxime Coquelin, Alexander Graf, Thanos Makatos,
	Jag Raman, Philippe Mathieu-Daudé, Eric Auger

On Mon, Jul 27, 2020 at 11:14:03AM +0100, Stefan Hajnoczi wrote:
> On Tue, Jul 21, 2020 at 11:49:04AM +0100, Alex Bennée wrote:
> > Stefan Hajnoczi <stefanha@gmail.com> writes:
> > > 2. Alexander Graf's idea for a new Linux driver that provides an
> > > enforcing software IOMMU. This would be a character device driver that
> > > is mmapped by the device emulation process (either vhost-user-style on
> > > the host or another VMM for inter-VM device emulation). The Driver VMM
> > > can program mappings into the device and the page tables in the device
> > > emulation process will be updated. This way the Driver VMM can share
> > > specific regions of guest RAM with the device emulation process
> > > and revoke those mappings later.
> >
> > I'm wondering if there is enough plumbing on the guest side so a guest
> > can use the virtio-iommu to mark out exactly which bits of memory the
> > virtual device can have access to? At a minimum the virtqueues need to
> > be accessible and for larger transfers maybe a bounce buffer. However

Just to make sure I didn't misunderstand - do you want to tell the guest
precisely where the buffers are, like "address X is the used ring,
address Y is the descriptor table", or do you want to specify a range of
memory where the guest can allocate DMA buffers, in no specific order,
for a given device? So far I've assumed we're talking about the latter.
> > for speed you want as wide a mapping as possible, but no more. It would
> > be nice for example if a block device could load data directly into the
> > guest's block cache (zero-copy) but without getting a view of the
> > kernel's internal data structures.
>
> Maybe Jean-Philippe or Eric can answer that?

Virtio-iommu could describe which bits of guest-physical memory are
available for DMA for a given device. It already provides a mechanism
for describing per-device memory properties (the PROBE request), which
is extensible. And I think the virtio-iommu device could be used
exclusively for this, too, by having DMA bypass the VA->PA translation
(VIRTIO_IOMMU_F_BYPASS) and only enforcing guest-physical boundaries.
Or just describe the memory and not enforce anything.

I don't know how to plug this into the DMA layer of a Linux guest,
though, but there seems to exist a per-device DMA pool infrastructure.
Have you looked at rproc_add_virtio_dev()? It seems to allocate a
specific DMA region per device, from a "memory-region" device-tree
property, so perhaps you could simply reuse this.

Thanks,
Jean

>
> > Another thing that came across in the call was quite a lot of
> > assumptions about QEMU and Linux w.r.t. virtio. While our project will
> > likely have Linux as a guest OS we are looking specifically at enabling
> > virtio for Type-1 hypervisors like Xen and the various safety-certified
> > proprietary ones. It is unlikely that QEMU would be used as the VMM for
> > these deployments. We want to work out what sort of common facilities
> > hypervisors need to support to enable virtio so the daemons can be
> > re-usable and maybe set up with a minimal shim for the particular
> > hypervisor in question.
>
> The vhost-user protocol together with the backend program conventions
> defines the wire protocol and command-line interface (see
> docs/interop/vhost-user.rst).
>
> vhost-user is already used by other VMMs today. For example,
> cloud-hypervisor implements vhost-user.
>
> I'm sure there is room for improvement, but it seems like an incremental
> step given that vhost-user already tries to cater for this scenario.
>
> Are there any specific gaps you have identified?
>
> Stefan

^ permalink raw reply	[flat|nested] 15+ messages in thread
end of thread, other threads:[~2020-07-27 12:22 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <86d42090-f042-06a1-efba-d46d449df280@arrikto.com>
2020-07-15 11:23 ` Inter-VM device emulation (call on Mon 20th July 2020) Stefan Hajnoczi
2020-07-15 11:28   ` Jan Kiszka
2020-07-15 15:38     ` Stefan Hajnoczi
2020-07-15 16:44       ` Alex Bennée
2020-07-17  8:58         ` Nikos Dragazis
2020-07-17 17:10           ` Stefan Hajnoczi
2020-07-15 16:20   ` Thanos Makatos
2020-07-20 17:11 ` Stefan Hajnoczi
2020-07-21 10:49   ` Alex Bennée
2020-07-21 19:08     ` Jan Kiszka
2020-07-27 10:14     ` Stefan Hajnoczi
2020-07-27 10:30       ` Alex Bennée
2020-07-27 11:37         ` Michael S. Tsirkin
2020-07-27 12:22           ` Alex Bennée
2020-07-27 11:52       ` Jean-Philippe Brucker