* Re: [RFC] hypercall-vsock: add a new vsock transport
       [not found] <71d7b0463629471e9d4887d7fcef1d8d@intel.com>
@ 2021-11-10  9:34 ` Stefan Hajnoczi
  2021-11-11  8:02   ` Wang, Wei W
  2021-11-10 10:50 ` Michael S. Tsirkin
  2021-11-10 11:17 ` Stefano Garzarella
  2 siblings, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2021-11-10  9:34 UTC (permalink / raw)
  To: Wang, Wei W
  Cc: sgarzare, davem, kuba, mst, Paolo Bonzini, kys, linux-kernel,
	virtualization, Yamahata, Isaku, Nakajima, Jun, Kleen, Andi,
	Andra Paraschiv


On Wed, Nov 10, 2021 at 07:12:36AM +0000, Wang, Wei W wrote:
> We plan to add a new vsock transport based on hypercall (e.g. vmcall on Intel CPUs).
> It transports AF_VSOCK packets between the guest and host, which is similar to
> virtio-vsock, vmci-vsock and hyperv-vsock.
> 
> Compared to the above listed vsock transports which are designed for high performance,
> the main advantages of hypercall-vsock are:
> 
> 1)       It is VMM agnostic. For example, one guest working on hypercall-vsock can run on
> either KVM, Hyper-V, or VMware.
> 
> 2)       It is simpler. It doesn't rely on any complex bus enumeration
> (e.g. a virtio-pci based vsock device may need the whole implementation of PCI).
> 
> An example usage is the communication between MigTD and the host (Page 8 at
> https://static.sched.com/hosted_files/kvmforum2021/ef/TDX%20Live%20Migration_Wei%20Wang.pdf).
> MigTD communicates with the host to assist the migration of the target (user) TD.
> MigTD is part of the TCB, so its implementation is expected to be as simple as possible
> (e.g. a bare-metal implementation without an OS, no PCI driver support).

AF_VSOCK is designed to allow multiple transports, so why not. There is
a cost to developing and maintaining a vsock transport though.

I think Amazon Nitro enclaves use virtio-vsock and I've CCed Andra in
case she has thoughts on the pros/cons and how to minimize the trusted
computing base.

If simplicity is the top priority then VIRTIO's MMIO transport without
indirect descriptors and using the packed virtqueue layout reduces the
size of the implementation:
https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1440002
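
To give a sense of the scale involved, a bare-metal guest can probe such a
device with a handful of register reads. A rough sketch: the register offsets
and magic values are from section 4.2.2 of the spec above, while MMIO_BASE is
an assumed address that the VMM would have to advertise (e.g. via ACPI or the
command line).

#include <stdint.h>

#define MMIO_BASE               0xd0000000UL    /* assumption: provided by the VMM */
#define VIRTIO_MMIO_MAGIC       0x000           /* must read 0x74726976 ("virt") */
#define VIRTIO_MMIO_VERSION     0x004           /* 2 for the modern interface */
#define VIRTIO_MMIO_DEVICE_ID   0x008           /* 19 == vsock */

static inline uint32_t mmio_read32(unsigned long off)
{
        return *(volatile uint32_t *)(MMIO_BASE + off);
}

static int vsock_mmio_probe(void)
{
        if (mmio_read32(VIRTIO_MMIO_MAGIC) != 0x74726976)
                return -1;
        if (mmio_read32(VIRTIO_MMIO_VERSION) != 2)
                return -1;
        if (mmio_read32(VIRTIO_MMIO_DEVICE_ID) != 19)
                return -1;
        return 0;       /* continue with status/feature negotiation */
}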

Stefan



* Re: [RFC] hypercall-vsock: add a new vsock transport
       [not found] <71d7b0463629471e9d4887d7fcef1d8d@intel.com>
  2021-11-10  9:34 ` [RFC] hypercall-vsock: add a new vsock transport Stefan Hajnoczi
@ 2021-11-10 10:50 ` Michael S. Tsirkin
  2021-11-11  7:58   ` Wang, Wei W
  2021-11-10 11:17 ` Stefano Garzarella
  2 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2021-11-10 10:50 UTC (permalink / raw)
  To: Wang, Wei W
  Cc: sgarzare, davem, kuba, Stefan Hajnoczi, Paolo Bonzini, kys,
	linux-kernel, virtualization, Yamahata, Isaku, Nakajima, Jun,
	Kleen, Andi

On Wed, Nov 10, 2021 at 07:12:36AM +0000, Wang, Wei W wrote:
> Hi,
> 
> We plan to add a new vsock transport based on hypercall (e.g. vmcall on Intel CPUs).
> It transports AF_VSOCK packets between the guest and host, which is similar to
> virtio-vsock, vmci-vsock and hyperv-vsock.
> 
> Compared to the above listed vsock transports, which are designed for high performance,
> the main advantages of hypercall-vsock are:
> 
> 1)       It is VMM agnostic. For example, one guest working on hypercall-vsock
> can run on either KVM, Hyper-V, or VMware.

Hypercalls are fundamentally hypervisor dependent though.
Assuming you can carve out a hypervisor-independent hypercall,
using it for something as mundane and specific as vsock for TDX
seems like huge overkill. For example, virtio could benefit from
the faster vmexits that hypercalls give you for signalling.
How about a combination of virtio-mmio and hypercalls for fast-path
signalling then?
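
On the guest side that would mean replacing the QueueNotify MMIO write with a
hypercall. A rough sketch follows; the hypercall number and register layout
are made-up placeholders rather than an existing ABI, and a TD would issue
this through TDG.VP.VMCALL rather than a raw vmcall.

#define HC_VIRTIO_NOTIFY        0x564e4f54UL    /* hypothetical hypercall number */

static long virtio_hcall_notify(unsigned long dev_id, unsigned long queue_idx)
{
        long ret;

        /* ask the VMM to process the given queue of the given device */
        asm volatile("vmcall"
                     : "=a"(ret)
                     : "a"(HC_VIRTIO_NOTIFY), "b"(dev_id), "c"(queue_idx)
                     : "memory");
        return ret;
}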

> 2)       It is simpler. It doesn't rely on any complex bus enumeration
> (e.g. a virtio-pci based vsock device may need the whole implementation of PCI).
> 

The next thing people will try to do is implement a bunch of other devices on
top of it.  virtio used PCI simply because everyone implements PCI.  And
the reason for *that* is that implementing a basic PCI bus is dead
simple; the whole of pci.c in QEMU is <3000 LOC.

> 
> An example usage is the communication between MigTD and the host (Page 8 at
> https://static.sched.com/hosted_files/kvmforum2021/ef/TDX%20Live%20Migration_Wei%20Wang.pdf).
> MigTD communicates with the host to assist the migration of the target (user) TD.
> MigTD is part of the TCB, so its implementation is expected to be as simple as possible
> (e.g. a bare-metal implementation without an OS, no PCI driver support).

Try to list the drawbacks? For example, passthrough for nested virt
isn't possible, unlike PCI, and neither are hardware implementations.


> Looking forward to your feedback.
> 
> Thanks,
> Wei
> 



* Re: [RFC] hypercall-vsock: add a new vsock transport
       [not found] <71d7b0463629471e9d4887d7fcef1d8d@intel.com>
  2021-11-10  9:34 ` [RFC] hypercall-vsock: add a new vsock transport Stefan Hajnoczi
  2021-11-10 10:50 ` Michael S. Tsirkin
@ 2021-11-10 11:17 ` Stefano Garzarella
  2021-11-10 21:45   ` Paraschiv, Andra-Irina
  2021-11-11  8:14   ` Wang, Wei W
  2 siblings, 2 replies; 12+ messages in thread
From: Stefano Garzarella @ 2021-11-10 11:17 UTC (permalink / raw)
  To: Wang, Wei W
  Cc: davem, kuba, Stefan Hajnoczi, mst, Paolo Bonzini, kys,
	linux-kernel, virtualization, Yamahata, Isaku, Nakajima, Jun,
	Kleen, Andi, Andra Paraschiv, Sergio Lopez Pascual

On Wed, Nov 10, 2021 at 07:12:36AM +0000, Wang, Wei W wrote:
>Hi,
>
>We plan to add a new vsock transport based on hypercall (e.g. vmcall on Intel CPUs).
>It transports AF_VSOCK packets between the guest and host, which is similar to
>virtio-vsock, vmci-vsock and hyperv-vsock.
>
>Compared to the above listed vsock transports which are designed for high performance,
>the main advantages of hypercall-vsock are:
>
>1)       It is VMM agnostic. For example, one guest working on hypercall-vsock can run on
>either KVM, Hyper-V, or VMware.
>
>2)       It is simpler. It doesn't rely on any complex bus enumeration
>(e.g. a virtio-pci based vsock device may need the whole implementation of PCI).
>
>An example usage is the communication between MigTD and host (Page 8 at
>https://static.sched.com/hosted_files/kvmforum2021/ef/TDX%20Live%20Migration_Wei%20Wang.pdf).
>MigTD communicates with the host to assist the migration of the target (user) TD.
>MigTD is part of the TCB, so its implementation is expected to be as simple as possible
>(e.g. a bare-metal implementation without an OS, no PCI driver support).

Adding Andra and Sergio, because IIRC Firecracker and libkrun emulate 
virtio-vsock with virtio-mmio, so the implementation should be simple and 
also not directly tied to a specific VMM.

Maybe this fits your use case too; that way we don't have to 
maintain another driver.

Thanks,
Stefano



* Re: [RFC] hypercall-vsock: add a new vsock transport
  2021-11-10 11:17 ` Stefano Garzarella
@ 2021-11-10 21:45   ` Paraschiv, Andra-Irina
  2021-11-11  8:14   ` Wang, Wei W
  1 sibling, 0 replies; 12+ messages in thread
From: Paraschiv, Andra-Irina @ 2021-11-10 21:45 UTC (permalink / raw)
  To: Stefano Garzarella, Wang, Wei W, Stefan Hajnoczi
  Cc: davem, kuba, mst, Paolo Bonzini, kys, linux-kernel,
	virtualization, Yamahata, Isaku, Nakajima, Jun, Kleen, Andi,
	Sergio Lopez Pascual



On 10/11/2021 13:17, Stefano Garzarella wrote:
> 
> On Wed, Nov 10, 2021 at 07:12:36AM +0000, Wang, Wei W wrote:
>> Hi,
>>
>> We plan to add a new vsock transport based on hypercall (e.g. vmcall 
>> on Intel CPUs).
>> It transports AF_VSOCK packets between the guest and host, which is 
>> similar to
>> virtio-vsock, vmci-vsock and hyperv-vsock.
>>
>> Compared to the above listed vsock transports which are designed for 
>> high performance,
>> the main advantages of hypercall-vsock are:
>>
>> 1)       It is VMM agnostic. For example, one guest working on 
>> hypercall-vsock can run on
>>
>> either KVM, Hyperv, or VMware.
>>
>> 2)       It is simpler. It doesn't rely on any complex bus enumeration
>>
>> (e.g. virtio-pci based vsock device may need the whole implementation 
>> of PCI).
>>
>> An example usage is the communication between MigTD and host (Page 8 at
>> https://static.sched.com/hosted_files/kvmforum2021/ef/TDX%20Live%20Migration_Wei%20Wang.pdf). 
>>
>> MigTD communicates to host to assist the migration of the target (user)
>> TD.
>> MigTD is part of the TCB, so its implementation is expected to be as 
>> simple as possible
>> (e.g. a bare-metal implementation without an OS, no PCI driver support).

Thanks for the CC. Replying to both messages here.

From Stefan:

"
AF_VSOCK is designed to allow multiple transports, so why not. There is
a cost to developing and maintaining a vsock transport though.

I think Amazon Nitro enclaves use virtio-vsock and I've CCed Andra in
case she has thoughts on the pros/cons and how to minimize the trusted
computing base.

If simplicity is the top priority then VIRTIO's MMIO transport without
indirect descriptors and using the packed virtqueue layout reduces the
size of the implementation:
https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1440002

Stefan
"


On the Nitro Enclaves project side, virtio-mmio is used for the vsock 
device setup for the enclave. That has worked fine; it has helped to 
have an already available implementation (e.g. virtio-mmio / virtio-pci) 
for adoption and ease of use in different types of setups (e.g. distros, 
kernel versions).

From Stefano:

> 
> Adding Andra and Sergio, because IIRC Firecracker and libkrun emulates
> virtio-vsock with virtio-mmio so the implementation should be simple and
> also not directly tied to a specific VMM.
> 
> Maybe this fit for your use case too, in this way we don't have to
> maintain another driver.
> 
> Thanks,
> Stefano
> 

Indeed, on the Firecracker side, the vsock device is set up using 
virtio-mmio [1][2][3]. One specific thing is that on the host, instead 
of using vhost, AF_UNIX sockets are used [4].

Thanks,
Andra

[1] 
https://github.com/firecracker-microvm/firecracker/blob/main/src/devices/src/virtio/vsock/mod.rs#L30
[2] 
https://github.com/firecracker-microvm/firecracker/blob/main/src/vmm/src/builder.rs#L936
[3] 
https://github.com/firecracker-microvm/firecracker/blob/main/src/vmm/src/builder.rs#L859
[4] 
https://github.com/firecracker-microvm/firecracker/blob/main/docs/vsock.md





* RE: [RFC] hypercall-vsock: add a new vsock transport
  2021-11-10 10:50 ` Michael S. Tsirkin
@ 2021-11-11  7:58   ` Wang, Wei W
  2021-11-11 15:19     ` Michael S. Tsirkin
  2021-11-25  6:37     ` Jason Wang
  0 siblings, 2 replies; 12+ messages in thread
From: Wang, Wei W @ 2021-11-11  7:58 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: sgarzare, davem, kuba, Stefan Hajnoczi, Paolo Bonzini, kys,
	linux-kernel, virtualization, Yamahata, Isaku, Nakajima, Jun,
	Kleen, Andi, srutherford, erdemaktas

On Wednesday, November 10, 2021 6:50 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 10, 2021 at 07:12:36AM +0000, Wang, Wei W wrote:
>
> hypercalls are fundamentally hypervisor dependent though.

Yes, each hypervisor needs to support it.
We could simplify the design and implementation to the minimum, so that each hypervisor can easily support it.
Once every hypervisor has the support, the guest (MigTD) could be a single unified build
(e.g. no need for each hypervisor user to develop their own MigTD with their own vsock transport).

> Assuming you can carve up a hypervisor independent hypercall, using it for
> something as mundane and specific as vsock for TDX seems like a huge overkill.
> For example, virtio could benefit from faster vmexits that hypercalls give you
> for signalling.
> How about a combination of virtio-mmio and hypercalls for fast-path signalling
> then?

We thought about virtio-mmio. There are some barriers:
1) It wasn't originally intended for x86 machines. The only machine type in QEMU
that supports it (to run on x86) is microvm. But "microvm" doesn't support TDX currently,
and adding that support might need a larger effort.
2) It's simpler than virtio-pci, but still more complex than a hypercall.
3) Some CSPs don't have virtio support in their software, so this might add too much development effort for them.

This usage doesn't need high performance, so a faster hypercall for signalling isn't required, I think.
(But if hypercalls are verified to be much faster than the current EPT-misconfig based notification,
they could be added for general virtio usage.)

> 
> > 2)       It is simpler. It doesn’t rely on any complex bus enumeration
> >
> > (e.g. virtio-pci based vsock device may need the whole implementation of
> PCI).
> >
> 
> Next thing people will try to do is implement a bunch of other device on top of
> it.  virtio used pci simply because everyone implements pci.  And the reason
> for *that* is because implementing a basic pci bus is dead simple, whole of
> pci.c in qemu is <3000 LOC.

That doesn't include the PCI enumeration in SeaBIOS or the PCI driver in the guest, though.

Virtio has high performance; I think that's an important reason more devices are continually being added.
For this transport, I can't envision a bunch of devices being added on top of it. It's a simple PV method.


> 
> >
> > An example usage is the communication between MigTD and host (Page 8
> > at
> >
> > https://static.sched.com/hosted_files/kvmforum2021/ef/
> > TDX%20Live%20Migration_Wei%20Wang.pdf).
> >
> > MigTD communicates to host to assist the migration of the target (user) TD.
> >
> > MigTD is part of the TCB, so its implementation is expected to be as
> > simple as possible
> >
> > (e.g. a bare-metal implementation without an OS, no PCI driver support).
> >
> >
> 
> Try to list drawbacks? For example, passthrough for nested virt isn't possible
> unlike pci, neither are hardware implementations.
> 

Why wouldn't a hypercall be possible for nested virt?
An L2 hypercall goes to L0 directly, and L0 can decide whether to forward the call to L1 (in our case, I think there is no need, as the packet will go out), right?

Its drawbacks are obvious (e.g. low performance).
In general, I think it could be considered a complement to virtio.
I think most usages would choose virtio, as they don't worry about the complexity and they pursue high performance.
For some special usages where virtio is considered too complex and something simpler is wanted, this transport would be worth considering.

Thanks,
Wei



* RE: [RFC] hypercall-vsock: add a new vsock transport
  2021-11-10  9:34 ` [RFC] hypercall-vsock: add a new vsock transport Stefan Hajnoczi
@ 2021-11-11  8:02   ` Wang, Wei W
  0 siblings, 0 replies; 12+ messages in thread
From: Wang, Wei W @ 2021-11-11  8:02 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: sgarzare, davem, kuba, mst, Paolo Bonzini, kys, linux-kernel,
	virtualization, Yamahata, Isaku, Nakajima, Jun, Kleen, Andi,
	Andra Paraschiv, Sergio Lopez Pascual

> From: Stefan Hajnoczi <stefanha@redhat.com>
On Wednesday, November 10, 2021 5:35 PM, Stefan Hajnoczi wrote:
> AF_VSOCK is designed to allow multiple transports, so why not. There is a cost
> to developing and maintaining a vsock transport though.

Yes. The effort could be reduced by simplifying the design as much as possible:
e.g. no ring operations - the guest just sends a packet each time for the host to read
(this transport isn't targeting high performance).
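
Roughly, the guest would place one header-plus-payload buffer in memory shared
with the host and pass its address in a single hypercall. A sketch of that
idea follows; the hypercall number, register convention and return value are
assumptions rather than a defined ABI, and the header simply mirrors the
virtio-vsock packet header so the existing vsock core could be reused.

#include <stdint.h>

/* little-endian, same fields as the virtio-vsock packet header */
struct hc_vsock_hdr {
        uint64_t src_cid, dst_cid;
        uint32_t src_port, dst_port;
        uint32_t len;           /* payload length following this header */
        uint16_t type, op;
        uint32_t flags, buf_alloc, fwd_cnt;
} __attribute__((packed));

#define HC_VSOCK_SEND   0x5653434bUL    /* hypothetical hypercall number */

/* buf_gpa points at a hc_vsock_hdr + payload; for a TD this buffer has to
 * live in shared (unprotected) pages so the host can read it. */
static long hc_vsock_send(uint64_t buf_gpa, uint64_t total_len)
{
        long ret;

        asm volatile("vmcall"
                     : "=a"(ret)
                     : "a"(HC_VSOCK_SEND), "b"(buf_gpa), "c"(total_len)
                     : "memory");
        return ret;     /* assumed: bytes accepted by the host, or a negative error */
}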

> 
> I think Amazon Nitro enclaves use virtio-vsock and I've CCed Andra in case she
> has thoughts on the pros/cons and how to minimize the trusted computing
> base.

Thanks for adding more related people to the discussion.

> 
> If simplicity is the top priority then VIRTIO's MMIO transport without indirect
> descriptors and using the packed virtqueue layout reduces the size of the
> implementation:
> https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1
> 440002

I listed some considerations about virtio-mmio in my response to Michael.
Please have a look and let me know if you have different thoughts.

Thanks,
Wei


* RE: [RFC] hypercall-vsock: add a new vsock transport
  2021-11-10 11:17 ` Stefano Garzarella
  2021-11-10 21:45   ` Paraschiv, Andra-Irina
@ 2021-11-11  8:14   ` Wang, Wei W
  2021-11-11  8:24     ` Paolo Bonzini
  1 sibling, 1 reply; 12+ messages in thread
From: Wang, Wei W @ 2021-11-11  8:14 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: davem, kuba, Stefan Hajnoczi, mst, Paolo Bonzini, kys,
	linux-kernel, virtualization, Yamahata, Isaku, Nakajima, Jun,
	Kleen, Andi, Andra Paraschiv, Sergio Lopez Pascual

On Wednesday, November 10, 2021 7:17 PM, Stefano Garzarella wrote:


> Adding Andra and Sergio, because IIRC Firecracker and libkrun emulates
> virtio-vsock with virtio-mmio so the implementation should be simple and also
> not directly tied to a specific VMM.
> 

OK. This would work for KVM-based guests.
Hyper-V and VMware based guests, however, don't have virtio-mmio support.
If the MigTD (a special guest) we provide is based on virtio-mmio, it would not be usable for them.

Thanks,
Wei



* Re: [RFC] hypercall-vsock: add a new vsock transport
  2021-11-11  8:14   ` Wang, Wei W
@ 2021-11-11  8:24     ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2021-11-11  8:24 UTC (permalink / raw)
  To: Wang, Wei W, Stefano Garzarella
  Cc: davem, kuba, Stefan Hajnoczi, mst, kys, linux-kernel,
	virtualization, Yamahata, Isaku, Nakajima, Jun, Kleen, Andi,
	Andra Paraschiv, Sergio Lopez Pascual

On 11/11/21 09:14, Wang, Wei W wrote:
>> Adding Andra and Sergio, because IIRC Firecracker and libkrun 
>> emulates virtio-vsock with virtio-mmio so the implementation
>> should be simple and also not directly tied to a specific VMM.
>> 
> OK. This would be OK for KVM based guests. For Hyperv and VMWare 
> based guests, they don't have virtio-mmio support. If the MigTD (a 
> special guest) we provide is based on virtio-mmio, it would not be 
> usable to them.

Hyper-V and VMware (and KVM) would have to add support for
hypercall-vsock anyway.  Why can't they just implement a subset of
virtio-mmio?  It's not hard and there's even plenty of permissively-
licensed code in the various VMMs for the *BSDs.

In fact, instead of defining your own transport for vsock, my first idea
would have been the opposite: reuse virtio-mmio for the registers and
the virtqueue format, and define your own virtio device for the MigTD!
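
Something along these lines, purely as an illustration - the device ID and
config layout here are invented, and a real ID would have to be reserved in
the virtio spec:

#include <stdint.h>

#define VIRTIO_ID_MIGTD         0xfff0  /* hypothetical, unreserved device ID */

/* hypothetical device-specific config space */
struct virtio_migtd_config {
        uint32_t policy_len;    /* length of the migration policy blob */
        uint32_t flags;
};

/* two virtqueues would be enough: guest-to-host and host-to-guest */
enum { MIGTD_VQ_TX = 0, MIGTD_VQ_RX = 1 };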

Thanks,

Paolo



* Re: [RFC] hypercall-vsock: add a new vsock transport
  2021-11-11  7:58   ` Wang, Wei W
@ 2021-11-11 15:19     ` Michael S. Tsirkin
  2021-11-25  6:37     ` Jason Wang
  1 sibling, 0 replies; 12+ messages in thread
From: Michael S. Tsirkin @ 2021-11-11 15:19 UTC (permalink / raw)
  To: Wang, Wei W
  Cc: sgarzare, davem, kuba, Stefan Hajnoczi, Paolo Bonzini, kys,
	linux-kernel, virtualization, Yamahata, Isaku, Nakajima, Jun,
	Kleen, Andi, srutherford, erdemaktas

On Thu, Nov 11, 2021 at 07:58:29AM +0000, Wang, Wei W wrote:
> On Wednesday, November 10, 2021 6:50 PM, Michael S. Tsirkin wrote:
> > On Wed, Nov 10, 2021 at 07:12:36AM +0000, Wang, Wei W wrote:
> >
> > hypercalls are fundamentally hypervisor dependent though.
> 
> Yes, each hypervisor needs to support it.
> We could simplify the design and implementation to the minimal, so that each hypervisor can easily support it.
> Once every hypervisor has the support, the guest (MigTD) could be a unified version.
> (e.g. no need for each hypervisor user to develop their own MigTD using their own vsock transport)
> 
> > Assuming you can carve up a hypervisor independent hypercall, using it for
> > something as mundane and specific as vsock for TDX seems like a huge overkill.
> > For example, virtio could benefit from faster vmexits that hypercalls give you
> > for signalling.
> > How about a combination of virtio-mmio and hypercalls for fast-path signalling
> > then?
> 
> We thought about virtio-mmio. There are some barriers:
> 1) It wasn't originally intended for x86 machines. The only machine type in QEMU
> that supports it (to run on x86) is microvm. But "microvm" doesn’t support TDX currently,
> and adding this support might need larger effort.
> 2) It's simpler than virtio-pci, but still more complex than hypercall.
> 3) Some CSPs don't have virtio support in their software, so this might add too much development effort for them.
> 
> This usage doesn’t need high performance, so faster hypercall for signalling isn't required, I think.
> (but if hypercall has been verified to be much faster than the current EPT misconfig based notification,
> it could be added for the general virtio usages)
> 
> > 
> > > 2)       It is simpler. It doesn’t rely on any complex bus enumeration
> > >
> > > (e.g. virtio-pci based vsock device may need the whole implementation of
> > PCI).
> > >
> > 
> > Next thing people will try to do is implement a bunch of other device on top of
> > it.  virtio used pci simply because everyone implements pci.  And the reason
> > for *that* is because implementing a basic pci bus is dead simple, whole of
> > pci.c in qemu is <3000 LOC.
> 
> This doesn’t include the PCI enumeration in seaBIOS and the PCI driver in the guest though.

Do we really need to worry about migrating guests that have not completed
PCI enumeration yet?

Anyway, kvm-unit-tests has a ~500 LOC PCI driver.  It does not support PCI bridges
or interrupts though - if you want to go that route, then requiring that the device in
question be on bus 0 and using polling seems like a reasonable limitation?
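
For reference, the bus-0-only path really is tiny: config space can be reached
through the legacy 0xCF8/0xCFC mechanism alone (a sketch, assuming the guest
has port I/O available):

#include <stdint.h>

static inline void outl(uint16_t port, uint32_t val)
{
        asm volatile("outl %0, %1" : : "a"(val), "Nd"(port));
}

static inline uint32_t inl(uint16_t port)
{
        uint32_t val;

        asm volatile("inl %1, %0" : "=a"(val) : "Nd"(port));
        return val;
}

static uint32_t pci_cfg_read32(uint8_t dev, uint8_t fn, uint8_t off)
{
        /* enable bit | bus 0 | device | function | dword-aligned offset */
        uint32_t addr = 0x80000000u | (dev << 11) | (fn << 8) | (off & 0xfc);

        outl(0xcf8, addr);
        return inl(0xcfc);
}

/* e.g. scan bus 0 by reading vendor/device IDs: pci_cfg_read32(slot, 0, 0) */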

> Virtio has high performance, I think that's an important reason that more devices are continually added.
> For this transport, I couldn’t envision that a bunch of devices would be added. It's a simple PV method.

Famous last words. My point is that adding a vendor-agnostic hypercall needs
a bunch of negotiation and agreement between vendors. If you are going
to all that trouble, it seems like a waste to make it single-use.

> 
> > 
> > >
> > > An example usage is the communication between MigTD and host (Page 8
> > > at
> > >
> > > https://static.sched.com/hosted_files/kvmforum2021/ef/
> > > TDX%20Live%20Migration_Wei%20Wang.pdf).
> > >
> > > MigTD communicates to host to assist the migration of the target (user) TD.
> > >
> > > MigTD is part of the TCB, so its implementation is expected to be as
> > > simple as possible
> > >
> > > (e.g. a bare-metal implementation without an OS, no PCI driver support).
> > >
> > >
> > 
> > Try to list drawbacks? For example, passthrough for nested virt isn't possible
> > unlike pci, neither are hardware implementations.
> > 
> 
> Why wouldn't a hypercall be possible for nested virt?
> An L2 hypercall goes to L0 directly, and L0 can decide whether to forward the call to L1 (in our case, I think there is no need, as the packet will go out), right?
> 
> Its drawbacks are obvious (e.g. low performance). 

Exactly.

> In general, I think it could be considered as a complement to virtio.
> I think most usages would choose virtio, as they don't worry about the complexity and they pursue high performance.
> For some special usages where virtio is considered too complex and something simpler is wanted, this transport would be worth considering.
> 
> Thanks,
> Wei

So implement a small subset of virtio then; no one forces you to use all
its features. virtio-mmio is about 30 registers, most of which can be stubbed to
constants, and the packed ring is much simpler than the split one.
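
For what it's worth, the whole per-element state of a packed ring is a single
16-byte descriptor (field layout per virtio 1.1, section 2.7); there are no
separate available and used rings to manage:

#include <stdint.h>

#define VRING_DESC_F_WRITE              (1 << 1)        /* device writes this buffer */
#define VRING_PACKED_DESC_F_AVAIL       (1 << 7)        /* driver wrap-counter bit */
#define VRING_PACKED_DESC_F_USED        (1 << 15)       /* device wrap-counter bit */

struct virtq_packed_desc {      /* all fields little-endian */
        uint64_t addr;          /* guest-physical address of the buffer */
        uint32_t len;           /* buffer length */
        uint16_t id;            /* buffer id, echoed back by the device */
        uint16_t flags;         /* AVAIL/USED wrap bits plus WRITE etc. */
};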

-- 
MST



* Re: [RFC] hypercall-vsock: add a new vsock transport
  2021-11-11  7:58   ` Wang, Wei W
  2021-11-11 15:19     ` Michael S. Tsirkin
@ 2021-11-25  6:37     ` Jason Wang
  2021-11-25  8:43       ` Wang, Wei W
  1 sibling, 1 reply; 12+ messages in thread
From: Jason Wang @ 2021-11-25  6:37 UTC (permalink / raw)
  To: Wang, Wei W
  Cc: Michael S. Tsirkin, sgarzare, davem, kuba, Stefan Hajnoczi,
	Paolo Bonzini, kys, linux-kernel, virtualization, Yamahata,
	Isaku, Nakajima, Jun, Kleen, Andi, srutherford, erdemaktas

On Thu, Nov 11, 2021 at 3:59 PM Wang, Wei W <wei.w.wang@intel.com> wrote:
>
> On Wednesday, November 10, 2021 6:50 PM, Michael S. Tsirkin wrote:
> > On Wed, Nov 10, 2021 at 07:12:36AM +0000, Wang, Wei W wrote:
> >
> > hypercalls are fundamentally hypervisor dependent though.
>
> Yes, each hypervisor needs to support it.
> We could simplify the design and implementation to the minimal, so that each hypervisor can easily support it.
> Once every hypervisor has the support, the guest (MigTD) could be a unified version.
> (e.g. no need for each hypervisor user to develop their own MigTD using their own vsock transport)
>
> > Assuming you can carve up a hypervisor independent hypercall, using it for
> > something as mundane and specific as vsock for TDX seems like a huge overkill.
> > For example, virtio could benefit from faster vmexits that hypercalls give you
> > for signalling.
> > How about a combination of virtio-mmio and hypercalls for fast-path signalling
> > then?
>
> We thought about virtio-mmio. There are some barriers:
> 1) It wasn't originally intended for x86 machines. The only machine type in QEMU
> that supports it (to run on x86) is microvm. But "microvm" doesn’t support TDX currently,
> and adding this support might need larger effort.

Can you explain why microvm needs a larger effort? It looks to me like it
fits TDX perfectly, since it has a smaller attack surface.

Thanks

> 2) It's simpler than virtio-pci, but still more complex than hypercall.
> 3) Some CSPs don't have virtio support in their software, so this might add too much development effort for them.
>
> This usage doesn’t need high performance, so faster hypercall for signalling isn't required, I think.
> (but if hypercall has been verified to be much faster than the current EPT misconfig based notification,
> it could be added for the general virtio usages)
>
> >
> > > 2)       It is simpler. It doesn’t rely on any complex bus enumeration
> > >
> > > (e.g. virtio-pci based vsock device may need the whole implementation of
> > PCI).
> > >
> >
> > Next thing people will try to do is implement a bunch of other device on top of
> > it.  virtio used pci simply because everyone implements pci.  And the reason
> > for *that* is because implementing a basic pci bus is dead simple, whole of
> > pci.c in qemu is <3000 LOC.
>
> This doesn’t include the PCI enumeration in seaBIOS and the PCI driver in the guest though.
>
> Virtio has high performance, I think that's an important reason that more devices are continually added.
> For this transport, I couldn’t envision that a bunch of devices would be added. It's a simple PV method.
>
>
> >
> > >
> > > An example usage is the communication between MigTD and host (Page 8
> > > at
> > >
> > > https://static.sched.com/hosted_files/kvmforum2021/ef/
> > > TDX%20Live%20Migration_Wei%20Wang.pdf).
> > >
> > > MigTD communicates to host to assist the migration of the target (user) TD.
> > >
> > > MigTD is part of the TCB, so its implementation is expected to be as
> > > simple as possible
> > >
> > > (e.g. a bare-metal implementation without an OS, no PCI driver support).
> > >
> > >
> >
> > Try to list drawbacks? For example, passthrough for nested virt isn't possible
> > unlike pci, neither are hardware implementations.
> >
>
> Why wouldn't a hypercall be possible for nested virt?
> An L2 hypercall goes to L0 directly, and L0 can decide whether to forward the call to L1 (in our case, I think there is no need, as the packet will go out), right?
>
> Its drawbacks are obvious (e.g. low performance).
> In general, I think it could be considered as a complement to virtio.
> I think most usages would choose virtio, as they don't worry about the complexity and they pursue high performance.
> For some special usages where virtio is considered too complex and something simpler is wanted, this transport would be worth considering.
>
> Thanks,
> Wei
>



* RE: [RFC] hypercall-vsock: add a new vsock transport
  2021-11-25  6:37     ` Jason Wang
@ 2021-11-25  8:43       ` Wang, Wei W
  2021-11-25 12:04         ` Gerd Hoffmann
  0 siblings, 1 reply; 12+ messages in thread
From: Wang, Wei W @ 2021-11-25  8:43 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, sgarzare, davem, kuba, Stefan Hajnoczi,
	Paolo Bonzini, kys, linux-kernel, virtualization, Yamahata,
	Isaku, Nakajima, Jun, Kleen, Andi, srutherford, erdemaktas

On Thursday, November 25, 2021 2:38 PM, Jason Wang wrote:
> > We thought about virtio-mmio. There are some barriers:
> > 1) It wasn't originally intended for x86 machines. The only machine
> > type in QEMU that supports it (to run on x86) is microvm. But
> > "microvm" doesn’t support TDX currently, and adding this support might
> need larger effort.
> 
> Can you explain why microvm needs larger effort? It looks to me it fits for TDX
> perfectly since it has less attack surface.

The main thing is that TDVF doesn't support microvm so far (the underlying OVMF
support for microvm is still under community discussion).

Do you guys think it is possible to add virtio-mmio support to q35
(e.g. create a special platform bus in some fashion for memory-mapped devices)?
I'm not sure if that effort would be larger.

Thanks,
Wei





* Re: [RFC] hypercall-vsock: add a new vsock transport
  2021-11-25  8:43       ` Wang, Wei W
@ 2021-11-25 12:04         ` Gerd Hoffmann
  0 siblings, 0 replies; 12+ messages in thread
From: Gerd Hoffmann @ 2021-11-25 12:04 UTC (permalink / raw)
  To: Wang, Wei W
  Cc: Jason Wang, Yamahata, Isaku, Michael S. Tsirkin, srutherford,
	linux-kernel, virtualization, erdemaktas, Stefan Hajnoczi,
	Paolo Bonzini, Kleen, Andi, kuba, davem

On Thu, Nov 25, 2021 at 08:43:55AM +0000, Wang, Wei W wrote:
> On Thursday, November 25, 2021 2:38 PM, Jason Wang wrote:
> > > We thought about virtio-mmio. There are some barriers:
> > > 1) It wasn't originally intended for x86 machines. The only machine
> > > type in QEMU that supports it (to run on x86) is microvm. But
> > > "microvm" doesn’t support TDX currently, and adding this support might
> > need larger effort.
> > 
> > Can you explain why microvm needs larger effort? It looks to me it fits for TDX
> > perfectly since it has less attack surface.
> 
> The main thing is that TDVF doesn't support microvm so far (the underlying OVMF
> support for microvm is still under community discussion).

Initial microvm support (direct kernel boot only) is merged in upstream
OVMF.  Better device support is underway: virtio-mmio patches are out
for review, and patches for PCIe support exist.

TDX patches for OVMF are under review upstream; I haven't noticed
anything which would be a blocker for microvm.  If it doesn't work
out of the box, it should mostly be a matter of wiring up things needed
on the guest (OVMF) and/or host (QEMU) side.

(The same goes for SEV, btw.)

> Do you guys think it is possible to add virtio-mmio support for q35?
> (e.g. create a special platform bus in some fashion for memory mapped devices)
> Not sure if the effort would be larger.

I'd rather explore the microvm path than make q35 even more of a
Frankenstein than it already is.

Also, the PCIe host bridge is present in q35 no matter what, so one of
the reasons to use virtio-mmio ("we can reduce the attack surface by
turning off PCIe") goes away.

take care,
  Gerd


