From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:56860)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1eN08T-0000Up-Mb for qemu-devel@nongnu.org;
	Thu, 07 Dec 2017 12:39:03 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1eN08P-0007Ps-OS for qemu-devel@nongnu.org;
	Thu, 07 Dec 2017 12:39:01 -0500
Received: from mx1.redhat.com ([209.132.183.28]:43020) by eggs.gnu.org
	with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71)
	(envelope-from ) id 1eN08P-0007Pk-Es for qemu-devel@nongnu.org;
	Thu, 07 Dec 2017 12:38:57 -0500
Date: Thu, 7 Dec 2017 19:38:49 +0200
From: "Michael S. Tsirkin"
Message-ID: <20171207193003-mutt-send-email-mst@kernel.org>
References: <286AC319A985734F985F78AFA26841F73937B57F@shsmsx102.ccr.corp.intel.com>
	<5A28BC2D.6000308@intel.com> <5A290398.60508@intel.com>
	<20171207153454-mutt-send-email-mst@kernel.org>
	<20171207183945-mutt-send-email-mst@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Subject: Re: [Qemu-devel] [virtio-dev] [PATCH v3 0/7] Vhost-pci for inter-VM communication
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Stefan Hajnoczi
Cc: Wei Wang , "virtio-dev@lists.oasis-open.org" , "Yang, Zhiyong" ,
	"jan.kiszka@siemens.com" , "jasowang@redhat.com" ,
	"avi.cohen@huawei.com" , "qemu-devel@nongnu.org" ,
	Stefan Hajnoczi , "pbonzini@redhat.com" ,
	"marcandre.lureau@redhat.com"

On Thu, Dec 07, 2017 at 05:29:14PM +0000, Stefan Hajnoczi wrote:
> On Thu, Dec 7, 2017 at 4:47 PM, Michael S. Tsirkin wrote:
> > On Thu, Dec 07, 2017 at 04:29:45PM +0000, Stefan Hajnoczi wrote:
> >> On Thu, Dec 7, 2017 at 2:02 PM, Michael S. Tsirkin wrote:
> >> > On Thu, Dec 07, 2017 at 01:08:04PM +0000, Stefan Hajnoczi wrote:
> >> >> Instead of responding individually to these points, I hope this will
> >> >> explain my perspective. Let me know if you do want individual
> >> >> responses, I'm happy to talk more about the points above but I think
> >> >> the biggest difference is our perspective on this:
> >> >>
> >> >> Existing vhost-user slave code should be able to run on top of
> >> >> vhost-pci. For example, QEMU's
> >> >> contrib/vhost-user-scsi/vhost-user-scsi.c should work inside the guest
> >> >> with only minimal changes to the source file (i.e. today it explicitly
> >> >> opens a UNIX domain socket and that should be done by libvhost-user
> >> >> instead). It shouldn't be hard to add vhost-pci vfio support to
> >> >> contrib/libvhost-user/ alongside the existing UNIX domain socket code.
> >> >>
> >> >> This seems pretty easy to achieve with the vhost-pci PCI adapter that
> >> >> I've described but I'm not sure how to implement libvhost-user on top
> >> >> of vhost-pci vfio if the device doesn't expose the vhost-user
> >> >> protocol.
> >> >>
> >> >> I think this is a really important goal. Let's use a single
> >> >> vhost-user software stack instead of creating a separate one for guest
> >> >> code only.
> >> >>
> >> >> Do you agree that the vhost-user software stack should be shared
> >> >> between host userspace and guest code as much as possible?
> >> >
> >> > The sharing you propose is not necessarily practical because the
> >> > security goals of the two are different.
> >> >
> >> > It seems that the best motivation presentation is still the original rfc:
> >> >
> >> > http://virtualization.linux-foundation.narkive.com/A7FkzAgp/rfc-vhost-user-enhancements-for-vm2vm-communication
> >> >
> >> > So, comparing with vhost-user, the iotlb handling is different:
> >> >
> >> > With vhost-user the guest trusts the vhost-user backend on the host.
> >> >
> >> > With vhost-pci we can strive to limit the trust to qemu only.
> >> > The switch running within a VM does not have to be trusted.
> >>
> >> Can you give a concrete example?
> >>
> >> I have an idea about what you're saying but it may be wrong:
> >>
> >> Today the iotlb mechanism in vhost-user does not actually enforce
> >> memory permissions. The vhost-user slave has full access to mmapped
> >> memory regions even when iotlb is enabled. Currently the iotlb just
> >> adds an indirection layer but no real security. (Is this correct?)
> >
> > Not exactly. The iotlb protects against malicious drivers within the
> > guest. But yes, not against a vhost-user driver on the host.
> >
> >> Are you saying the vhost-pci device code in QEMU should enforce iotlb
> >> permissions so the vhost-user slave guest only has access to memory
> >> regions that are allowed by the iotlb?
> >
> > Yes.
>
> Okay, thanks for confirming.
>
> This can be supported by the approach I've described. The vhost-pci
> QEMU code has control over the BAR memory so it can prevent the guest
> from accessing regions that are not allowed by the iotlb.
>
> Inside the guest the vhost-user slave still has the memory region
> descriptions and sends iotlb messages. This is completely compatible
> with the libvhost-user APIs and existing vhost-user slave code can run
> fine. The only unique thing is that guest accesses to memory regions
> not allowed by the iotlb do not work because QEMU has prevented it.

I don't think this can work, since suddenly you need to map the full
IOMMU address space into the BAR. Besides, this means implementing the
iotlb in both qemu and the guest.

> If better performance is needed then it might be possible to optimize
> this interface by handling most or even all of the iotlb stuff in QEMU
> vhost-pci code and not exposing it to the vhost-user slave in the
> guest. But it doesn't change the fact that the vhost-user protocol
> can be used and the same software stack works.

For one, the iotlb part would be out of scope then. Instead you would
have code to offset from BAR.
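
[Editor's note: the "code to offset from BAR" point above amounts to a linear
translation from driver-VM guest-physical addresses to offsets in the BAR the
vhost-pci device exposes, with no iotlb lookup in the slave. A minimal sketch
in C; the struct and function names are invented for illustration and are not
actual QEMU or vhost-pci code.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one driver-VM memory region, as a
 * vhost-pci device might map it into its BAR (names invented). */
struct vpci_region {
    uint64_t gpa_start;   /* driver-VM guest-physical base */
    uint64_t size;        /* length of the region in bytes */
    uint64_t bar_offset;  /* where this region sits inside the BAR */
};

/* Translate a driver-VM guest-physical address into a BAR offset:
 * just a range check plus an offset, no iotlb translation.
 * Returns -1 if the address falls in no exposed region. */
static int64_t vpci_gpa_to_bar(const struct vpci_region *r, size_t n,
                               uint64_t gpa)
{
    for (size_t i = 0; i < n; i++) {
        if (gpa >= r[i].gpa_start && gpa - r[i].gpa_start < r[i].size) {
            return (int64_t)(r[i].bar_offset + (gpa - r[i].gpa_start));
        }
    }
    return -1; /* not mapped: the device never exposed this range */
}
```

[This is the trade-off being discussed: the slave-side code becomes a trivial
offset computation, but the iotlb semantics then live entirely in QEMU.]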
> Do you have a concrete example of why sharing the same vhost-user
> software stack inside the guest can't work?

With enough dedication some code might be shared. OTOH reusing virtio
gains you a ready feature negotiation and discovery protocol.

I'm not convinced which has more value, and the second proposal has
been implemented already.

> >> and QEMU generally doesn't
> >> implement things this way.
> >
> > Not sure what this means.
>
> It's the reason why virtio-9p has a separate virtfs-proxy-helper
> program. Root is needed to set file uid/gids. Instead of running
> QEMU as root, there is a separate helper process that handles the
> privileged operations. It slows things down and makes the codebase
> larger but it prevents the guest from getting root in case of QEMU
> bugs.
>
> The reason why VMs are considered more secure than containers is
> because of the extra level of isolation provided by running device
> emulation in an unprivileged userspace process. If you change this
> model then QEMU loses the "security in depth" advantage.
>
> Stefan

I don't see where vhost-pci needs QEMU to run as root though.
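
[Editor's note: the iotlb enforcement debated earlier in the thread — QEMU's
vhost-pci code only exposing BAR memory that the driver VM's iotlb allows —
could look roughly like this on the QEMU side. A hedged sketch only: the
entry layout, flag names, and function name are invented, loosely modelled on
the read/write permissions carried by vhost-user iotlb update messages.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IOTLB_READ  1u  /* bit 0: read access granted */
#define IOTLB_WRITE 2u  /* bit 1: write access granted */

/* Hypothetical iotlb entry as the vhost-pci device model might cache
 * it after a driver-VM iotlb update (names invented for this sketch). */
struct iotlb_entry {
    uint64_t iova;  /* I/O virtual address of the mapping */
    uint64_t size;  /* length of the mapping in bytes */
    uint8_t  perm;  /* IOTLB_READ and/or IOTLB_WRITE */
};

/* QEMU-side check: only let a slave-VM access through the BAR proceed
 * if an iotlb entry grants the requested permissions, so the slave VM
 * never touches memory the driver VM's iotlb did not allow. */
static bool vpci_access_allowed(const struct iotlb_entry *tlb, size_t n,
                                uint64_t iova, uint8_t want)
{
    for (size_t i = 0; i < n; i++) {
        if (iova >= tlb[i].iova && iova - tlb[i].iova < tlb[i].size) {
            return (tlb[i].perm & want) == want;
        }
    }
    return false; /* no translation: fault rather than expose memory */
}
```

[The point of contention above is where this check lives: doing it in QEMU
keeps the slave VM untrusted, at the cost of implementing iotlb logic in both
QEMU and the guest stack.]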