* [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM
2020-02-20 16:06 UTC, Halil Pasic (58+ messages in thread)
To: Michael S. Tsirkin, Jason Wang, Marek Szyprowski, Robin Murphy, Christoph Hellwig
Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Viktor Mihajlovski, David Gibson

Currently, if one intends to run a memory-protection-enabled VM with virtio devices and Linux as the guest OS, one needs to specify the VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest Linux use the DMA API, which in turn handles the memory encryption/protection if the guest decides to turn itself into a protected one. This, however, makes no sense for multiple reasons:

* The device is not changed by the fact that the guest RAM is protected. The so-called IOMMU bypass quirk is not affected.
* This usage is not congruent with the standardised semantics of VIRTIO_F_IOMMU_PLATFORM. Guest memory protection is an orthogonal reason for using the DMA API in virtio (orthogonal with respect to what is expressed by VIRTIO_F_IOMMU_PLATFORM).

This series aims to decouple 'have to use DMA API because my (guest) RAM is protected' from 'have to use DMA API because the device told me VIRTIO_F_IOMMU_PLATFORM'.

Please find more detailed explanations of the conceptual aspects in the individual patches. There is, however, also a very practical problem that is addressed by this series. For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side effect: the vhost code assumes that the addresses on the virtio descriptor ring are not guest physical addresses but IOVAs, and insists on translating them regardless of which transport is used (e.g. whether we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b "vhost: new device IOTLB API".) On s390 this results in severe performance degradation (roughly a factor of 10). By the way, with CCW I/O there is (architecturally) no IOMMU, so the whole address translation makes no sense in the context of virtio-ccw.

Halil Pasic (2):
  mm: move force_dma_unencrypted() to mem_encrypt.h
  virtio: let virtio use DMA API when guest RAM is protected

 drivers/virtio/virtio_ring.c |  3 +++
 include/linux/dma-direct.h   |  9 ---------
 include/linux/mem_encrypt.h  | 10 ++++++++++
 3 files changed, 13 insertions(+), 9 deletions(-)

base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
--
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
* [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
2020-02-20 16:06 UTC, Halil Pasic
To: Michael S. Tsirkin, Jason Wang, Marek Szyprowski, Robin Murphy, Christoph Hellwig
Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Viktor Mihajlovski, David Gibson

Currently force_dma_unencrypted() is only used by the direct implementation of the DMA API, and thus resides in dma-direct.h. But there is nothing dma-direct specific about it: if one were -- for whatever reason -- to implement custom DMA ops that have to work in the encrypted/protected scenarios dma-direct currently deals with, one would need exactly this kind of information.

More importantly, virtio has to use the DMA API (so that the memory encryption (x86) or protection (power, s390) is handled) under the very same circumstances in which force_dma_unencrypted() returns true. Furthermore, in these cases the reason to go via the DMA API is distinct from the reason indicated by VIRTIO_F_IOMMU_PLATFORM: we need to use the DMA API independently of the device's properties with regard to access to memory. I.e. the addresses in the descriptors are still guest physical addresses, the device may still be implemented by an SMP CPU, and thus the device shall use those without any further translation. See [1].

Let's move force_dma_unencrypted() so that virtio, or other implementations of DMA ops, can make the right decisions.

[1] https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-4100006
(In the spec VIRTIO_F_IOMMU_PLATFORM is called VIRTIO_F_ACCESS_PLATFORM.)

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Ram Pai <linuxram@us.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
---
 include/linux/dma-direct.h  |  9 ---------
 include/linux/mem_encrypt.h | 10 ++++++++++
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 24b8684aa21d..590b15d881b0 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -26,15 +26,6 @@ static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr)
 }
 #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */
 
-#ifdef CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED
-bool force_dma_unencrypted(struct device *dev);
-#else
-static inline bool force_dma_unencrypted(struct device *dev)
-{
-	return false;
-}
-#endif /* CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED */
-
 /*
  * If memory encryption is supported, phys_to_dma will set the memory encryption
  * bit in the DMA address, and dma_to_phys will clear it.  The raw __phys_to_dma

diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
index 5c4a18a91f89..64a48c4b01a2 100644
--- a/include/linux/mem_encrypt.h
+++ b/include/linux/mem_encrypt.h
@@ -22,6 +22,16 @@ static inline bool mem_encrypt_active(void) { return false; }
 
 #endif /* CONFIG_ARCH_HAS_MEM_ENCRYPT */
 
+struct device;
+#ifdef CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED
+bool force_dma_unencrypted(struct device *dev);
+#else
+static inline bool force_dma_unencrypted(struct device *dev)
+{
+	return false;
+}
+#endif /* CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED */
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 /*
  * The __sme_set() and __sme_clr() macros are useful for adding or removing
--
2.17.1
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
2020-02-20 16:11 UTC, Christoph Hellwig
To: Halil Pasic
Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Thu, Feb 20, 2020 at 05:06:05PM +0100, Halil Pasic wrote:
> Currently force_dma_unencrypted() is only used by the direct
> implementation of the DMA API, and thus resides in dma-direct.h. But
> there is nothing dma-direct specific about it: if one were -- for
> whatever reason -- to implement custom DMA ops that have to work in the
> encrypted/protected scenarios dma-direct currently deals with, one would
> need exactly this kind of information.

I really don't think it has business being anywhere else, and your completely
bogus second patch just proves the point.
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
2020-02-20 16:23 UTC, Christian Borntraeger
To: Christoph Hellwig, Halil Pasic
Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy

On 20.02.20 17:11, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 05:06:05PM +0100, Halil Pasic wrote:
>> Currently force_dma_unencrypted() is only used by the direct
>> implementation of the DMA API, and thus resides in dma-direct.h. But
>> there is nothing dma-direct specific about it: if one were -- for
>> whatever reason -- to implement custom DMA ops that have to work in the
>> encrypted/protected scenarios dma-direct currently deals with, one would
>> need exactly this kind of information.
>
> I really don't think it has business being anywhere else, and your completely
> bogus second patch just proves the point.

From a user's perspective it makes absolutely perfect sense to use the
bounce buffers when they are NEEDED.
Forcing the user to specify iommu_platform just because you need bounce buffers
really feels wrong. And obviously we have a severe performance issue
because of the indirections.

Now: I understand that you want to get this fixed differently, but maybe
you could help to outline how this could be fixed properly.

Christian
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
2020-02-20 16:31 UTC, Christoph Hellwig
To: Christian Borntraeger
Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> From a user's perspective it makes absolutely perfect sense to use the
> bounce buffers when they are NEEDED.
> Forcing the user to specify iommu_platform just because you need bounce buffers
> really feels wrong. And obviously we have a severe performance issue
> because of the indirections.

The point is that the user should not have to specify iommu_platform.
We need to make sure any new hypervisor (especially one that might require
bounce buffering) always sets it, as it was a rather bogus legacy hack
that isn't extensible for cases that for example require bounce buffering.
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
2020-02-20 17:00 UTC, Christian Borntraeger
To: Christoph Hellwig
Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy

On 20.02.20 17:31, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
>> From a user's perspective it makes absolutely perfect sense to use the
>> bounce buffers when they are NEEDED.
>> Forcing the user to specify iommu_platform just because you need bounce buffers
>> really feels wrong. And obviously we have a severe performance issue
>> because of the indirections.
>
> The point is that the user should not have to specify iommu_platform.

I (and Halil) agree on that. From a user perspective we want to have the
right thing in the guest without any command option. The iommu_platform
thing was indeed a first step to make things work.

I might misread Halil's patch, but isn't one effect of this patch set that
things WILL work without iommu_platform?

> We need to make sure any new hypervisor (especially one that might require
> bounce buffering) always sets it, as it was a rather bogus legacy hack
> that isn't extensible for cases that for example require bounce buffering.
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
2020-02-21 3:27 UTC, David Gibson
To: Christoph Hellwig
Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy

On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > From a user's perspective it makes absolutely perfect sense to use the
> > bounce buffers when they are NEEDED.
> > Forcing the user to specify iommu_platform just because you need bounce buffers
> > really feels wrong. And obviously we have a severe performance issue
> > because of the indirections.
>
> The point is that the user should not have to specify iommu_platform.
> We need to make sure any new hypervisor (especially one that might require
> bounce buffering) always sets it,

So, I have draft qemu patches which enable iommu_platform by default.
But that's really because of other problems with !iommu_platform, not
anything to do with bounce buffering or secure VMs.

The thing is that the hypervisor *doesn't* require bounce buffering.
In the POWER (and maybe s390 as well) models for Secure VMs, it's the
*guest*'s choice to enter secure mode, so the hypervisor has no reason
to know whether the guest needs bounce buffering. As far as the
hypervisor and qemu are concerned that's a guest internal detail; it
just expects to get addresses it can access, whether those are GPAs
(iommu_platform=off) or IOVAs (iommu_platform=on).

> as it was a rather bogus legacy hack

It was certainly a bad idea, but it was a bad idea that went into a
public spec and has been widely deployed for many years. We can't
just pretend it didn't happen and move on.

Turning iommu_platform=on by default breaks old guests, some of which
we still care about. We can't (automatically) do it only for guests
that need bounce buffering, because the hypervisor doesn't know that
ahead of time.

> that isn't extensible for cases that for example require bounce buffering.

In fact bounce buffering isn't really the issue from the hypervisor's
(or spec's) point of view. It's the fact that not all of guest memory
is accessible to the hypervisor. Bounce buffering is just one way the
guest might deal with that.

--
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
2020-02-21 13:06 UTC, Halil Pasic
To: David Gibson
Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Fri, 21 Feb 2020 14:27:27 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > > From a user's perspective it makes absolutely perfect sense to use the
> > > bounce buffers when they are NEEDED.
> > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > really feels wrong. And obviously we have a severe performance issue
> > > because of the indirections.
> >
> > The point is that the user should not have to specify iommu_platform.
> > We need to make sure any new hypervisor (especially one that might require
> > bounce buffering) always sets it,
>
> So, I have draft qemu patches which enable iommu_platform by default.
> But that's really because of other problems with !iommu_platform, not
> anything to do with bounce buffering or secure VMs.
>
> The thing is that the hypervisor *doesn't* require bounce buffering.
> In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> *guest*'s choice to enter secure mode, so the hypervisor has no reason
> to know whether the guest needs bounce buffering. As far as the
> hypervisor and qemu are concerned that's a guest internal detail, it
> just expects to get addresses it can access whether those are GPAs
> (iommu_platform=off) or IOVAs (iommu_platform=on).

I very much agree!

> > as was a rather bogus legacy hack
>
> It was certainly a bad idea, but it was a bad idea that went into a
> public spec and has been widely deployed for many years. We can't
> just pretend it didn't happen and move on.
>
> Turning iommu_platform=on by default breaks old guests, some of which
> we still care about. We can't (automatically) do it only for guests
> that need bounce buffering, because the hypervisor doesn't know that
> ahead of time.

Turning iommu_platform=on for virtio-ccw makes no sense whatsoever,
because for CCW I/O there is no such thing as an IOMMU and the addresses
are always physical addresses.

> > that isn't extensible for cases that for example require bounce buffering.
>
> In fact bounce buffering isn't really the issue from the hypervisor's
> (or spec's) point of view. It's the fact that not all of guest memory
> is accessible to the hypervisor. Bounce buffering is just one way the
> guest might deal with that.

Agreed.

Regards,
Halil
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
2020-02-21 15:48 UTC, Michael S. Tsirkin
To: Halil Pasic
Cc: linux-s390, Viktor Mihajlovski, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, Michael Mueller, David Gibson, Robin Murphy, Christoph Hellwig

On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote:
> On Fri, 21 Feb 2020 14:27:27 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > > > From a user's perspective it makes absolutely perfect sense to use the
> > > > bounce buffers when they are NEEDED.
> > > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > > really feels wrong. And obviously we have a severe performance issue
> > > > because of the indirections.
> > >
> > > The point is that the user should not have to specify iommu_platform.
> > > We need to make sure any new hypervisor (especially one that might require
> > > bounce buffering) always sets it,
> >
> > So, I have draft qemu patches which enable iommu_platform by default.
> > But that's really because of other problems with !iommu_platform, not
> > anything to do with bounce buffering or secure VMs.
> >
> > The thing is that the hypervisor *doesn't* require bounce buffering.
> > In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> > *guest*'s choice to enter secure mode, so the hypervisor has no reason
> > to know whether the guest needs bounce buffering. As far as the
> > hypervisor and qemu are concerned that's a guest internal detail, it
> > just expects to get addresses it can access whether those are GPAs
> > (iommu_platform=off) or IOVAs (iommu_platform=on).
>
> I very much agree!
>
> > > as was a rather bogus legacy hack
> >
> > It was certainly a bad idea, but it was a bad idea that went into a
> > public spec and has been widely deployed for many years. We can't
> > just pretend it didn't happen and move on.
> >
> > Turning iommu_platform=on by default breaks old guests, some of which
> > we still care about. We can't (automatically) do it only for guests
> > that need bounce buffering, because the hypervisor doesn't know that
> > ahead of time.
>
> Turning iommu_platform=on for virtio-ccw makes no sense whatsoever,
> because for CCW I/O there is no such thing as an IOMMU and the addresses
> are always physical addresses.

Fix the name then. The spec calls it ACCESS_PLATFORM now, which
makes much more sense.

> > > that isn't extensible for cases that for example require bounce buffering.
> >
> > In fact bounce buffering isn't really the issue from the hypervisor's
> > (or spec's) point of view. It's the fact that not all of guest memory
> > is accessible to the hypervisor. Bounce buffering is just one way the
> > guest might deal with that.
>
> Agreed.
>
> Regards,
> Halil
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
2020-02-21 18:07 UTC, Halil Pasic
To: Michael S. Tsirkin
Cc: linux-s390, Viktor Mihajlovski, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, Michael Mueller, David Gibson, Robin Murphy, Christoph Hellwig

On Fri, 21 Feb 2020 10:48:15 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote:
> > On Fri, 21 Feb 2020 14:27:27 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > > > > From a user's perspective it makes absolutely perfect sense to use the
> > > > > bounce buffers when they are NEEDED.
> > > > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > > > really feels wrong. And obviously we have a severe performance issue
> > > > > because of the indirections.
> > > >
> > > > The point is that the user should not have to specify iommu_platform.
> > > > We need to make sure any new hypervisor (especially one that might require
> > > > bounce buffering) always sets it,
> > >
> > > So, I have draft qemu patches which enable iommu_platform by default.
> > > But that's really because of other problems with !iommu_platform, not
> > > anything to do with bounce buffering or secure VMs.
> > >
> > > The thing is that the hypervisor *doesn't* require bounce buffering.
> > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> > > *guest*'s choice to enter secure mode, so the hypervisor has no reason
> > > to know whether the guest needs bounce buffering. As far as the
> > > hypervisor and qemu are concerned that's a guest internal detail, it
> > > just expects to get addresses it can access whether those are GPAs
> > > (iommu_platform=off) or IOVAs (iommu_platform=on).
> >
> > I very much agree!
> >
> > > > as was a rather bogus legacy hack
> > >
> > > It was certainly a bad idea, but it was a bad idea that went into a
> > > public spec and has been widely deployed for many years. We can't
> > > just pretend it didn't happen and move on.
> > >
> > > Turning iommu_platform=on by default breaks old guests, some of which
> > > we still care about. We can't (automatically) do it only for guests
> > > that need bounce buffering, because the hypervisor doesn't know that
> > > ahead of time.
> >
> > Turning iommu_platform=on for virtio-ccw makes no sense whatsoever,
> > because for CCW I/O there is no such thing as an IOMMU and the addresses
> > are always physical addresses.
>
> Fix the name then. The spec calls it ACCESS_PLATFORM now, which
> makes much more sense.

I don't quite get it. Sorry. Maybe I will revisit this later.

Regards,
Halil

> > > > that isn't extensible for cases that for example require bounce buffering.
> > >
> > > In fact bounce buffering isn't really the issue from the hypervisor's
> > > (or spec's) point of view. It's the fact that not all of guest memory
> > > is accessible to the hypervisor. Bounce buffering is just one way the
> > > guest might deal with that.
> >
> > Agreed.
> >
> > Regards,
> > Halil
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
2020-02-24 3:33 UTC, David Gibson
To: Halil Pasic
Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Fri, Feb 21, 2020 at 07:07:02PM +0100, Halil Pasic wrote:
> On Fri, 21 Feb 2020 10:48:15 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
> > On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote:
> > > On Fri, 21 Feb 2020 14:27:27 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> > > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > > > > > From a user's perspective it makes absolutely perfect sense to use the
> > > > > > bounce buffers when they are NEEDED.
> > > > > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > > > > really feels wrong. And obviously we have a severe performance issue
> > > > > > because of the indirections.
> > > > >
> > > > > The point is that the user should not have to specify iommu_platform.
> > > > > We need to make sure any new hypervisor (especially one that might require
> > > > > bounce buffering) always sets it,
> > > >
> > > > So, I have draft qemu patches which enable iommu_platform by default.
> > > > But that's really because of other problems with !iommu_platform, not
> > > > anything to do with bounce buffering or secure VMs.
> > > >
> > > > The thing is that the hypervisor *doesn't* require bounce buffering.
> > > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> > > > *guest*'s choice to enter secure mode, so the hypervisor has no reason
> > > > to know whether the guest needs bounce buffering. As far as the
> > > > hypervisor and qemu are concerned that's a guest internal detail, it
> > > > just expects to get addresses it can access whether those are GPAs
> > > > (iommu_platform=off) or IOVAs (iommu_platform=on).
> > >
> > > I very much agree!
> > >
> > > > > as was a rather bogus legacy hack
> > > >
> > > > It was certainly a bad idea, but it was a bad idea that went into a
> > > > public spec and has been widely deployed for many years. We can't
> > > > just pretend it didn't happen and move on.
> > > >
> > > > Turning iommu_platform=on by default breaks old guests, some of which
> > > > we still care about. We can't (automatically) do it only for guests
> > > > that need bounce buffering, because the hypervisor doesn't know that
> > > > ahead of time.
> > >
> > > Turning iommu_platform=on for virtio-ccw makes no sense whatsoever,
> > > because for CCW I/O there is no such thing as an IOMMU and the addresses
> > > are always physical addresses.
> >
> > Fix the name then. The spec calls it ACCESS_PLATFORM now, which
> > makes much more sense.
>
> I don't quite get it. Sorry. Maybe I will revisit this later.

Halil, I think I can clarify this.

The "iommu_platform" flag doesn't necessarily have anything to do with
an IOMMU, although it often will. Basically it means "access guest
memory via the bus's normal DMA mechanism" rather than "access guest
memory using GPA, because you're the hypervisor and you can do that".

For the case of ccw, both mechanisms end up being the same thing,
since ccw's normal DMA *is* untranslated GPA access. For this reason,
the flag in the spec was renamed to ACCESS_PLATFORM, but the flag in
qemu still has the old name. AIUI, Michael is saying you could
trivially change the name in qemu (obviously you'd need to alias the
old name to the new one for compatibility).

Actually, the fact that ccw has no translation makes things easier for
you: you don't really have any impediment to turning ACCESS_PLATFORM
on by default, since it doesn't make any real change to how things
work. The remaining difficulty is that the virtio driver - since it
can sit on multiple buses - won't know this, and will reject the
ACCESS_PLATFORM flag, even though it could just do what it normally
does on ccw and it would work.

For that case, we could consider a hack in qemu where for virtio-ccw
devices *only* we allow the guest to nack the ACCESS_PLATFORM flag and
carry on anyway. Normally we insist that the guest accept the
ACCESS_PLATFORM flag if offered, because on most platforms they
*don't* amount to the same thing.
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h 2020-02-24 3:33 ` David Gibson @ 2020-02-24 18:49 ` Halil Pasic 2020-02-25 18:08 ` Cornelia Huck 0 siblings, 1 reply; 58+ messages in thread From: Halil Pasic @ 2020-02-24 18:49 UTC (permalink / raw) To: David Gibson Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig [-- Attachment #1.1: Type: text/plain, Size: 7234 bytes --] On Mon, 24 Feb 2020 14:33:14 +1100 David Gibson <david@gibson.dropbear.id.au> wrote: > On Fri, Feb 21, 2020 at 07:07:02PM +0100, Halil Pasic wrote: > > On Fri, 21 Feb 2020 10:48:15 -0500 > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote: > > > > On Fri, 21 Feb 2020 14:27:27 +1100 > > > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > > > > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote: > > > > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote: > > > > > > > >From a users perspective it makes absolutely perfect sense to use the > > > > > > > bounce buffers when they are NEEDED. > > > > > > > Forcing the user to specify iommu_platform just because you need bounce buffers > > > > > > > really feels wrong. And obviously we have a severe performance issue > > > > > > > because of the indirections. > > > > > > > > > > > > The point is that the user should not have to specify iommu_platform. > > > > > > We need to make sure any new hypervisor (especially one that might require > > > > > > bounce buffering) always sets it, > > > > > > > > > > So, I have draft qemu patches which enable iommu_platform by default. > > > > > But that's really because of other problems with !iommu_platform, not > > > > > anything to do with bounce buffering or secure VMs. 
> > > > > > > > > > The thing is that the hypervisor *doesn't* require bounce buffering. > > > > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the > > > > > *guest*'s choice to enter secure mode, so the hypervisor has no reason > > > > > to know whether the guest needs bounce buffering. As far as the > > > > > hypervisor and qemu are concerned that's a guest internal detail, it > > > > > just expects to get addresses it can access whether those are GPAs > > > > > (iommu_platform=off) or IOVAs (iommu_platform=on). > > > > > > > > I very much agree! > > > > > > > > > > > > > > > as was a rather bogus legacy hack > > > > > > > > > > It was certainly a bad idea, but it was a bad idea that went into a > > > > > public spec and has been widely deployed for many years. We can't > > > > > just pretend it didn't happen and move on. > > > > > > > > > > Turning iommu_platform=on by default breaks old guests, some of which > > > > > we still care about. We can't (automatically) do it only for guests > > > > > that need bounce buffering, because the hypervisor doesn't know that > > > > > ahead of time. > > > > > > > > Turning iommu_platform=on for virtio-ccw makes no sense whatsover, > > > > because for CCW I/O there is no such thing as IOMMU and the addresses > > > > are always physical addresses. > > > > > > Fix the name then. The spec calls is ACCESS_PLATFORM now, which > > > makes much more sense. > > > > I don't quite get it. Sorry. Maybe I will revisit this later. > > Halil, I think I can clarify this. > > The "iommu_platform" flag doesn't necessarily have anything to do with > an iommu, although it often will. Basically it means "access guest > memory via the bus's normal DMA mechanism" rather than "access guest > memory using GPA, because you're the hypervisor and you can do that". > Unfortunately, I don't think this is what is conveyed to the end users. 
Let's see what we have documented:

Neither the QEMU user documentation (https://www.qemu.org/docs/master/qemu-doc.html) nor the online help:

qemu-system-s390x -device virtio-net-ccw,?|grep iommu
  iommu_platform=<bool> - on/off (default: false)

has any documentation on it.

But libvirt does have documentation on the knob that controls iommu_platform for QEMU (when QEMU is managed by libvirt):
"""
Virtio-related options

QEMU's virtio devices have some attributes related to the virtio transport under the driver element: The iommu attribute enables the use of emulated IOMMU by the device. The attribute ats controls the Address Translation Service support for PCIe devices. This is needed to make use of IOTLB support (see IOMMU device). Possible values are on or off. Since 3.5.0
"""
(https://libvirt.org/formatdomain.html#elementsVirtio)

Thus it seems the only available documentation says that it "enables the use of emulated IOMMU by the device".

And for vhost-user we have
"""
When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has not been negotiated:

* Guest addresses map to the vhost memory region containing that guest address.

When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has been negotiated:

* Guest addresses are also called I/O virtual addresses (IOVAs). They are translated to user addresses via the IOTLB.
"""
(docs/interop/vhost-user.rst)

> For the case of ccw, both mechanisms end up being the same thing,
> since CCW's normal DMA *is* untranslated GPA access.

Nod.

> For this reason, the flag in the spec was renamed to ACCESS_PLATFORM,
> but the flag in qemu still has the old name.

Yes, the name in the spec is more neutral.

> AIUI, Michael is saying you could trivially change the name in qemu
> (obviously you'd need to alias the old name to the new one for
> compatibility).

I could, and then I could also ask the libvirt guys to change <driver iommu='X'> to <driver access_platform='X'> or similar and to change their documentation to something that is harder to comprehend.
Although I'm not sure they would like the idea.

> Actually, the fact that ccw has no translation makes things easier for
> you: you don't really have any impediment to turning ACCESS_PLATFORM
> on by default, since it doesn't make any real change to how things
> work.

Yeah, it should not in theory, but currently it does in practice. Currently vhost will try to do the IOTLB dance (Jason has a patch that should help with that), and we get the 'use dma api' side effects in the guest (e.g. the virtqueue's data go below 2G, plus some overhead).

> The remaining difficulty is that the virtio driver - since it can sit
> on multiple buses - won't know this, and will reject the
> ACCESS_PLATFORM flag, even though it could just do what it normally
> does on ccw and it would work.

Right, ACCESS_PLATFORM is a funny feature where the device refuses to work if the driver does not ack it.

> For that case, we could consider a hack in qemu where for virtio-ccw
> devices *only* we allow the guest to nack the ACCESS_PLATFORM flag and
> carry on anyway. Normally we insist that the guest accept the
> ACCESS_PLATFORM flag if offered, because on most platforms they
> *don't* amount to the same thing.

Jason found a nice way to differentiate between 'needs translation' and 'does not need translation'. But that patch still requires the ack by the driver (and, as Michael has pointed out, we have to consider migration).

I'm afraid that F_IOMMU_PLATFORM means different things in different contexts, and that this isn't sufficiently documented. I'm tempted to do a proper write-up on this (let's hope my motivation and my time will allow it). I would also very much like to have Conny's opinion on this.
Regards, Halil [-- Attachment #1.2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] [-- Attachment #2: Type: text/plain, Size: 156 bytes --] _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h 2020-02-24 18:49 ` Halil Pasic @ 2020-02-25 18:08 ` Cornelia Huck 2020-02-28 0:23 ` David Gibson 0 siblings, 1 reply; 58+ messages in thread From: Cornelia Huck @ 2020-02-25 18:08 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Viktor Mihajlovski, Janosch Frank, Michael S. Tsirkin, Jason Wang, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, David Gibson, Robin Murphy, Christoph Hellwig [-- Attachment #1.1: Type: text/plain, Size: 8297 bytes --] On Mon, 24 Feb 2020 19:49:53 +0100 Halil Pasic <pasic@linux.ibm.com> wrote: > On Mon, 24 Feb 2020 14:33:14 +1100 > David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Fri, Feb 21, 2020 at 07:07:02PM +0100, Halil Pasic wrote: > > > On Fri, 21 Feb 2020 10:48:15 -0500 > > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote: > > > > > On Fri, 21 Feb 2020 14:27:27 +1100 > > > > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > > > > > > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote: > > > > > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote: > > > > > > > > >From a users perspective it makes absolutely perfect sense to use the > > > > > > > > bounce buffers when they are NEEDED. > > > > > > > > Forcing the user to specify iommu_platform just because you need bounce buffers > > > > > > > > really feels wrong. And obviously we have a severe performance issue > > > > > > > > because of the indirections. > > > > > > > > > > > > > > The point is that the user should not have to specify iommu_platform. > > > > > > > We need to make sure any new hypervisor (especially one that might require > > > > > > > bounce buffering) always sets it, > > > > > > > > > > > > So, I have draft qemu patches which enable iommu_platform by default. 
> > > > > > But that's really because of other problems with !iommu_platform, not > > > > > > anything to do with bounce buffering or secure VMs. > > > > > > > > > > > > The thing is that the hypervisor *doesn't* require bounce buffering. > > > > > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the > > > > > > *guest*'s choice to enter secure mode, so the hypervisor has no reason > > > > > > to know whether the guest needs bounce buffering. As far as the > > > > > > hypervisor and qemu are concerned that's a guest internal detail, it > > > > > > just expects to get addresses it can access whether those are GPAs > > > > > > (iommu_platform=off) or IOVAs (iommu_platform=on). > > > > > > > > > > I very much agree! > > > > > > > > > > > > > > > > > > as was a rather bogus legacy hack > > > > > > > > > > > > It was certainly a bad idea, but it was a bad idea that went into a > > > > > > public spec and has been widely deployed for many years. We can't > > > > > > just pretend it didn't happen and move on. > > > > > > > > > > > > Turning iommu_platform=on by default breaks old guests, some of which > > > > > > we still care about. We can't (automatically) do it only for guests > > > > > > that need bounce buffering, because the hypervisor doesn't know that > > > > > > ahead of time. We could default to iommu_platform=on on s390 when the host has active support for protected virtualization... but that's just another kind of horrible, so let's just pretend I didn't suggest it. > > > > > > > > > > Turning iommu_platform=on for virtio-ccw makes no sense whatsover, > > > > > because for CCW I/O there is no such thing as IOMMU and the addresses > > > > > are always physical addresses. > > > > > > > > Fix the name then. The spec calls is ACCESS_PLATFORM now, which > > > > makes much more sense. > > > > > > I don't quite get it. Sorry. Maybe I will revisit this later. > > > > Halil, I think I can clarify this. 
> > > > The "iommu_platform" flag doesn't necessarily have anything to do with > > an iommu, although it often will. Basically it means "access guest > > memory via the bus's normal DMA mechanism" rather than "access guest > > memory using GPA, because you're the hypervisor and you can do that". > > > > Unfortunately, I don't think this is what is conveyed to the end users. > Let's see what do we have documented: > > Neither Qemu user documentation > (https://www.qemu.org/docs/master/qemu-doc.html) nor online help: > qemu-system-s390x -device virtio-net-ccw,?|grep iommu > iommu_platform=<bool> - on/off (default: false) > has any documentation on it. Now, that's 'helpful' -- this certainly calls out for a bit of doc... > > But libvirt does have have documenttion on the knob that contros > iommu_platform for QEMU (when QEMU is managed by libvirt): > """ > Virtio-related options > > QEMU's virtio devices have some attributes related to the virtio > transport under the driver element: The iommu attribute enables the use > of emulated IOMMU by the device. The attribute ats controls the Address > Translation Service support for PCIe devices. This is needed to make use > of IOTLB support (see IOMMU device). Possible values are on or off. > Since 3.5.0 > """ > (https://libvirt.org/formatdomain.html#elementsVirtio) > > Thus it seems the only available documentation says that it "enables the use > of emulated IOMMU by the device". > > And for vhost-user we have > """ > When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has not been negotiated: > > * Guest addresses map to the vhost memory region containing that guest > address. > > When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has been negotiated: > > * Guest addresses are also called I/O virtual addresses (IOVAs). They are > translated to user addresses via the IOTLB. 
> """ > (docs/interop/vhost-user.rst) > > > For the case of ccw, both mechanisms end up being the same thing, > > since CCW's normal DMA *is* untranslated GPA access. > > > > Nod. > > > For this reason, the flag in the spec was renamed to ACCESS_PLATFORM, > > but the flag in qemu still has the old name. > > > > Yes, the name in the spec is more neutral. > > > AIUI, Michael is saying you could trivially change the name in qemu > > (obviously you'd need to alias the old name to the new one for > > compatibility). > > > > I could, and the I could also ask the libvirt guys to change <driver > iommu='X'> to <driver access_platform='X'> or similar and to change > their documentation to something that is harder to comprehend. Although > I'm not sure they would like the idea. Hopefully, the documentation can be changed to something that is _not_ harder to comprehend :) (with a bit more text, I suppose.) Renaming to something like access_platform seems like a good idea, even with the required compat dance. > > > > > Actually, the fact that ccw has no translation makes things easier for > > you: you don't really have any impediment to turning ACCESS_PLATFORM > > on by default, since it doesn't make any real change to how things > > work. > > Yeah, it should not, in theory, but currently it does in practice. > Currently vhost will try to do the IOTLB dance (Jason has a patch that > should help with that), and we get the 'use dma api' side effects in the > guest (e.g. virtqueue's data go <2G + some overhead). Nod. > > > > > The remaining difficulty is that the virtio driver - since it can sit > > on multiple buses - won't know this, and will reject the > > ACCESS_PLATFORM flag, even though it could just do what it normally > > does on ccw and it would work. > > Right ACCESS_PLATFORM is a funny feature where the device refuses to > work if the driver does not ack. 
> > > > > For that case, we could consider a hack in qemu where for virtio-ccw > > devices *only* we allow the guest to nack the ACCESS_PLATFORM flag and > > carry on anyway. Normally we insist that the guest accept the > > ACCESS_PLATFORM flag if offered, because on most platforms they > > *don't* amount to the same thing. > > Jason found a nice way to differentiate between needs translation and > does not need translation. But that patch still requires the ack by the > driver (and as Michael has pointed out we have to consider migration). > > I'm afraid that F_IOMMU_PLATFORM means different things in different > contexts, and that this ain't sufficiently documented. I'm tempted to do > a proper write-up on this (let's hope my motivation will and my time > will allow). I would also very much like to have Conny's opinion on this. More documentation is never a bad idea; but I'm afraid I don't have any further insights at the moment. [-- Attachment #1.2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] [-- Attachment #2: Type: text/plain, Size: 156 bytes --] _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h 2020-02-25 18:08 ` Cornelia Huck @ 2020-02-28 0:23 ` David Gibson 0 siblings, 0 replies; 58+ messages in thread From: David Gibson @ 2020-02-28 0:23 UTC (permalink / raw) To: Cornelia Huck Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig [-- Attachment #1.1: Type: text/plain, Size: 3626 bytes --] On Tue, Feb 25, 2020 at 07:08:02PM +0100, Cornelia Huck wrote: > On Mon, 24 Feb 2020 19:49:53 +0100 > Halil Pasic <pasic@linux.ibm.com> wrote: > > > On Mon, 24 Feb 2020 14:33:14 +1100 > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > On Fri, Feb 21, 2020 at 07:07:02PM +0100, Halil Pasic wrote: > > > > On Fri, 21 Feb 2020 10:48:15 -0500 > > > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote: > > > > > > On Fri, 21 Feb 2020 14:27:27 +1100 > > > > > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > > > > > > > > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote: > > > > > > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote: > > > > > > > > > >From a users perspective it makes absolutely perfect sense to use the > > > > > > > > > bounce buffers when they are NEEDED. > > > > > > > > > Forcing the user to specify iommu_platform just because you need bounce buffers > > > > > > > > > really feels wrong. And obviously we have a severe performance issue > > > > > > > > > because of the indirections. > > > > > > > > > > > > > > > > The point is that the user should not have to specify iommu_platform. 
> > > > > > > > We need to make sure any new hypervisor (especially one that might require > > > > > > > > bounce buffering) always sets it, > > > > > > > > > > > > > > So, I have draft qemu patches which enable iommu_platform by default. > > > > > > > But that's really because of other problems with !iommu_platform, not > > > > > > > anything to do with bounce buffering or secure VMs. > > > > > > > > > > > > > > The thing is that the hypervisor *doesn't* require bounce buffering. > > > > > > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the > > > > > > > *guest*'s choice to enter secure mode, so the hypervisor has no reason > > > > > > > to know whether the guest needs bounce buffering. As far as the > > > > > > > hypervisor and qemu are concerned that's a guest internal detail, it > > > > > > > just expects to get addresses it can access whether those are GPAs > > > > > > > (iommu_platform=off) or IOVAs (iommu_platform=on). > > > > > > > > > > > > I very much agree! > > > > > > > > > > > > > > > > > > > > > as was a rather bogus legacy hack > > > > > > > > > > > > > > It was certainly a bad idea, but it was a bad idea that went into a > > > > > > > public spec and has been widely deployed for many years. We can't > > > > > > > just pretend it didn't happen and move on. > > > > > > > > > > > > > > Turning iommu_platform=on by default breaks old guests, some of which > > > > > > > we still care about. We can't (automatically) do it only for guests > > > > > > > that need bounce buffering, because the hypervisor doesn't know that > > > > > > > ahead of time. > > We could default to iommu_platform=on on s390 when the host has active > support for protected virtualization... but that's just another kind of > horrible, so let's just pretend I didn't suggest it. Yeah, that would break migration between hosts with the feature and hosts without - for everything, not just protected guests. 
In general any kind of guest visible configuration change based on host properties is incompatible with the qemu/KVM migration model. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] [-- Attachment #2: Type: text/plain, Size: 156 bytes --] _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Halil Pasic 2020-02-20 16:06 ` [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h Halil Pasic @ 2020-02-20 16:06 ` Halil Pasic 2020-02-20 16:13 ` Christoph Hellwig 2020-02-20 20:55 ` Michael S. Tsirkin 2020-02-20 20:48 ` [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Michael S. Tsirkin ` (3 subsequent siblings) 5 siblings, 2 replies; 58+ messages in thread From: Halil Pasic @ 2020-02-20 16:06 UTC (permalink / raw) To: Michael S. Tsirkin, Jason Wang, Marek Szyprowski, Robin Murphy, Christoph Hellwig Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Viktor Mihajlovski, David Gibson

Currently the advanced guest memory protection technologies (AMD SEV, powerpc secure guest technology and s390 Protected VMs) abuse the VIRTIO_F_IOMMU_PLATFORM flag to make the virtio core use the DMA API, which is in turn necessary to make I/O work with guest memory protection.

But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a different beast: with virtio devices whose implementation runs on an SMP CPU we are still fine with doing all the usual optimizations; it is just that we need to make sure that the memory protection mechanism does not get in the way. VIRTIO_F_ACCESS_PLATFORM mandates more work on the side of the guest (and possibly the host side as well) than we actually need.

An additional benefit of teaching the guest to make the right decision (and use the DMA API) on its own is removing the need to mandate a special VM configuration for guests that may run with protection.
This is especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all the virtio control structures into the first 2G of guest memory: something we don't necessarily want to do per-default. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Tested-by: Ram Pai <linuxram@us.ibm.com> Tested-by: Michael Mueller <mimu@linux.ibm.com> --- drivers/virtio/virtio_ring.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index 867c7ebd3f10..fafc8f924955 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev) if (!virtio_has_iommu_quirk(vdev)) return true; + if (force_dma_unencrypted(&vdev->dev)) + return true; + /* Otherwise, we are left to guess. */ /* * In theory, it's possible to have a buggy QEMU-supposed -- 2.17.1 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-20 16:06 ` [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected Halil Pasic @ 2020-02-20 16:13 ` Christoph Hellwig 2020-02-21 2:59 ` David Gibson 2020-02-21 14:33 ` Halil Pasic 2020-02-20 20:55 ` Michael S. Tsirkin 1 sibling, 2 replies; 58+ messages in thread From: Christoph Hellwig @ 2020-02-20 16:13 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index 867c7ebd3f10..fafc8f924955 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > if (!virtio_has_iommu_quirk(vdev)) > return true; > > + if (force_dma_unencrypted(&vdev->dev)) > + return true; Hell no. This is a detail of the platform DMA direct implementation. Drivers have no business looking at this flag, and virtio finally needs to be fixed to use the DMA API properly for everything but legacy devices. No amount of desparate hacks is going to fix that fundamental problem, and I'm starting to get really sick and tired of all the crap patches published in this area. _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-20 16:13 ` Christoph Hellwig @ 2020-02-21 2:59 ` David Gibson 2020-02-21 3:41 ` Jason Wang ` (2 more replies) 2020-02-21 14:33 ` Halil Pasic 1 sibling, 3 replies; 58+ messages in thread From: David Gibson @ 2020-02-21 2:59 UTC (permalink / raw) To: Christoph Hellwig Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy [-- Attachment #1.1: Type: text/plain, Size: 2208 bytes --] On Thu, Feb 20, 2020 at 05:13:09PM +0100, Christoph Hellwig wrote: > On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > index 867c7ebd3f10..fafc8f924955 100644 > > --- a/drivers/virtio/virtio_ring.c > > +++ b/drivers/virtio/virtio_ring.c > > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > > if (!virtio_has_iommu_quirk(vdev)) > > return true; > > > > + if (force_dma_unencrypted(&vdev->dev)) > > + return true; > > Hell no. This is a detail of the platform DMA direct implementation. > Drivers have no business looking at this flag, and virtio finally needs > to be fixed to use the DMA API properly for everything but legacy devices. So, this patch definitely isn't right as it stands, but I'm struggling to understand what it is you're saying is the right way. By "legacy devices" I assume you mean pre-virtio-1.0 devices, that lack the F_VERSION_1 feature flag. Is that right? Because I don't see how being a legacy device or not relates to use of the DMA API. I *think* what you are suggesting here is that virtio devices that have !F_IOMMU_PLATFORM should have their dma_ops set up so that the DMA API treats IOVA==PA, which will satisfy what the device expects. 
Then the virtio driver can use the DMA API the same way for both F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices.

But if that works for !F_IOMMU_PLATFORM + F_VERSION_1 devices, then AFAICT it will work equally well for legacy devices.

Using the DMA API for *everything* in virtio, legacy or not, seems like a reasonable approach to me. But, AFAICT, that does require the DMA layer to have some kind of explicit call to turn on this behaviour, which the virtio driver would call during initialization. I don't think we can do it 100% within the DMA layer, because only the driver can reasonably know when a device has this weird non-standard DMA behaviour.

-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] [-- Attachment #2: Type: text/plain, Size: 156 bytes --] _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 2:59 ` David Gibson @ 2020-02-21 3:41 ` Jason Wang 2020-02-21 13:31 ` Halil Pasic 2020-02-21 13:27 ` Halil Pasic 2020-02-21 16:36 ` Christoph Hellwig 2 siblings, 1 reply; 58+ messages in thread From: Jason Wang @ 2020-02-21 3:41 UTC (permalink / raw) To: David Gibson, Christoph Hellwig Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy On 2020/2/21 上午10:59, David Gibson wrote: > On Thu, Feb 20, 2020 at 05:13:09PM +0100, Christoph Hellwig wrote: >> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: >>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c >>> index 867c7ebd3f10..fafc8f924955 100644 >>> --- a/drivers/virtio/virtio_ring.c >>> +++ b/drivers/virtio/virtio_ring.c >>> @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev) >>> if (!virtio_has_iommu_quirk(vdev)) >>> return true; >>> >>> + if (force_dma_unencrypted(&vdev->dev)) >>> + return true; >> Hell no. This is a detail of the platform DMA direct implementation. >> Drivers have no business looking at this flag, and virtio finally needs >> to be fixed to use the DMA API properly for everything but legacy devices. > So, this patch definitely isn't right as it stands, but I'm struggling > to understand what it is you're saying is the right way. > > By "legacy devices" I assume you mean pre-virtio-1.0 devices, that > lack the F_VERSION_1 feature flag. Is that right? Because I don't > see how being a legacy device or not relates to use of the DMA API. > > I *think* what you are suggesting here is that virtio devices that > have !F_IOMMU_PLATFORM should have their dma_ops set up so that the > DMA API treats IOVA==PA, which will satisfy what the device expects. Can this work for swiotlb? 
Thanks > Then the virtio driver can use the DMA API the same way for both > F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices. > > But if that works for !F_IOMMU_PLATFORM_DEVICES+F_VERSION_1 devices, > then AFAICT it will work equally well for legacy devices. > > Using the DMA API for *everything* in virtio, legacy or not, seems > like a reasonable approach to me. But, AFAICT, that does require the > DMA layer to have some kind of explicit call to turn on this > behaviour, which the virtio driver would call during initializsation. > I don't think we can do it 100% within the DMA layer, because only the > driver can reasonably know when a device has this weird non-standard > DMA behaviour. > _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 3:41 ` Jason Wang @ 2020-02-21 13:31 ` Halil Pasic 0 siblings, 0 replies; 58+ messages in thread From: Halil Pasic @ 2020-02-21 13:31 UTC (permalink / raw) To: Jason Wang Cc: linux-s390, Viktor Mihajlovski, Janosch Frank, Michael S. Tsirkin, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, David Gibson, Robin Murphy, Christoph Hellwig On Fri, 21 Feb 2020 11:41:57 +0800 Jason Wang <jasowang@redhat.com> wrote: > > I *think* what you are suggesting here is that virtio devices that > > have !F_IOMMU_PLATFORM should have their dma_ops set up so that the > > DMA API treats IOVA==PA, which will satisfy what the device expects. > > > Can this work for swiotlb? It works on s390. I guess it would be the responsibility of whoever provides the DMA ops for the virtio device to ensure that if !F_IOMMU_PLATFORM the addresses are GPAs, as *mandated* by the VIRTIO specification. Regards, Halil
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 2:59 ` David Gibson 2020-02-21 3:41 ` Jason Wang @ 2020-02-21 13:27 ` Halil Pasic 2020-02-21 16:36 ` Christoph Hellwig 2 siblings, 0 replies; 58+ messages in thread From: Halil Pasic @ 2020-02-21 13:27 UTC (permalink / raw) To: David Gibson Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Fri, 21 Feb 2020 13:59:15 +1100 David Gibson <david@gibson.dropbear.id.au> wrote: > On Thu, Feb 20, 2020 at 05:13:09PM +0100, Christoph Hellwig wrote: > > On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > > index 867c7ebd3f10..fafc8f924955 100644 > > > --- a/drivers/virtio/virtio_ring.c > > > +++ b/drivers/virtio/virtio_ring.c > > > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > > > if (!virtio_has_iommu_quirk(vdev)) > > > return true; > > > > > > + if (force_dma_unencrypted(&vdev->dev)) > > > + return true; > > > > Hell no. This is a detail of the platform DMA direct implementation. > > Drivers have no business looking at this flag, and virtio finally needs > > to be fixed to use the DMA API properly for everything but legacy devices. > > So, this patch definitely isn't right as it stands, but I'm struggling > to understand what it is you're saying is the right way. > > By "legacy devices" I assume you mean pre-virtio-1.0 devices, that > lack the F_VERSION_1 feature flag. Is that right? Because I don't > see how being a legacy device or not relates to use of the DMA API. 
> > I *think* what you are suggesting here is that virtio devices that > have !F_IOMMU_PLATFORM should have their dma_ops set up so that the > DMA API treats IOVA==PA, which will satisfy what the device expects. > Then the virtio driver can use the DMA API the same way for both > F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices. I've considered this idea, and as a matter of fact it would be my preferred solution. It would be, if we could use GFP_DMA when allocating coherent memory through the DMA API. For CCW devices we *must* have some of the device accessible memory 31 bit addressable (in physical address space), because the s390 architecture *mandates it*. I had patches (https://lkml.org/lkml/2019/9/23/313) that do that but Christoph was not open to the idea. Always pushing all the stuff allocated by virtio devices into the DMA zone is not a good option for us. We did that for protected guests, because that was what Christoph was willing to accept, but we are not happy with it. I'm willing to go down that route, but I really need the capability to tell the DMA API 'do this allocation from ZONE_DMA'. I don't care how. > > But if that works for !F_IOMMU_PLATFORM_DEVICES+F_VERSION_1 devices, > then AFAICT it will work equally well for legacy devices. I agree. > > Using the DMA API for *everything* in virtio, legacy or not, seems > like a reasonable approach to me. But, AFAICT, that does require the > DMA layer to have some kind of explicit call to turn on this > behaviour, which the virtio driver would call during initialization. > I don't think we can do it 100% within the DMA layer, because only the > driver can reasonably know when a device has this weird non-standard > DMA behaviour. > IMHO one could make the DMA API work with different DMA ops. The virtio transport or the virtio core can probably intervene. But then, we (virtio-ccw) would need the iommu=off DMA ops to do the bounce buffering iff force_dma_unencrypted(). 
Regards, Halil
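The 31-bit constraint Halil cites (CCW control blocks must be addressable below 2 GiB in guest physical address space) is a pure address-range predicate, easy to state as code. The following is a hypothetical sketch of that check; the helper name and its exact form are invented for illustration and only the 2 GiB limit comes from the discussion above:

```c
#include <stdint.h>
#include <stdbool.h>

/* s390 CCW: certain device-accessible objects must be 31-bit addressable,
 * i.e. live entirely below 2 GiB in the guest's physical address space. */
#define ZONE31_LIMIT ((uint64_t)1 << 31)

static bool ccw_addressable(uint64_t gpa, uint64_t len)
{
    /* The whole object, not just its start, must sit below the limit;
     * written to avoid overflow in gpa + len. */
    return len <= ZONE31_LIMIT && gpa <= ZONE31_LIMIT - len;
}
```

This is why the thread keeps returning to GFP_DMA / ZONE_DMA: a DMA-coherent allocator that cannot be told to satisfy this predicate is unusable for those CCW structures.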
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 2:59 ` David Gibson 2020-02-21 3:41 ` Jason Wang 2020-02-21 13:27 ` Halil Pasic @ 2020-02-21 16:36 ` Christoph Hellwig 2020-02-24 6:50 ` David Gibson 2020-02-24 18:59 ` Halil Pasic 2 siblings, 2 replies; 58+ messages in thread From: Christoph Hellwig @ 2020-02-21 16:36 UTC (permalink / raw) To: David Gibson Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Fri, Feb 21, 2020 at 01:59:15PM +1100, David Gibson wrote: > > Hell no. This is a detail of the platform DMA direct implementation. > > Drivers have no business looking at this flag, and virtio finally needs > > to be fixed to use the DMA API properly for everything but legacy devices. > > So, this patch definitely isn't right as it stands, but I'm struggling > to understand what it is you're saying is the right way. > > By "legacy devices" I assume you mean pre-virtio-1.0 devices, that > lack the F_VERSION_1 feature flag. Is that right? Because I don't > see how being a legacy device or not relates to use of the DMA API. No. "legacy" is anything that does not set F_ACCESS_PLATFORM. > I *think* what you are suggesting here is that virtio devices that > have !F_IOMMU_PLATFORM should have their dma_ops set up so that the > DMA API treats IOVA==PA, which will satisfy what the device expects. > Then the virtio driver can use the DMA API the same way for both > F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices. No. Those should just keep using the existing legacy non-dma ops case and be done with it. No changes to that and most certainly no new features. 
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 16:36 ` Christoph Hellwig @ 2020-02-24 6:50 ` David Gibson 2020-02-24 18:59 ` Halil Pasic 1 sibling, 0 replies; 58+ messages in thread From: David Gibson @ 2020-02-24 6:50 UTC (permalink / raw) To: Christoph Hellwig Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy On Fri, Feb 21, 2020 at 05:36:45PM +0100, Christoph Hellwig wrote: > On Fri, Feb 21, 2020 at 01:59:15PM +1100, David Gibson wrote: > > > Hell no. This is a detail of the platform DMA direct implementation. > > > Drivers have no business looking at this flag, and virtio finally needs > > > to be fixed to use the DMA API properly for everything but legacy devices. > > > > So, this patch definitely isn't right as it stands, but I'm struggling > > to understand what it is you're saying is the right way. > > > > By "legacy devices" I assume you mean pre-virtio-1.0 devices, that > > lack the F_VERSION_1 feature flag. Is that right? Because I don't > > see how being a legacy device or not relates to use of the DMA API. > > No. "legacy" is anything that does not set F_ACCESS_PLATFORM. Hm, I see. The trouble is I think we can only reasonably call things "legacy" when essentially all currently in use OSes have support for the new, better way of doing things. That is, alas, not really the case for F_ACCESS_PLATFORM. > > I *think* what you are suggesting here is that virtio devices that > > have !F_IOMMU_PLATFORM should have their dma_ops set up so that the > > DMA API treats IOVA==PA, which will satisfy what the device expects. > > Then the virtio driver can use the DMA API the same way for both > > F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices. > > No. 
Those should just keep using the existing legacy non-dma ops > case and be done with it. No changes to that and most certainly > no new features. > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 16:36 ` Christoph Hellwig 2020-02-24 6:50 ` David Gibson @ 2020-02-24 18:59 ` Halil Pasic 1 sibling, 0 replies; 58+ messages in thread From: Halil Pasic @ 2020-02-24 18:59 UTC (permalink / raw) To: Christoph Hellwig Cc: linux-s390, Viktor Mihajlovski, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, David Gibson, Robin Murphy On Fri, 21 Feb 2020 17:36:45 +0100 Christoph Hellwig <hch@lst.de> wrote: > > By "legacy devices" I assume you mean pre-virtio-1.0 devices, that > > lack the F_VERSION_1 feature flag. Is that right? Because I don't > > see how being a legacy device or not relates to use of the DMA API. > > No. "legacy" is anything that does not set F_ACCESS_PLATFORM. FYI in virtio-speak the term 'legacy devices' is already taken and it ain't congruent with your intended semantics. Please look it up: https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-60001 But with that understood, your statement does provide insight into how you see F_ACCESS_PLATFORM. I.e. something that should be enabled in general, except for legacy reasons. But I'm afraid Michael sees it differently: i.e. something that should be enabled when necessary, and otherwise not (because it is likely to cost performance). Regards, Halil
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-20 16:13 ` Christoph Hellwig 2020-02-21 2:59 ` David Gibson @ 2020-02-21 14:33 ` Halil Pasic 2020-02-21 16:39 ` Christoph Hellwig 2020-02-22 19:07 ` Michael S. Tsirkin 1 sibling, 2 replies; 58+ messages in thread From: Halil Pasic @ 2020-02-21 14:33 UTC (permalink / raw) To: Christoph Hellwig Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy On Thu, 20 Feb 2020 17:13:09 +0100 Christoph Hellwig <hch@lst.de> wrote: > On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > index 867c7ebd3f10..fafc8f924955 100644 > > --- a/drivers/virtio/virtio_ring.c > > +++ b/drivers/virtio/virtio_ring.c > > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > > if (!virtio_has_iommu_quirk(vdev)) > > return true; > > > > + if (force_dma_unencrypted(&vdev->dev)) > > + return true; > > Hell no. This is a detail of the platform DMA direct implementation. I beg to differ. If it was a detail of the DMA direct implementation, it should have/would have been private to kernel/dma/direct.c. 
A look at

$ git grep -e force_dma_unencrypted
arch/powerpc/include/asm/mem_encrypt.h:static inline bool force_dma_unencrypted(struct device *dev)
arch/s390/mm/init.c:bool force_dma_unencrypted(struct device *dev)
arch/x86/mm/mem_encrypt.c:bool force_dma_unencrypted(struct device *dev)
include/linux/dma-direct.h:bool force_dma_unencrypted(struct device *dev);
include/linux/dma-direct.h:static inline bool force_dma_unencrypted(struct device *dev)
kernel/dma/direct.c:	if (force_dma_unencrypted(dev))
kernel/dma/direct.c:	if (force_dma_unencrypted(dev))
kernel/dma/direct.c:	    !force_dma_unencrypted(dev)) {
kernel/dma/direct.c:	if (force_dma_unencrypted(dev))
kernel/dma/direct.c:	if (force_dma_unencrypted(dev))
kernel/dma/direct.c:	    !force_dma_unencrypted(dev)) {
kernel/dma/direct.c:	if (force_dma_unencrypted(dev))

tells you that force_dma_unencrypted() is *consumed* by dma-direct, but *provided* by the memory encryption or memory management code. I.e. platform code tells the dma (direct) code what decisions to make under certain circumstances. Consider what we would have to do to make PCI devices do I/O through pages that were shared when the guest is running in a protected VM. The s390_pci_dma_ops would also need to know whether to 'force dma unencrypted' or not, and it's the exact same logic. I doubt simply using DMA direct for zPCI would do, because we still have to do all the Z specific IOMMU management. > Drivers have no business looking at this flag, and virtio finally needs > to be fixed to use the DMA API properly for everything but legacy devices. See the follow-on discussion with David Gibson. In short: I'm in favor of always using the DMA API iff we keep conformance with the VIRTIO spec and if it does not imply any degradations for s390 (virtio-ccw), or any other regressions. > > No amount of desperate hacks is going to fix that fundamental problem, > and I'm starting to get really sick and tired of all the crap patches > published in this area. 
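The provider/consumer split the grep illustrates (arch code provides the policy, dma-direct merely acts on it) can be shown as a standalone two-part sketch. Everything below is a hypothetical userspace model, not kernel code: prot_virt_guest is an invented stand-in for the platform's protection state, and the "shared vs. private" outcome stands in for the set_memory_decrypted()-style work dma-direct would do.

```c
#include <stdbool.h>

/* "Arch side": each platform provides the policy. On s390 this would be
 * keyed off protected-virtualization mode, on x86 off SEV/SME; here it is
 * a settable flag purely for illustration. */
static bool prot_virt_guest;

static bool force_dma_unencrypted_model(void)
{
    return prot_virt_guest;
}

/* "Consumer side": a dma-direct-like allocator decides whether freshly
 * allocated coherent memory must be shared/decrypted before the device
 * (or hypervisor) may touch it. */
enum page_state { PAGE_PRIVATE, PAGE_SHARED };

static enum page_state dma_direct_alloc_state(void)
{
    return force_dma_unencrypted_model() ? PAGE_SHARED : PAGE_PRIVATE;
}
```

The point of contention maps directly onto this split: Halil reads the hook as platform policy that other consumers (his s390_pci_dma_ops example) could legitimately share, while Christoph reads it as an internal contract between arch code and dma-direct only.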
I don't think I deserve the offensive language. AFAIU you have a positive attitude towards the idea, that !F_VIRTIO_PLATFORM implies 'no DMA API is used by virtio' should be scrapped. I would like to accomplish that without adverse effects to virtio-ccw (because caring for virtio-ccw is a part of job description). Regards, Halil
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 14:33 ` Halil Pasic @ 2020-02-21 16:39 ` Christoph Hellwig 2020-02-21 18:16 ` Halil Pasic 2020-02-22 19:07 ` Michael S. Tsirkin 1 sibling, 1 reply; 58+ messages in thread From: Christoph Hellwig @ 2020-02-21 16:39 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote: > > Hell no. This is a detail of the platform DMA direct implementation. > > I beg to differ. If it was a detail of the DMA direct implementation, it > should have/would have been private to kernel/dma/direct.c. It can't, given that platforms have to implement it. It is an arch hook for dma-direct. > Consider what we would have to do to make PCI devices do I/O through > pages that were shared when the guest is running in a protected VM. The > s390_pci_dma_ops would also need to know whether to 'force dma unencrypted' > or not, and it's the exact same logic. I doubt simply using DMA direct > for zPCI would do, because we still have to do all the Z specific IOMMU > management. And your IOMMU can't deal with the encryption bit? In that case we could think of allowing the IOMMU implementation to access it. But the point is that it is an internal detail of the DMA implementation and by no means for drivers.
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 16:39 ` Christoph Hellwig @ 2020-02-21 18:16 ` Halil Pasic 0 siblings, 0 replies; 58+ messages in thread From: Halil Pasic @ 2020-02-21 18:16 UTC (permalink / raw) To: Christoph Hellwig Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy On Fri, 21 Feb 2020 17:39:38 +0100 Christoph Hellwig <hch@lst.de> wrote: > On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote: > > > Hell no. This is a detail of the platform DMA direct implementation. > > > > I beg to differ. If it was a detail of the DMA direct implementation, it > > should have/would have been private to kernel/dma/direct.c. > > It can't, given that platforms have to implement it. It is an arch hook > for dma-direct. > > > Consider what we would have to do to make PCI devices do I/O through > > pages that were shared when the guest is running in a protected VM. The > > s390_pci_dma_ops would also need to know whether to 'force dma unencrypted' > > or not, and it's the exact same logic. I doubt simply using DMA direct > > for zPCI would do, because we still have to do all the Z specific IOMMU > > management. > > And your IOMMU can't deal with the encryption bit? There is no encrypt bit, and our memory is not encrypted, but protected. Means e.g. when a buggy/malicious hypervisor tries to read a protected page it won't get ciphertext, but a slap on its finger. In order to make memory accessible to the hypervisor (or another guest, or a real device) the guest must make a so-called ultravisor call (talk to the firmware) and share the respective page. 
We tapped into the memory encryption infrastructure, because both protect the guest memory from the host (just by different means), and because it made no sense to build up something in parallel when most of the stuff we need was already there. But most unfortunately the names are deceiving when it comes to s390 protected virtualization and its guest I/O enablement. > In that case we > could think of allowing the IOMMU implementation to access it. But the > point is that it is an internal detail of the DMA implementation and by > no means for drivers. From the perspective that any driver that does anything remotely DMAish, that is, lets some external entity (possibly a hypervisor, possibly a channel subsystem, possibly a DMA controller) touch the memory, should do DMA API first, to make sure the DMAish goes well, your argument makes perfect sense. But from that perspective !F_ACCESS_PLATFORM is also DMAish. And the virtio spec mandates that !F_ACCESS_PLATFORM implies GPAs. For virtio-ccw I want GPAs and not IOVAs on s390; for virtio-pci, which we also support in general but not with protected virtualization, well, it's a different story. With protected virtualization, however, I must make sure all I/O goes through shared pages. We use swiotlb for that. But then the old infrastructure won't cut it. Yet we still need GPAs on the ring (with the extra requirement that the page must be shared). DMA API is a nice fit there because we can allocate DMA coherent memory (such that what comes back from our DMA ops is a GPA), so we have shared memory that the hypervisor and the guest are allowed to look at concurrently, and for the buffers that are going to be put on the vring, we can use the streaming API, which uses bounce buffers. The returned IOVA (in DMA API speak) is a GPA of the bounce buffer, and the guest is not allowed to peek until it unmaps, so everything is cozy. 
But for that to work, we all (AMD SEV, power, and s390) must go through the DMA API, because the old infrastructure in virtio core simply won't cut it. And it has nothing to do with the device. David explained it very well. My series is about controlling virtio core's usage of the DMA API. I believe I did it in a way that doesn't hurt any arch at the moment. Maybe the conflict can be resolved if the transport gets a say in whether to use the DMA API or not. In the end the VIRTIO spec does say that "Whether accesses are actually limited or translated is described by platform-specific means." Regards, Halil
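The streaming flow Halil describes (map copies the private data into a shared bounce buffer whose GPA goes on the ring; the guest must not peek until unmap copies the result back) can be modeled in a few lines. This is a hypothetical userspace sketch of swiotlb-style bouncing, with invented names and a single fixed-size buffer; the real swiotlb manages a pool of slots:

```c
#include <string.h>
#include <stddef.h>

#define BOUNCE_SZ 256

/* Model of one shared (hypervisor-visible) bounce buffer page. */
static unsigned char bounce[BOUNCE_SZ];

/* "Map": copy guest-private data into the shared buffer and hand out its
 * address, playing the role of the GPA that would be placed on the vring. */
static unsigned char *model_map_for_device(const void *priv, size_t len)
{
    memcpy(bounce, priv, len);      /* guest-private -> shared page */
    return bounce;                  /* device/hypervisor-visible address */
}

/* "Unmap": copy device-written data back into private memory; the guest
 * must not read the bounce buffer before this point. */
static void model_unmap_from_device(void *priv, size_t len)
{
    memcpy(priv, bounce, len);      /* shared page -> guest-private */
}
```

The model also shows why the ring still carries GPAs in this scheme: the "IOVA" handed back is simply the shared buffer's address, so the !F_ACCESS_PLATFORM contract is preserved while all data still crosses a shared page.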
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 14:33 ` Halil Pasic 2020-02-21 16:39 ` Christoph Hellwig @ 2020-02-22 19:07 ` Michael S. Tsirkin 2020-02-24 17:16 ` Christoph Hellwig 1 sibling, 1 reply; 58+ messages in thread From: Michael S. Tsirkin @ 2020-02-22 19:07 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote: > AFAIU you have a positive attitude towards the idea, that > !F_VIRTIO_PLATFORM implies 'no DMA API is used by virtio' > should be scrapped. > > I would like to accomplish that without adverse effects to virtio-ccw > (because caring for virtio-ccw is a part of job description). > > Regards, > Halil It is possible, in theory. IIRC the main challenge is that DMA API has overhead of indirect function calls even when all it does it return back the PA without changes. So that will lead to a measurable performance degradation. That might be fixable, possibly using some kind of logic along the lines of: if (iova is pa) return pa, else make the indirect call. And for unmapping, we might need an API that says "unmap is a nop, safe to skip" so we don't maintain the dma address until unmap. -- MST
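Michael's "if (iova is pa) return pa else indirect call" amounts to short-circuiting the ops table when the mapping is the identity. A hypothetical sketch of that shape follows; all names are invented, and the +0x1000 offset in the fake IOMMU path exists only so the two paths are distinguishable:

```c
#include <stdint.h>
#include <stdbool.h>

typedef uint64_t dma_addr_t;

/* Indirect-call path: an ops table, as the DMA API historically used. */
struct dma_ops_model {
    dma_addr_t (*map)(void *va);
};

static dma_addr_t iommu_map_model(void *va)
{
    /* Stand-in for a real translation; offsets the address so the
     * translated path is observable in the test. */
    return (dma_addr_t)(uintptr_t)va + 0x1000;
}

static const struct dma_ops_model iommu_ops = { .map = iommu_map_model };

/* Fast path: when the device is known to be direct-mapped, skip the
 * indirect call entirely -- the shape of MST's suggestion. */
static dma_addr_t map_model(const struct dma_ops_model *ops, bool direct,
                            void *va)
{
    if (direct)
        return (dma_addr_t)(uintptr_t)va;   /* IOVA == PA, no indirect call */
    return ops->map(va);                    /* translated, indirect call */
}
```

Christoph's later reply in this thread notes that dma-direct gained exactly this kind of direct-call fast path upstream, which is why he considers the indirect-call overhead argument obsolete.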
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-22 19:07 ` Michael S. Tsirkin @ 2020-02-24 17:16 ` Christoph Hellwig 2020-10-28 14:24 ` Alexander Graf via iommu 0 siblings, 1 reply; 58+ messages in thread From: Christoph Hellwig @ 2020-02-24 17:16 UTC (permalink / raw) To: Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Sat, Feb 22, 2020 at 02:07:58PM -0500, Michael S. Tsirkin wrote: > On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote: > > AFAIU you have a positive attitude towards the idea, that > > !F_VIRTIO_PLATFORM implies 'no DMA API is used by virtio' > > should be scrapped. > > > > I would like to accomplish that without adverse effects to virtio-ccw > > (because caring for virtio-ccw is a part of job description). > > > > Regards, > > Halil > > It is possible, in theory. IIRC the main challenge is that DMA API > has overhead of indirect function calls even when all it > does it return back the PA without changes. That overhead is gone now, the DMA direct calls are direct calls these days.
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-24 17:16 ` Christoph Hellwig @ 2020-10-28 14:24 ` Alexander Graf via iommu 2020-10-28 18:01 ` Michael S. Tsirkin 0 siblings, 1 reply; 58+ messages in thread From: Alexander Graf via iommu @ 2020-10-28 14:24 UTC (permalink / raw) To: Christoph Hellwig, Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy On 24.02.20 18:16, Christoph Hellwig wrote: > On Sat, Feb 22, 2020 at 02:07:58PM -0500, Michael S. Tsirkin wrote: >> On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote: >>> AFAIU you have a positive attitude towards the idea, that >>> !F_VIRTIO_PLATFORM implies 'no DMA API is used by virtio' >>> should be scrapped. >>> >>> I would like to accomplish that without adverse effects to virtio-ccw >>> (because caring for virtio-ccw is a part of job description). >>> >>> Regards, >>> Halil >> >> It is possible, in theory. IIRC the main challenge is that DMA API >> has overhead of indirect function calls even when all it >> does it return back the PA without changes. > > That overhead is gone now, the DMA direct calls are direct calls these > days. Michael, would that mitigate your concerns to just always use the DMA API? If not, wouldn't it make sense to benchmark and point Christoph at the paths that do slow things down, so we can finally move virtio into a world where it doesn't bypass the kernel DMA infrastructure. Alex Amazon Development Center Germany GmbH Krausenstr. 
38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-10-28 14:24 ` Alexander Graf via iommu @ 2020-10-28 18:01 ` Michael S. Tsirkin 0 siblings, 0 replies; 58+ messages in thread From: Michael S. Tsirkin @ 2020-10-28 18:01 UTC (permalink / raw) To: Alexander Graf Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Wed, Oct 28, 2020 at 03:24:03PM +0100, Alexander Graf wrote: > On 24.02.20 18:16, Christoph Hellwig wrote: > > On Sat, Feb 22, 2020 at 02:07:58PM -0500, Michael S. Tsirkin wrote: > > > On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote: > > > > AFAIU you have a positive attitude towards the idea, that > > > > !F_VIRTIO_PLATFORM implies 'no DMA API is used by virtio' > > > > should be scrapped. > > > > > > > > I would like to accomplish that without adverse effects to virtio-ccw > > > > (because caring for virtio-ccw is a part of job description). > > > > > > > > Regards, > > > > Halil > > > > > > It is possible, in theory. IIRC the main challenge is that DMA API > > > has overhead of indirect function calls even when all it > > > does it return back the PA without changes. > > > > That overhead is gone now, the DMA direct calls are direct calls these > > days. > > Michael, would that mitigate your concerns to just always use the DMA API? > If not, wouldn't it make sense to benchmark and pinpoint Christoph to paths > that do slow things down, so we can finally move virtio into a world where > it doesn't bypass the kernel DMA infrastructure. > > > Alex There's no specific concern really. One can in theory move the code handling !VIRTIO_F_ACCESS_PLATFORM such that instead of having a branch in virtio code, you instead override platform DMA API and then use DMA API unconditionally. 
The gain from that will probably be marginal, but maybe I'm wrong. Let's see the patches. We still need to do the override when !VIRTIO_F_ACCESS_PLATFORM, according to spec. Encrypted hypervisors still need to set VIRTIO_F_ACCESS_PLATFORM. Is VIRTIO_F_ACCESS_PLATFORM a good default? The need to support legacy guests does not allow that at the QEMU level, we need to be conservative there. But yes, if you have a very modern guest then it might be a good idea to just always enable VIRTIO_F_ACCESS_PLATFORM. I say might since unfortunately most people only use VIRTIO_F_ACCESS_PLATFORM with a vIOMMU; using it without a vIOMMU is supported but is a path less well trodden. Benchmarking that, fixing issues that surface if any would be imho effort well spent, and would be a prerequisite to making it a default in a production system.
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-20 16:06 ` [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected Halil Pasic 2020-02-20 16:13 ` Christoph Hellwig @ 2020-02-20 20:55 ` Michael S. Tsirkin 2020-02-21 1:17 ` Ram Pai 2020-02-21 13:12 ` Halil Pasic 1 sibling, 2 replies; 58+ messages in thread From: Michael S. Tsirkin @ 2020-02-20 20:55 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: > Currently the advanced guest memory protection technologies (AMD SEV, > powerpc secure guest technology and s390 Protected VMs) abuse the > VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which > is in turn necessary to make IO work with guest memory protection. > > But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a > different beast: with virtio devices whose implementation runs on an SMP > CPU we are still fine with doing all the usual optimizations, it is just > that we need to make sure that the memory protection mechanism does not > get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the > side of the guest (and possibly the host side as well) than we actually > need. > > An additional benefit of teaching the guest to make the right decision > (and use DMA API) on its own is: removing the need to mandate special > VM configuration for guests that may run with protection. This is > especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all > the virtio control structures into the first 2G of guest memory: > something we don't necessarily want to do per-default. 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com> > Tested-by: Ram Pai <linuxram@us.ibm.com> > Tested-by: Michael Mueller <mimu@linux.ibm.com> This might work for you but it's fragile, since without VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets GPA's, not DMA addresses. IOW this looks like another iteration of: virtio: Support encrypted memory on powerpc secure guests which I was under the impression was abandoned as unnecessary. To summarize, the necessary conditions for a hack along these lines (using DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that: - secure guest mode is enabled - so we know that since we don't share most memory regular virtio code won't work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM - DMA API is giving us addresses that are actually also physical addresses - Hypervisor is buggy and didn't enable VIRTIO_F_ACCESS_PLATFORM I don't see how this patch does this. > --- > drivers/virtio/virtio_ring.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index 867c7ebd3f10..fafc8f924955 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > if (!virtio_has_iommu_quirk(vdev)) > return true; > > + if (force_dma_unencrypted(&vdev->dev)) > + return true; > + > /* Otherwise, we are left to guess. */ > /* > * In theory, it's possible to have a buggy QEMU-supposed > -- > 2.17.1 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* RE: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-20 20:55 ` Michael S. Tsirkin @ 2020-02-21 1:17 ` Ram Pai 2020-02-21 3:29 ` David Gibson 2020-02-21 13:12 ` Halil Pasic 1 sibling, 1 reply; 58+ messages in thread From: Ram Pai @ 2020-02-21 1:17 UTC (permalink / raw) To: Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Thu, Feb 20, 2020 at 03:55:14PM -0500, Michael S. Tsirkin wrote: > On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: > > Currently the advanced guest memory protection technologies (AMD SEV, > > powerpc secure guest technology and s390 Protected VMs) abuse the > > VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which > > is in turn necessary, to make IO work with guest memory protection. > > > > But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a > > different beast: with virtio devices whose implementation runs on an SMP > > CPU we are still fine with doing all the usual optimizations, it is just > > that we need to make sure that the memory protection mechanism does not > > get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the > > side of the guest (and possibly he host side as well) than we actually > > need. > > > > An additional benefit of teaching the guest to make the right decision > > (and use DMA API) on it's own is: removing the need, to mandate special > > VM configuration for guests that may run with protection. This is > > especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all > > the virtio control structures into the first 2G of guest memory: > > something we don't necessarily want to do per-default. 
> > > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com> > > Tested-by: Ram Pai <linuxram@us.ibm.com> > > Tested-by: Michael Mueller <mimu@linux.ibm.com> > > This might work for you but it's fragile, since without > VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets > GPA's, not DMA addresses. > > > > IOW this looks like another iteration of: > > virtio: Support encrypted memory on powerpc secure guests > > which I was under the impression was abandoned as unnecessary. It has been abandoned on powerpc. We enabled the VIRTIO_F_ACCESS_PLATFORM flag by default on powerpc. We would like to enable secure guests on powerpc without this flag enabled as well, but past experience has taught us that it's not an easy path. However, if Halil makes some inroads on this path for s390, we would like to support him. RP _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 1:17 ` Ram Pai @ 2020-02-21 3:29 ` David Gibson 0 siblings, 0 replies; 58+ messages in thread From: David Gibson @ 2020-02-21 3:29 UTC (permalink / raw) To: Ram Pai Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig [-- Attachment #1.1: Type: text/plain, Size: 2599 bytes --] On Thu, Feb 20, 2020 at 05:17:48PM -0800, Ram Pai wrote: > On Thu, Feb 20, 2020 at 03:55:14PM -0500, Michael S. Tsirkin wrote: > > On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: > > > Currently the advanced guest memory protection technologies (AMD SEV, > > > powerpc secure guest technology and s390 Protected VMs) abuse the > > > VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which > > > is in turn necessary, to make IO work with guest memory protection. > > > > > > But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a > > > different beast: with virtio devices whose implementation runs on an SMP > > > CPU we are still fine with doing all the usual optimizations, it is just > > > that we need to make sure that the memory protection mechanism does not > > > get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the > > > side of the guest (and possibly he host side as well) than we actually > > > need. > > > > > > An additional benefit of teaching the guest to make the right decision > > > (and use DMA API) on it's own is: removing the need, to mandate special > > > VM configuration for guests that may run with protection. This is > > > especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all > > > the virtio control structures into the first 2G of guest memory: > > > something we don't necessarily want to do per-default. 
> > > > > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com> > > > Tested-by: Ram Pai <linuxram@us.ibm.com> > > > Tested-by: Michael Mueller <mimu@linux.ibm.com> > > > > This might work for you but it's fragile, since without > > VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets > > GPA's, not DMA addresses. > > > > > > > > IOW this looks like another iteration of: > > > > virtio: Support encrypted memory on powerpc secure guests > > > > which I was under the impression was abandoned as unnecessary. > > It has been abondoned on powerpc. We enabled VIRTIO_F_ACCESS_PLATFORM; > by default, flag on powerpc. Uh... we haven't yet, though we're working on it. > We would like to enable secure guests on powerpc without this flag > aswell enabled, but past experience has educated us that its not a easy > path. However if Halil makes some inroads in this path for s390, we > will like to support him. > > > RP > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] [-- Attachment #2: Type: text/plain, Size: 156 bytes --] _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-20 20:55 ` Michael S. Tsirkin 2020-02-21 1:17 ` Ram Pai @ 2020-02-21 13:12 ` Halil Pasic 2020-02-21 15:39 ` Tom Lendacky 2020-02-21 15:56 ` Michael S. Tsirkin 1 sibling, 2 replies; 58+ messages in thread From: Halil Pasic @ 2020-02-21 13:12 UTC (permalink / raw) To: Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Thu, 20 Feb 2020 15:55:14 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: > > Currently the advanced guest memory protection technologies (AMD SEV, > > powerpc secure guest technology and s390 Protected VMs) abuse the > > VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which > > is in turn necessary, to make IO work with guest memory protection. > > > > But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a > > different beast: with virtio devices whose implementation runs on an SMP > > CPU we are still fine with doing all the usual optimizations, it is just > > that we need to make sure that the memory protection mechanism does not > > get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the > > side of the guest (and possibly he host side as well) than we actually > > need. > > > > An additional benefit of teaching the guest to make the right decision > > (and use DMA API) on it's own is: removing the need, to mandate special > > VM configuration for guests that may run with protection. This is > > especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all > > the virtio control structures into the first 2G of guest memory: > > something we don't necessarily want to do per-default. 
> > > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com> > > Tested-by: Ram Pai <linuxram@us.ibm.com> > > Tested-by: Michael Mueller <mimu@linux.ibm.com> > > This might work for you but it's fragile, since without > VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets > GPA's, not DMA addresses. > Thanks for your constructive approach. I do want the hypervisor to assume it gets GPAs. My train of thought was that the guests that need to use IOVAs that are not GPAs when force_dma_unencrypted() applies will have to specify VIRTIO_F_ACCESS_PLATFORM (at the device) anyway, because otherwise it won't work. But I see your point: in case of a misconfiguration, and provided the DMA API returns IOVAs, one could end up trying to touch the wrong memory locations. But this should be similar to what would happen if DMA ops are not used, and memory is not made accessible. > > > IOW this looks like another iteration of: > > virtio: Support encrypted memory on powerpc secure guests > > which I was under the impression was abandoned as unnecessary. Unnecessary for powerpc because they do normal PCI. In the context of CCW there are only guest physical addresses (CCW I/O has no concept of IOMMU or IOVAs). > > > To summarize, the necessary conditions for a hack along these lines > (using DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that: > > - secure guest mode is enabled - so we know that since we don't share > most memory regular virtio code won't > work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM force_dma_unencrypted(&vdev->dev) is IMHO exactly about this. > - DMA API is giving us addresses that are actually also physical > addresses In case of s390 this is given. I talked with the power people before posting this, and they assured me they are willing to deal with this. I was hoping to talk about this with the AMD SEV people here (hence the cc). > - Hypervisor is buggy and didn't enable VIRTIO_F_ACCESS_PLATFORM > I don't get this point.
The argument that the hypervisor is buggy is a bit hard to follow for me. If the hypervisor is buggy, we have already lost most of the time anyway, haven't we? > I don't see how this patch does this. I do get your point. I don't know of a good way to check that the DMA API is giving us addresses that are actually physical addresses, and the situation you describe definitely has some risk to it. Let me comment on other ideas that came up. I would be very happy to go with the best one. Thank you very much. Regards, Halil > > > > --- > > drivers/virtio/virtio_ring.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > index 867c7ebd3f10..fafc8f924955 100644 > > --- a/drivers/virtio/virtio_ring.c > > +++ b/drivers/virtio/virtio_ring.c > > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > > if (!virtio_has_iommu_quirk(vdev)) > > return true; > > > > + if (force_dma_unencrypted(&vdev->dev)) > > + return true; > > + > > /* Otherwise, we are left to guess. */ > > /* > > * In theory, it's possible to have a buggy QEMU-supposed > > -- > > 2.17.1 > _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 13:12 ` Halil Pasic @ 2020-02-21 15:39 ` Tom Lendacky 2020-02-24 6:40 ` David Gibson 2020-02-21 15:56 ` Michael S. Tsirkin 1 sibling, 1 reply; 58+ messages in thread From: Tom Lendacky @ 2020-02-21 15:39 UTC (permalink / raw) To: Halil Pasic, Michael S. Tsirkin Cc: linux-s390, Huang, Wei, Singh, Brijesh, Janosch Frank, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On 2/21/20 7:12 AM, Halil Pasic wrote: > On Thu, 20 Feb 2020 15:55:14 -0500 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > >> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: >>> Currently the advanced guest memory protection technologies (AMD SEV, >>> powerpc secure guest technology and s390 Protected VMs) abuse the >>> VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which >>> is in turn necessary, to make IO work with guest memory protection. >>> >>> But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a >>> different beast: with virtio devices whose implementation runs on an SMP >>> CPU we are still fine with doing all the usual optimizations, it is just >>> that we need to make sure that the memory protection mechanism does not >>> get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the >>> side of the guest (and possibly he host side as well) than we actually >>> need. >>> >>> An additional benefit of teaching the guest to make the right decision >>> (and use DMA API) on it's own is: removing the need, to mandate special >>> VM configuration for guests that may run with protection. This is >>> especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all >>> the virtio control structures into the first 2G of guest memory: >>> something we don't necessarily want to do per-default. 
>>> >>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com> >>> Tested-by: Ram Pai <linuxram@us.ibm.com> >>> Tested-by: Michael Mueller <mimu@linux.ibm.com> >> >> This might work for you but it's fragile, since without >> VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets >> GPA's, not DMA addresses. >> > > Thanks for your constructive approach. I do want the hypervisor to > assume it gets GPA's. My train of thought was that the guys that need > to use IOVA's that are not GPA's when force_dma_unencrypted() will have > to to specify VIRTIO_F_ACCESS_PLATFORM (at the device) anyway, because > otherwise it won't work. But I see your point: in case of a > mis-configuration and provided the DMA API returns IOVA's one could end > up trying to touch wrong memory locations. But this should be similar to > what would happen if DMA ops are not used, and memory is not made accessible. > >> >> >> IOW this looks like another iteration of: >> >> virtio: Support encrypted memory on powerpc secure guests >> >> which I was under the impression was abandoned as unnecessary. > > Unnecessary for powerpc because they do normal PCI. In the context of > CCW there are only guest physical addresses (CCW I/O has no concept of > IOMMU or IOVAs). > >> >> >> To summarize, the necessary conditions for a hack along these lines >> (using DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that: >> >> - secure guest mode is enabled - so we know that since we don't share >> most memory regular virtio code won't >> work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM > > force_dma_unencrypted(&vdev->dev) is IMHO exactly about this. > >> - DMA API is giving us addresses that are actually also physical >> addresses > > In case of s390 this is given. I talked with the power people before > posting this, and they ensured me they can are willing to deal with > this. I was hoping to talk abut this with the AMD SEV people here (hence > the cc). 
Yes, physical addresses are fine for SEV - the key is that the DMA API is used so that an address for unencrypted, or shared, memory is returned. E.g. for a dma_alloc_coherent() call this is an allocation that has had set_memory_decrypted() called or for a dma_map_page() call this is an address from SWIOTLB, which was mapped shared during boot, where the data will be bounce-buffered. We don't currently support an emulated IOMMU in our SEV guest because that would require a lot of support in the driver to make IOMMU data available to the hypervisor (I/O page tables, etc.). We would need hardware support to really make this work easily in the guest. Thanks, Tom > >> - Hypervisor is buggy and didn't enable VIRTIO_F_ACCESS_PLATFORM >> > > I don't get this point. The argument where the hypervisor is buggy is a > bit hard to follow for me. If hypervisor is buggy we have already lost > anyway most of the time, or? > >> I don't see how this patch does this. > > I do get your point. I don't know of a good way to check that DMA API > is giving us addresses that are actually physical addresses, and the > situation you describe definitely has some risk to it. > > Let me comment on other ideas that came up. I would be very happy to go > with the best one. Thank you very much. > > Regards, > Halil > >> >> >>> --- >>> drivers/virtio/virtio_ring.c | 3 +++ >>> 1 file changed, 3 insertions(+) >>> >>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c >>> index 867c7ebd3f10..fafc8f924955 100644 >>> --- a/drivers/virtio/virtio_ring.c >>> +++ b/drivers/virtio/virtio_ring.c >>> @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev) >>> if (!virtio_has_iommu_quirk(vdev)) >>> return true; >>> >>> + if (force_dma_unencrypted(&vdev->dev)) >>> + return true; >>> + >>> /* Otherwise, we are left to guess. 
*/ >>> /* >>> * In theory, it's possible to have a buggy QEMU-supposed >>> -- >>> 2.17.1 >> > _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 15:39 ` Tom Lendacky @ 2020-02-24 6:40 ` David Gibson 0 siblings, 0 replies; 58+ messages in thread From: David Gibson @ 2020-02-24 6:40 UTC (permalink / raw) To: Tom Lendacky Cc: linux-s390, Huang, Wei, Singh, Brijesh, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig [-- Attachment #1.1: Type: text/plain, Size: 4981 bytes --] On Fri, Feb 21, 2020 at 09:39:38AM -0600, Tom Lendacky wrote: > On 2/21/20 7:12 AM, Halil Pasic wrote: > > On Thu, 20 Feb 2020 15:55:14 -0500 > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > >> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: > >>> Currently the advanced guest memory protection technologies (AMD SEV, > >>> powerpc secure guest technology and s390 Protected VMs) abuse the > >>> VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which > >>> is in turn necessary, to make IO work with guest memory protection. > >>> > >>> But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a > >>> different beast: with virtio devices whose implementation runs on an SMP > >>> CPU we are still fine with doing all the usual optimizations, it is just > >>> that we need to make sure that the memory protection mechanism does not > >>> get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the > >>> side of the guest (and possibly he host side as well) than we actually > >>> need. > >>> > >>> An additional benefit of teaching the guest to make the right decision > >>> (and use DMA API) on it's own is: removing the need, to mandate special > >>> VM configuration for guests that may run with protection. 
This is > >>> especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all > >>> the virtio control structures into the first 2G of guest memory: > >>> something we don't necessarily want to do per-default. > >>> > >>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com> > >>> Tested-by: Ram Pai <linuxram@us.ibm.com> > >>> Tested-by: Michael Mueller <mimu@linux.ibm.com> > >> > >> This might work for you but it's fragile, since without > >> VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets > >> GPA's, not DMA addresses. > >> > > > > Thanks for your constructive approach. I do want the hypervisor to > > assume it gets GPA's. My train of thought was that the guys that need > > to use IOVA's that are not GPA's when force_dma_unencrypted() will have > > to to specify VIRTIO_F_ACCESS_PLATFORM (at the device) anyway, because > > otherwise it won't work. But I see your point: in case of a > > mis-configuration and provided the DMA API returns IOVA's one could end > > up trying to touch wrong memory locations. But this should be similar to > > what would happen if DMA ops are not used, and memory is not made accessible. > > > >> > >> > >> IOW this looks like another iteration of: > >> > >> virtio: Support encrypted memory on powerpc secure guests > >> > >> which I was under the impression was abandoned as unnecessary. > > > > Unnecessary for powerpc because they do normal PCI. In the context of > > CCW there are only guest physical addresses (CCW I/O has no concept of > > IOMMU or IOVAs). > > > >> > >> > >> To summarize, the necessary conditions for a hack along these lines > >> (using DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that: > >> > >> - secure guest mode is enabled - so we know that since we don't share > >> most memory regular virtio code won't > >> work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM > > > > force_dma_unencrypted(&vdev->dev) is IMHO exactly about this. 
> > > >> - DMA API is giving us addresses that are actually also physical > >> addresses > > > > In case of s390 this is given. I talked with the power people before > > posting this, and they ensured me they can are willing to deal with > > this. I was hoping to talk abut this with the AMD SEV people here (hence > > the cc). > > Yes, physical addresses are fine for SEV - the key is that the DMA API is > used so that an address for unencrypted, or shared, memory is returned. > E.g. for a dma_alloc_coherent() call this is an allocation that has had > set_memory_decrypted() called or for a dma_map_page() call this is an > address from SWIOTLB, which was mapped shared during boot, where the data > will be bounce-buffered. > > We don't currently support an emulated IOMMU in our SEV guest because that > would require a lot of support in the driver to make IOMMU data available > to the hypervisor (I/O page tables, etc.). We would need hardware support > to really make this work easily in the guest. A tangent here: note that on POWER our IOMMU is paravirtualized (updated with hypercalls), it's also always enabled. For that reason we can and do combine vIOMMU translation with the need for bounce buffering for secure guests. (We generally statically configure the vIOMMU to have a huge window which just maps GPAs 1-to-1, which means we can still use dma-direct, but the vIOMMU is still there from the platform point of view) -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] [-- Attachment #2: Type: text/plain, Size: 156 bytes --] _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 13:12 ` Halil Pasic 2020-02-21 15:39 ` Tom Lendacky @ 2020-02-21 15:56 ` Michael S. Tsirkin 2020-02-21 16:35 ` Christoph Hellwig 2020-02-21 18:03 ` Halil Pasic 1 sibling, 2 replies; 58+ messages in thread From: Michael S. Tsirkin @ 2020-02-21 15:56 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Fri, Feb 21, 2020 at 02:12:30PM +0100, Halil Pasic wrote: > On Thu, 20 Feb 2020 15:55:14 -0500 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote: > > > Currently the advanced guest memory protection technologies (AMD SEV, > > > powerpc secure guest technology and s390 Protected VMs) abuse the > > > VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which > > > is in turn necessary, to make IO work with guest memory protection. > > > > > > But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a > > > different beast: with virtio devices whose implementation runs on an SMP > > > CPU we are still fine with doing all the usual optimizations, it is just > > > that we need to make sure that the memory protection mechanism does not > > > get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the > > > side of the guest (and possibly he host side as well) than we actually > > > need. > > > > > > An additional benefit of teaching the guest to make the right decision > > > (and use DMA API) on it's own is: removing the need, to mandate special > > > VM configuration for guests that may run with protection. 
This is > > > especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all > > > the virtio control structures into the first 2G of guest memory: > > > something we don't necessarily want to do per-default. > > > > > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com> > > > Tested-by: Ram Pai <linuxram@us.ibm.com> > > > Tested-by: Michael Mueller <mimu@linux.ibm.com> > > > > This might work for you but it's fragile, since without > > VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets > > GPA's, not DMA addresses. > > > > Thanks for your constructive approach. I do want the hypervisor to > assume it gets GPA's. My train of thought was that the guys that need > to use IOVA's that are not GPA's when force_dma_unencrypted() will have > to to specify VIRTIO_F_ACCESS_PLATFORM (at the device) anyway, because > otherwise it won't work. But I see your point: in case of a > mis-configuration and provided the DMA API returns IOVA's one could end > up trying to touch wrong memory locations. But this should be similar to > what would happen if DMA ops are not used, and memory is not made accessible. > > > > > > > IOW this looks like another iteration of: > > > > virtio: Support encrypted memory on powerpc secure guests > > > > which I was under the impression was abandoned as unnecessary. > > Unnecessary for powerpc because they do normal PCI. In the context of > CCW there are only guest physical addresses (CCW I/O has no concept of > IOMMU or IOVAs). > > > > > > > To summarize, the necessary conditions for a hack along these lines > > (using DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that: > > > > - secure guest mode is enabled - so we know that since we don't share > > most memory regular virtio code won't > > work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM > > force_dma_unencrypted(&vdev->dev) is IMHO exactly about this. 
> > > - DMA API is giving us addresses that are actually also physical > > addresses > > In case of s390 this is given. > I talked with the power people before > posting this, and they ensured me they can are willing to deal with > this. I was hoping to talk abut this with the AMD SEV people here (hence > the cc). We'd need a part of DMA API that promises this though. Platform maintainers aren't going to go out of their way to do the right thing just for virtio, and I can't track all arches to make sure they don't violate virtio requirements. > > > - Hypervisor is buggy and didn't enable VIRTIO_F_ACCESS_PLATFORM > > > > I don't get this point. The argument where the hypervisor is buggy is a > bit hard to follow for me. If hypervisor is buggy we have already lost > anyway most of the time, or? If VIRTIO_F_ACCESS_PLATFORM is set then things just work. If VIRTIO_F_ACCESS_PLATFORM is clear device is supposed to have access to all of memory. You can argue in various ways but it's easier to just declare a behaviour that violates this a bug. Which might still be worth working around, for various reasons. > > I don't see how this patch does this. > > I do get your point. I don't know of a good way to check that DMA API > is giving us addresses that are actually physical addresses, and the > situation you describe definitely has some risk to it. One way would be to extend the DMA API with such an API. Another would be to make virtio always use DMA API and hide the logic in there. This second approach is not easy, in particular since DMA API adds a bunch of overhead which we need to find ways to measure and mitigate. > > Let me comment on other ideas that came up. I would be very happy to go > with the best one. Thank you very much. 
> > Regards, > Halil > > > > > > > > --- > > > drivers/virtio/virtio_ring.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > > index 867c7ebd3f10..fafc8f924955 100644 > > > --- a/drivers/virtio/virtio_ring.c > > > +++ b/drivers/virtio/virtio_ring.c > > > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > > > if (!virtio_has_iommu_quirk(vdev)) > > > return true; > > > > > > + if (force_dma_unencrypted(&vdev->dev)) > > > + return true; > > > + > > > /* Otherwise, we are left to guess. */ > > > /* > > > * In theory, it's possible to have a buggy QEMU-supposed > > > -- > > > 2.17.1 > > _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 15:56 ` Michael S. Tsirkin @ 2020-02-21 16:35 ` Christoph Hellwig 2020-02-21 18:03 ` Halil Pasic 1 sibling, 0 replies; 58+ messages in thread From: Christoph Hellwig @ 2020-02-21 16:35 UTC (permalink / raw) To: Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Fri, Feb 21, 2020 at 10:56:45AM -0500, Michael S. Tsirkin wrote: > > > - DMA API is giving us addresses that are actually also physical > > > addresses > > > > In case of s390 this is given. > > I talked with the power people before > > posting this, and they ensured me they can are willing to deal with > > this. I was hoping to talk abut this with the AMD SEV people here (hence > > the cc). > > We'd need a part of DMA API that promises this though. Platform > maintainers aren't going to go out of their way to do the > right thing just for virtio, and I can't track all arches > to make sure they don't violate virtio requirements. There is no way the DMA API is going to promise you that. All kinds of platforms have offsets to DMA, and they are communicated through interfaces like the device tree or ACPI. But the good thing is that your hypervisor is in control of that; neither the DMA API nor virtio has any business interfering with it. In fact, for hardware virtio devices we might need such an offset for them to work on lots of SOCs. _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected 2020-02-21 15:56 ` Michael S. Tsirkin 2020-02-21 16:35 ` Christoph Hellwig @ 2020-02-21 18:03 ` Halil Pasic 1 sibling, 0 replies; 58+ messages in thread From: Halil Pasic @ 2020-02-21 18:03 UTC (permalink / raw) To: Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Fri, 21 Feb 2020 10:56:45 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Fri, Feb 21, 2020 at 02:12:30PM +0100, Halil Pasic wrote: > > On Thu, 20 Feb 2020 15:55:14 -0500 > > "Michael S. Tsirkin" <mst@redhat.com> wrote: [..] > > > To summarize, the necessary conditions for a hack along these lines > > > (using DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that: > > > > > > - secure guest mode is enabled - so we know that since we don't share > > > most memory regular virtio code won't > > > work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM > > > > force_dma_unencrypted(&vdev->dev) is IMHO exactly about this. > > > > > - DMA API is giving us addresses that are actually also physical > > > addresses > > > > In case of s390 this is given. > > I talked with the power people before > > posting this, and they ensured me they can are willing to deal with > > this. I was hoping to talk abut this with the AMD SEV people here (hence > > the cc). > > We'd need a part of DMA API that promises this though. Platform > maintainers aren't going to go out of their way to do the > right thing just for virtio, and I can't track all arches > to make sure they don't violate virtio requirements. > One would have to track only the arches that have force_dma_unencrypted(), and generally speaking the arches should make sure the DMA ops are suitable, whatever that means in the given context. 
> > > > > - Hypervisor is buggy and didn't enable VIRTIO_F_ACCESS_PLATFORM > > > > > > > I don't get this point. The argument where the hypervisor is buggy is a > > bit hard to follow for me. If hypervisor is buggy we have already lost > > anyway most of the time, or? > > If VIRTIO_F_ACCESS_PLATFORM is set then things just work. If > VIRTIO_F_ACCESS_PLATFORM is clear device is supposed to have access to > all of memory. You can argue in various ways but it's easier to just > declare a behaviour that violates this a bug. Which might still be worth > working around, for various reasons. > I don't agree. The spec explicitly states "If this feature bit is set to 0, then the device has same access to memory addresses supplied to it as the driver has." That ain't any guest memory, but the addresses supplied to it. BTW this is how channel I/O works as well: the channel program authorizes the device to use the memory locations specified by the channel program, for as long as the channel program is executed. That's how channel I/O does isolation without an architected IOMMU. Can you please show me the words in the specification that say, the device has to have access to the entire guest memory all the time? Maybe I have to understand all the intentions behind VIRTIO_F_ACCESS_PLATFORM better. I've read the spec several times, but I still have the feeling, when we discuss, that I didn't get it right. IOMMUs and PCI style DMA are unfortunately not my bread and butter. I only know that the devices do not need any new device capability (I assume VIRTIO_F_ACCESS_PLATFORM does express a device capability), to work with protected virtualization. Unless one defines VIRTIO_F_ACCESS_PLATFORM is the capability that the device won't poke *arbitrary* guest memory. From that perspective mandating the flag feels wrong. (CCW devices are never allowed to poke arbitrary memory.) Yet, if VIRTIO_F_ACCESS_PLATFORM is a flag that every nice and modern virtio device should have (i.e. 
a lack of it means kinda broken), then I have the feeling virtio-ccw should probably evolve in the direction that having VIRTIO_F_ACCESS_PLATFORM set does not hurt. I have to think some more. > > > > I don't see how this patch does this. > > > > I do get your point. I don't know of a good way to check that DMA API > > is giving us addresses that are actually physical addresses, and the > > situation you describe definitely has some risk to it. > > One way would be to extend the DMA API with such an API. Seems Christoph does not like that idea. > > Another would be to make virtio always use DMA API > and hide the logic in there. I thought Christoph wants that, but I was wrong. > This second approach is not easy, in particular since DMA API adds > a bunch of overhead which we need to find ways to > measure and mitigate. > Agreed. From the s390 perspective, I think it ain't too bad if we get GFP_DMA. Many thanks for your patience! Halil [..]
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Halil Pasic 2020-02-20 16:06 ` [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h Halil Pasic 2020-02-20 16:06 ` [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected Halil Pasic @ 2020-02-20 20:48 ` Michael S. Tsirkin 2020-02-20 21:29 ` Michael S. Tsirkin ` (2 subsequent siblings) 5 siblings, 0 replies; 58+ messages in thread From: Michael S. Tsirkin @ 2020-02-20 20:48 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote: > Currently if one intends to run a memory protection enabled VM with > virtio devices and linux as the guest OS, one needs to specify the > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest > linux use the DMA API, which in turn handles the memory > encryption/protection stuff if the guest decides to turn itself into > a protected one. This however makes no sense due to multiple reasons: > * The device is not changed by the fact that the guest RAM is > protected. The so called IOMMU bypass quirk is not affected. > * This usage is not congruent with standardised semantics of > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason > for using DMA API in virtio (orthogonal with respect to what is > expressed by VIRTIO_F_IOMMU_PLATFORM). > > This series aims to decouple 'have to use DMA API because my (guest) RAM > is protected' and 'have to use DMA API because the device told me > VIRTIO_F_IOMMU_PLATFORM'. > > Please find more detailed explanations about the conceptual aspects in > the individual patches. 
There is however also a very practical problem > that is addressed by this series. > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side > effect The vhost code assumes it the addresses on the virtio descriptor > ring are not guest physical addresses but iova's, and insists on doing a > translation of these regardless of what transport is used (e.g. whether > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b > "vhost: new device IOTLB API".) On s390 this results in severe > performance degradation (c.a. factor 10). BTW with ccw I/O there is > (architecturally) no IOMMU, so the whole address translation makes no > sense in the context of virtio-ccw. That's just a QEMU thing. It uses the same flag for VIRTIO_F_ACCESS_PLATFORM and vhost IOTLB. QEMU can separate them, no need to change linux. > Halil Pasic (2): > mm: move force_dma_unencrypted() to mem_encrypt.h > virtio: let virtio use DMA API when guest RAM is protected > > drivers/virtio/virtio_ring.c | 3 +++ > include/linux/dma-direct.h | 9 --------- > include/linux/mem_encrypt.h | 10 ++++++++++ > 3 files changed, 13 insertions(+), 9 deletions(-) > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2 > -- > 2.17.1
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Halil Pasic ` (2 preceding siblings ...) 2020-02-20 20:48 ` [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Michael S. Tsirkin @ 2020-02-20 21:29 ` Michael S. Tsirkin 2020-02-21 13:37 ` Halil Pasic 2020-02-20 21:33 ` Michael S. Tsirkin 2020-02-21 6:22 ` Jason Wang 5 siblings, 1 reply; 58+ messages in thread From: Michael S. Tsirkin @ 2020-02-20 21:29 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote: > * This usage is not congruent with standardised semantics of > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason > for using DMA API in virtio (orthogonal with respect to what is > expressed by VIRTIO_F_IOMMU_PLATFORM). Quoting the spec: \item[VIRTIO_F_ACCESS_PLATFORM(33)] This feature indicates that the device can be used on a platform where device access to data in memory is limited and/or translated. E.g. this is the case if the device can be located behind an IOMMU that translates bus addresses from the device into physical addresses in memory, if the device can be limited to only access certain memory addresses or if special commands such as a cache flush can be needed to synchronise data in memory with the device. Whether accesses are actually limited or translated is described by platform-specific means. If this feature bit is set to 0, then the device has same access to memory addresses supplied to it as the driver has. 
In particular, the device will always use physical addresses matching addresses used by the driver (typically meaning physical addresses used by the CPU) and not translated further, and can access any address supplied to it by the driver. When clear, this overrides any platform-specific description of whether device access is limited or translated in any way, e.g. whether an IOMMU may be present. Since the device can't access encrypted memory, this seems to match your case reasonably well. -- MST
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-20 21:29 ` Michael S. Tsirkin @ 2020-02-21 13:37 ` Halil Pasic 0 siblings, 0 replies; 58+ messages in thread From: Halil Pasic @ 2020-02-21 13:37 UTC (permalink / raw) To: Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Thu, 20 Feb 2020 16:29:50 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote: > > * This usage is not congruent with standardised semantics of > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason > > for using DMA API in virtio (orthogonal with respect to what is > > expressed by VIRTIO_F_IOMMU_PLATFORM). > > Quoting the spec: > > \item[VIRTIO_F_ACCESS_PLATFORM(33)] This feature indicates that > the device can be used on a platform where device access to data > in memory is limited and/or translated. E.g. this is the case if the device can be located > behind an IOMMU that translates bus addresses from the device into physical > addresses in memory, if the device can be limited to only access > certain memory addresses or if special commands such as > a cache flush can be needed to synchronise data in memory with > the device. Whether accesses are actually limited or translated > is described by platform-specific means. > If this feature bit is set to 0, then the device > has same access to memory addresses supplied to it as the > driver has. > In particular, the device will always use physical addresses > matching addresses used by the driver (typically meaning > physical addresses used by the CPU) > and not translated further, and can access any address supplied to it by > the driver. 
When clear, this overrides any platform-specific description of > whether device access is limited or translated in any way, e.g. > whether an IOMMU may be present. > > since device can't access encrypted memory, > this seems to match your case reasonably well. > As David already explained, the device does not have to access encrypted memory. In fact, we don't have memory encryption but memory protection on s390. All the device *should* ever see is non-protected memory (one that was previously shared by the guest). Our protected guests start as non-protected ones, and may or may not turn protected during their life-span. From the device perspective, really, nothing changes. I believe David explained this much better than I did. Regards, Halil
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Halil Pasic ` (3 preceding siblings ...) 2020-02-20 21:29 ` Michael S. Tsirkin @ 2020-02-20 21:33 ` Michael S. Tsirkin 2020-02-21 13:49 ` Halil Pasic 2020-02-21 16:41 ` Christoph Hellwig 2020-02-21 6:22 ` Jason Wang 5 siblings, 2 replies; 58+ messages in thread From: Michael S. Tsirkin @ 2020-02-20 21:33 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote: > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side > effect The vhost code assumes it the addresses on the virtio descriptor > ring are not guest physical addresses but iova's, and insists on doing a > translation of these regardless of what transport is used (e.g. whether > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b > "vhost: new device IOTLB API".) On s390 this results in severe > performance degradation (c.a. factor 10). BTW with ccw I/O there is > (architecturally) no IOMMU, so the whole address translation makes no > sense in the context of virtio-ccw. So it sounds like a host issue: the emulation of s390 is unnecessarily complicated. Working around it in the guest looks wrong ... 
> Halil Pasic (2): > mm: move force_dma_unencrypted() to mem_encrypt.h > virtio: let virtio use DMA API when guest RAM is protected > > drivers/virtio/virtio_ring.c | 3 +++ > include/linux/dma-direct.h | 9 --------- > include/linux/mem_encrypt.h | 10 ++++++++++ > 3 files changed, 13 insertions(+), 9 deletions(-) > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2 > -- > 2.17.1
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-20 21:33 ` Michael S. Tsirkin @ 2020-02-21 13:49 ` Halil Pasic 2020-02-21 16:41 ` Christoph Hellwig 1 sibling, 0 replies; 58+ messages in thread From: Halil Pasic @ 2020-02-21 13:49 UTC (permalink / raw) To: Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Thu, 20 Feb 2020 16:33:35 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote: > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side > > effect The vhost code assumes it the addresses on the virtio descriptor > > ring are not guest physical addresses but iova's, and insists on doing a > > translation of these regardless of what transport is used (e.g. whether > > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b > > "vhost: new device IOTLB API".) On s390 this results in severe > > performance degradation (c.a. factor 10). BTW with ccw I/O there is > > (architecturally) no IOMMU, so the whole address translation makes no > > sense in the context of virtio-ccw. > > So it sounds like a host issue: the emulation of s390 unnecessarily complicated. > Working around it by the guest looks wrong ... While I do think that forcing IOMMU_PLATFORM on devices that do not want or need it, just to trigger DMA API usage in the guest, is conceptually wrong, I do agree that we might have a host issue. Namely, unlike PCI, CCW does not have an IOMMU, and the spec clearly states that "Whether accesses are actually limited or translated is described by platform-specific means.". With CCW devices we don't have address translation by IOMMU, and in that sense vhost is probably wrong about trying to do the translation. 
That is why I mentioned the patch, and that it's done regardless of the transport/platform. Regards, Halil > > > Halil Pasic (2): > > mm: move force_dma_unencrypted() to mem_encrypt.h > > virtio: let virtio use DMA API when guest RAM is protected > > > > drivers/virtio/virtio_ring.c | 3 +++ > > include/linux/dma-direct.h | 9 --------- > > include/linux/mem_encrypt.h | 10 ++++++++++ > > 3 files changed, 13 insertions(+), 9 deletions(-) > > > > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2 > > -- > > 2.17.1 >
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-20 21:33 ` Michael S. Tsirkin 2020-02-21 13:49 ` Halil Pasic @ 2020-02-21 16:41 ` Christoph Hellwig 2020-02-24 5:44 ` David Gibson 1 sibling, 1 reply; 58+ messages in thread From: Christoph Hellwig @ 2020-02-21 16:41 UTC (permalink / raw) To: Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Thu, Feb 20, 2020 at 04:33:35PM -0500, Michael S. Tsirkin wrote: > So it sounds like a host issue: the emulation of s390 unnecessarily complicated. > Working around it by the guest looks wrong ... Yes. If your host (and I don't care if you split hypervisor, ultravisor and megavisor out in your implementation) wants to support a VM architecture where the host can't access all guest memory you need to ensure the DMA API is used. Extra points for simply always setting the flag and thus future-proofing the scheme.
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-21 16:41 ` Christoph Hellwig @ 2020-02-24 5:44 ` David Gibson 0 siblings, 0 replies; 58+ messages in thread From: David Gibson @ 2020-02-24 5:44 UTC (permalink / raw) To: Christoph Hellwig Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy On Fri, Feb 21, 2020 at 05:41:51PM +0100, Christoph Hellwig wrote: > On Thu, Feb 20, 2020 at 04:33:35PM -0500, Michael S. Tsirkin wrote: > > So it sounds like a host issue: the emulation of s390 unnecessarily complicated. > > Working around it by the guest looks wrong ... > > Yes. If your host (and I don't care if you split hypervisor, > ultravisor and megavisor out in your implementation) wants to > support a VM architecture where the host can't access all guest > memory you need to ensure the DMA API is used. Extra points for > simply always setting the flag and thus future proofing the scheme. Moving towards F_ACCESS_PLATFORM everywhere is a good idea (for other reasons), but that doesn't make the problem as it exists right now go away. But, "you need to ensure the DMA API is used" makes no sense from the host point of view. The existence of the DMA API is an entirely guest-side, Linux-specific detail; the host can't make decisions based on that. For POWER - possibly s390 as well - the hypervisor has no way of knowing at machine construction time whether it will be an old kernel (or non-Linux OS) which can't support F_ACCESS_PLATFORM, or a guest which will enter secure mode and therefore requires F_ACCESS_PLATFORM (according to you). That's the fundamental problem here. 
The normal virtio model of features that the guest can optionally accept would work nicely here - except that that wouldn't work for the case of hardware virtio devices, where the access limitations come from "host" (platform) side and therefore can't be disabled by that host. We really do have two cases here: 1) access restrictions originating with the host/platform (e.g. hardware virtio) and 2) access restrictions originating with the guest (e.g. secure VMs). What we need to do to deal with them is basically the same at the driver level, but it has subtle and important differences at the platform level. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Halil Pasic ` (4 preceding siblings ...) 2020-02-20 21:33 ` Michael S. Tsirkin @ 2020-02-21 6:22 ` Jason Wang 2020-02-21 14:56 ` Halil Pasic 5 siblings, 1 reply; 58+ messages in thread From: Jason Wang @ 2020-02-21 6:22 UTC (permalink / raw) To: Halil Pasic, Michael S. Tsirkin, Marek Szyprowski, Robin Murphy, Christoph Hellwig Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, Michael Mueller, Viktor Mihajlovski, David Gibson On 2020/2/21 上午12:06, Halil Pasic wrote: > Currently if one intends to run a memory protection enabled VM with > virtio devices and linux as the guest OS, one needs to specify the > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest > linux use the DMA API, which in turn handles the memory > encryption/protection stuff if the guest decides to turn itself into > a protected one. This however makes no sense due to multiple reasons: > * The device is not changed by the fact that the guest RAM is > protected. The so called IOMMU bypass quirk is not affected. > * This usage is not congruent with standardised semantics of > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason > for using DMA API in virtio (orthogonal with respect to what is > expressed by VIRTIO_F_IOMMU_PLATFORM). > > This series aims to decouple 'have to use DMA API because my (guest) RAM > is protected' and 'have to use DMA API because the device told me > VIRTIO_F_IOMMU_PLATFORM'. > > Please find more detailed explanations about the conceptual aspects in > the individual patches. There is however also a very practical problem > that is addressed by this series. 
> > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side > effect The vhost code assumes it the addresses on the virtio descriptor > ring are not guest physical addresses but iova's, and insists on doing a > translation of these regardless of what transport is used (e.g. whether > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b > "vhost: new device IOTLB API".) On s390 this results in severe > performance degradation (c.a. factor 10). Do you see a consistent degradation in performance, or does it only happen at the beginning of the test? > BTW with ccw I/O there is > (architecturally) no IOMMU, so the whole address translation makes no > sense in the context of virtio-ccw. I suspect we can do an optimization on the QEMU side. E.g. send memtable entries via the IOTLB API when vIOMMU is not enabled. If this makes sense, I can draft a patch to see if there's any difference. Thanks > > Halil Pasic (2): > mm: move force_dma_unencrypted() to mem_encrypt.h > virtio: let virtio use DMA API when guest RAM is protected > > drivers/virtio/virtio_ring.c | 3 +++ > include/linux/dma-direct.h | 9 --------- > include/linux/mem_encrypt.h | 10 ++++++++++ > 3 files changed, 13 insertions(+), 9 deletions(-) > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-21 6:22 ` Jason Wang @ 2020-02-21 14:56 ` Halil Pasic 2020-02-24 3:38 ` David Gibson 2020-02-24 4:01 ` Jason Wang 0 siblings, 2 replies; 58+ messages in thread From: Halil Pasic @ 2020-02-21 14:56 UTC (permalink / raw) To: Jason Wang Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Fri, 21 Feb 2020 14:22:26 +0800 Jason Wang <jasowang@redhat.com> wrote: > > On 2020/2/21 上午12:06, Halil Pasic wrote: > > Currently if one intends to run a memory protection enabled VM with > > virtio devices and linux as the guest OS, one needs to specify the > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest > > linux use the DMA API, which in turn handles the memory > > encryption/protection stuff if the guest decides to turn itself into > > a protected one. This however makes no sense due to multiple reasons: > > * The device is not changed by the fact that the guest RAM is > > protected. The so called IOMMU bypass quirk is not affected. > > * This usage is not congruent with standardised semantics of > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason > > for using DMA API in virtio (orthogonal with respect to what is > > expressed by VIRTIO_F_IOMMU_PLATFORM). > > > > This series aims to decouple 'have to use DMA API because my (guest) RAM > > is protected' and 'have to use DMA API because the device told me > > VIRTIO_F_IOMMU_PLATFORM'. > > > > Please find more detailed explanations about the conceptual aspects in > > the individual patches. There is however also a very practical problem > > that is addressed by this series. 
> > > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side > > effect The vhost code assumes it the addresses on the virtio descriptor > > ring are not guest physical addresses but iova's, and insists on doing a > > translation of these regardless of what transport is used (e.g. whether > > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b > > "vhost: new device IOTLB API".) On s390 this results in severe > > performance degradation (c.a. factor 10). > > > Do you see a consistent degradation on the performance, or it only > happen when for during the beginning of the test? > AFAIK the degradation is consistent. > > > BTW with ccw I/O there is > > (architecturally) no IOMMU, so the whole address translation makes no > > sense in the context of virtio-ccw. > > > I suspect we can do optimization in qemu side. > > E.g send memtable entry via IOTLB API when vIOMMU is not enabled. > > If this makes sense, I can draft patch to see if there's any difference. Frankly I would prefer to avoid IOVAs on the descriptor ring (and the then necessary translation) for virtio-ccw altogether. But Michael voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices that could be used with guests running in protected mode. I don't share his opinion, but that's an ongoing discussion. Should we end up having to do translation from IOVA in vhost, we are very interested in that translation being fast and efficient. In that sense we would be very happy to test any optimization that aims in that direction. Thank you very much for your input! 
Regards, Halil > > Thanks > > > > > > Halil Pasic (2): > > mm: move force_dma_unencrypted() to mem_encrypt.h > > virtio: let virtio use DMA API when guest RAM is protected > > > > drivers/virtio/virtio_ring.c | 3 +++ > > include/linux/dma-direct.h | 9 --------- > > include/linux/mem_encrypt.h | 10 ++++++++++ > > 3 files changed, 13 insertions(+), 9 deletions(-) > > > > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2 >
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-21 14:56 ` Halil Pasic @ 2020-02-24 3:38 ` David Gibson 2020-02-24 4:01 ` Jason Wang 1 sibling, 0 replies; 58+ messages in thread From: David Gibson @ 2020-02-24 3:38 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig [-- Attachment #1.1: Type: text/plain, Size: 4045 bytes --] On Fri, Feb 21, 2020 at 03:56:02PM +0100, Halil Pasic wrote: > On Fri, 21 Feb 2020 14:22:26 +0800 > Jason Wang <jasowang@redhat.com> wrote: > > > > > On 2020/2/21 上午12:06, Halil Pasic wrote: > > > Currently if one intends to run a memory protection enabled VM with > > > virtio devices and linux as the guest OS, one needs to specify the > > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest > > > linux use the DMA API, which in turn handles the memory > > > encryption/protection stuff if the guest decides to turn itself into > > > a protected one. This however makes no sense due to multiple reasons: > > > * The device is not changed by the fact that the guest RAM is > > > protected. The so called IOMMU bypass quirk is not affected. > > > * This usage is not congruent with standardised semantics of > > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason > > > for using DMA API in virtio (orthogonal with respect to what is > > > expressed by VIRTIO_F_IOMMU_PLATFORM). > > > > > > This series aims to decouple 'have to use DMA API because my (guest) RAM > > > is protected' and 'have to use DMA API because the device told me > > > VIRTIO_F_IOMMU_PLATFORM'. > > > > > > Please find more detailed explanations about the conceptual aspects in > > > the individual patches. 
There is however also a very practical problem > > > that is addressed by this series. > > > > > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side > > > effect The vhost code assumes it the addresses on the virtio descriptor > > > ring are not guest physical addresses but iova's, and insists on doing a > > > translation of these regardless of what transport is used (e.g. whether > > > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b > > > "vhost: new device IOTLB API".) On s390 this results in severe > > > performance degradation (c.a. factor 10). > > > > > > Do you see a consistent degradation on the performance, or it only > > happen when for during the beginning of the test? > > > > AFAIK the degradation is consistent. > > > > > > BTW with ccw I/O there is > > > (architecturally) no IOMMU, so the whole address translation makes no > > > sense in the context of virtio-ccw. > > > > > > I suspect we can do optimization in qemu side. > > > > E.g send memtable entry via IOTLB API when vIOMMU is not enabled. > > > > If this makes sense, I can draft patch to see if there's any difference. > > Frankly I would prefer to avoid IOVAs on the descriptor ring (and the > then necessary translation) for virtio-ccw altogether. But Michael > voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices > that could be used with guests running in protected mode. I don't share > his opinion, but that's an ongoing discussion. I'm a bit confused by this. For the ccw specific case, F_ACCESS_PLATFORM shouldn't have any impact: for you, IOVA == GPA so everything is easy. > Should we end up having to do translation from IOVA in vhost, we are > very interested in that translation being fast and efficient. > > In that sense we would be very happy to test any optimization that aim > into that direction. > > Thank you very much for your input! 
> > Regards, > Halil > > > > > Thanks > > > > > > > > > > Halil Pasic (2): > > > mm: move force_dma_unencrypted() to mem_encrypt.h > > > virtio: let virtio use DMA API when guest RAM is protected > > > > > > drivers/virtio/virtio_ring.c | 3 +++ > > > include/linux/dma-direct.h | 9 --------- > > > include/linux/mem_encrypt.h | 10 ++++++++++ > > > 3 files changed, 13 insertions(+), 9 deletions(-) > > > > > > > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2 > > > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-21 14:56 ` Halil Pasic 2020-02-24 3:38 ` David Gibson @ 2020-02-24 4:01 ` Jason Wang 2020-02-24 6:06 ` Michael S. Tsirkin 1 sibling, 1 reply; 58+ messages in thread From: Jason Wang @ 2020-02-24 4:01 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig [-- Attachment #1: Type: text/plain, Size: 3663 bytes --] On 2020/2/21 下午10:56, Halil Pasic wrote: > On Fri, 21 Feb 2020 14:22:26 +0800 > Jason Wang <jasowang@redhat.com> wrote: > >> On 2020/2/21 上午12:06, Halil Pasic wrote: >>> Currently if one intends to run a memory protection enabled VM with >>> virtio devices and linux as the guest OS, one needs to specify the >>> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest >>> linux use the DMA API, which in turn handles the memory >>> encryption/protection stuff if the guest decides to turn itself into >>> a protected one. This however makes no sense due to multiple reasons: >>> * The device is not changed by the fact that the guest RAM is >>> protected. The so called IOMMU bypass quirk is not affected. >>> * This usage is not congruent with standardised semantics of >>> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason >>> for using DMA API in virtio (orthogonal with respect to what is >>> expressed by VIRTIO_F_IOMMU_PLATFORM). >>> >>> This series aims to decouple 'have to use DMA API because my (guest) RAM >>> is protected' and 'have to use DMA API because the device told me >>> VIRTIO_F_IOMMU_PLATFORM'. >>> >>> Please find more detailed explanations about the conceptual aspects in >>> the individual patches. There is however also a very practical problem >>> that is addressed by this series. 
>>> >>> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side >>> effect The vhost code assumes it the addresses on the virtio descriptor >>> ring are not guest physical addresses but iova's, and insists on doing a >>> translation of these regardless of what transport is used (e.g. whether >>> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b >>> "vhost: new device IOTLB API".) On s390 this results in severe >>> performance degradation (c.a. factor 10). >> >> Do you see a consistent degradation on the performance, or it only >> happen when for during the beginning of the test? >> > AFAIK the degradation is consistent. > >>> BTW with ccw I/O there is >>> (architecturally) no IOMMU, so the whole address translation makes no >>> sense in the context of virtio-ccw. >> >> I suspect we can do optimization in qemu side. >> >> E.g send memtable entry via IOTLB API when vIOMMU is not enabled. >> >> If this makes sense, I can draft patch to see if there's any difference. > Frankly I would prefer to avoid IOVAs on the descriptor ring (and the > then necessary translation) for virtio-ccw altogether. But Michael > voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices > that could be used with guests running in protected mode. I don't share > his opinion, but that's an ongoing discussion. > > Should we end up having to do translation from IOVA in vhost, we are > very interested in that translation being fast and efficient. > > In that sense we would be very happy to test any optimization that aim > into that direction. > > Thank you very much for your input! Using IOTLB API on platform without IOMMU support is not intended. Please try the attached patch to see if it helps. 
Thanks > > Regards, > Halil > >> Thanks >> >> >>> Halil Pasic (2): >>> mm: move force_dma_unencrypted() to mem_encrypt.h >>> virtio: let virtio use DMA API when guest RAM is protected >>> >>> drivers/virtio/virtio_ring.c | 3 +++ >>> include/linux/dma-direct.h | 9 --------- >>> include/linux/mem_encrypt.h | 10 ++++++++++ >>> 3 files changed, 13 insertions(+), 9 deletions(-) >>> >>> >>> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2 [-- Attachment #2: 0001-virtio-turn-on-IOMMU_PLATFORM-properly.patch --] [-- Type: text/x-patch, Size: 1710 bytes --] From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001 From: Jason Wang <jasowang@redhat.com> Date: Mon, 24 Feb 2020 12:00:10 +0800 Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly When transport does not support IOMMU, we should clear IOMMU_PLATFORM even if the device and vhost claims to support that. This help to avoid the performance overhead caused by unnecessary IOTLB miss/update transactions on such platform. 
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/virtio-bus.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index d6332d45c3..2741b9fdd2 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
     VirtioBusState *bus = VIRTIO_BUS(qbus);
     VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
     VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
-    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
     Error *local_err = NULL;
 
     DPRINTF("%s: plug device.\n", qbus->name);
@@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
         return;
     }
 
-    if (klass->get_dma_as != NULL && has_iommu) {
-        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
+    if (false && klass->get_dma_as != NULL &&
+        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
         vdev->dma_as = klass->get_dma_as(qbus->parent);
     } else {
+        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
         vdev->dma_as = &address_space_memory;
     }
 }
-- 
2.19.1
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-24 4:01 ` Jason Wang @ 2020-02-24 6:06 ` Michael S. Tsirkin 2020-02-24 6:45 ` Jason Wang 0 siblings, 1 reply; 58+ messages in thread From: Michael S. Tsirkin @ 2020-02-24 6:06 UTC (permalink / raw) To: Jason Wang Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote: > > On 2020/2/21 下午10:56, Halil Pasic wrote: > > On Fri, 21 Feb 2020 14:22:26 +0800 > > Jason Wang <jasowang@redhat.com> wrote: > > > > > On 2020/2/21 上午12:06, Halil Pasic wrote: > > > > Currently if one intends to run a memory protection enabled VM with > > > > virtio devices and linux as the guest OS, one needs to specify the > > > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest > > > > linux use the DMA API, which in turn handles the memory > > > > encryption/protection stuff if the guest decides to turn itself into > > > > a protected one. This however makes no sense due to multiple reasons: > > > > * The device is not changed by the fact that the guest RAM is > > > > protected. The so called IOMMU bypass quirk is not affected. > > > > * This usage is not congruent with standardised semantics of > > > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason > > > > for using DMA API in virtio (orthogonal with respect to what is > > > > expressed by VIRTIO_F_IOMMU_PLATFORM). > > > > > > > > This series aims to decouple 'have to use DMA API because my (guest) RAM > > > > is protected' and 'have to use DMA API because the device told me > > > > VIRTIO_F_IOMMU_PLATFORM'. > > > > > > > > Please find more detailed explanations about the conceptual aspects in > > > > the individual patches. 
There is however also a very practical problem > > > > that is addressed by this series. > > > > > > > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side > > > > effect The vhost code assumes it the addresses on the virtio descriptor > > > > ring are not guest physical addresses but iova's, and insists on doing a > > > > translation of these regardless of what transport is used (e.g. whether > > > > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b > > > > "vhost: new device IOTLB API".) On s390 this results in severe > > > > performance degradation (c.a. factor 10). > > > > > > Do you see a consistent degradation on the performance, or it only > > > happen when for during the beginning of the test? > > > > > AFAIK the degradation is consistent. > > > > > > BTW with ccw I/O there is > > > > (architecturally) no IOMMU, so the whole address translation makes no > > > > sense in the context of virtio-ccw. > > > > > > I suspect we can do optimization in qemu side. > > > > > > E.g send memtable entry via IOTLB API when vIOMMU is not enabled. > > > > > > If this makes sense, I can draft patch to see if there's any difference. > > Frankly I would prefer to avoid IOVAs on the descriptor ring (and the > > then necessary translation) for virtio-ccw altogether. But Michael > > voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices > > that could be used with guests running in protected mode. I don't share > > his opinion, but that's an ongoing discussion. > > > > Should we end up having to do translation from IOVA in vhost, we are > > very interested in that translation being fast and efficient. > > > > In that sense we would be very happy to test any optimization that aim > > into that direction. > > > > Thank you very much for your input! > > > Using IOTLB API on platform without IOMMU support is not intended. Please > try the attached patch to see if it helps. 
> > Thanks > > > > > > Regards, > > Halil > > > > > Thanks > > > > > > > > > > Halil Pasic (2): > > > > mm: move force_dma_unencrypted() to mem_encrypt.h > > > > virtio: let virtio use DMA API when guest RAM is protected > > > > > > > > drivers/virtio/virtio_ring.c | 3 +++ > > > > include/linux/dma-direct.h | 9 --------- > > > > include/linux/mem_encrypt.h | 10 ++++++++++ > > > > 3 files changed, 13 insertions(+), 9 deletions(-) > > > > > > > > > > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2 > >From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001 > From: Jason Wang <jasowang@redhat.com> > Date: Mon, 24 Feb 2020 12:00:10 +0800 > Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly > > When transport does not support IOMMU, we should clear IOMMU_PLATFORM > even if the device and vhost claims to support that. This help to > avoid the performance overhead caused by unnecessary IOTLB miss/update > transactions on such platform. > > Signed-off-by: Jason Wang <jasowang@redhat.com> > --- > hw/virtio/virtio-bus.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c > index d6332d45c3..2741b9fdd2 100644 > --- a/hw/virtio/virtio-bus.c > +++ b/hw/virtio/virtio-bus.c > @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) > VirtioBusState *bus = VIRTIO_BUS(qbus); > VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus); > VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev); > - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > Error *local_err = NULL; > > DPRINTF("%s: plug device.\n", qbus->name); > @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) > return; > } > > - if (klass->get_dma_as != NULL && has_iommu) { > - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); > + if (false && klass->get_dma_as != NULL && > + virtio_host_has_feature(vdev, 
VIRTIO_F_IOMMU_PLATFORM)) { > vdev->dma_as = klass->get_dma_as(qbus->parent); > } else { > + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); > vdev->dma_as = &address_space_memory; > } > } This seems to clear it unconditionally. I guess it's just a debugging patch, the real one will come later? > -- > 2.19.1 >
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-24 6:06 ` Michael S. Tsirkin @ 2020-02-24 6:45 ` Jason Wang 2020-02-24 7:48 ` Michael S. Tsirkin 0 siblings, 1 reply; 58+ messages in thread From: Jason Wang @ 2020-02-24 6:45 UTC (permalink / raw) To: Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig [-- Attachment #1: Type: text/plain, Size: 6009 bytes --] On 2020/2/24 下午2:06, Michael S. Tsirkin wrote: > On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote: >> On 2020/2/21 下午10:56, Halil Pasic wrote: >>> On Fri, 21 Feb 2020 14:22:26 +0800 >>> Jason Wang <jasowang@redhat.com> wrote: >>> >>>> On 2020/2/21 上午12:06, Halil Pasic wrote: >>>>> Currently if one intends to run a memory protection enabled VM with >>>>> virtio devices and linux as the guest OS, one needs to specify the >>>>> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest >>>>> linux use the DMA API, which in turn handles the memory >>>>> encryption/protection stuff if the guest decides to turn itself into >>>>> a protected one. This however makes no sense due to multiple reasons: >>>>> * The device is not changed by the fact that the guest RAM is >>>>> protected. The so called IOMMU bypass quirk is not affected. >>>>> * This usage is not congruent with standardised semantics of >>>>> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason >>>>> for using DMA API in virtio (orthogonal with respect to what is >>>>> expressed by VIRTIO_F_IOMMU_PLATFORM). >>>>> >>>>> This series aims to decouple 'have to use DMA API because my (guest) RAM >>>>> is protected' and 'have to use DMA API because the device told me >>>>> VIRTIO_F_IOMMU_PLATFORM'. 
>>>>> >>>>> Please find more detailed explanations about the conceptual aspects in >>>>> the individual patches. There is however also a very practical problem >>>>> that is addressed by this series. >>>>> >>>>> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side >>>>> effect The vhost code assumes it the addresses on the virtio descriptor >>>>> ring are not guest physical addresses but iova's, and insists on doing a >>>>> translation of these regardless of what transport is used (e.g. whether >>>>> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b >>>>> "vhost: new device IOTLB API".) On s390 this results in severe >>>>> performance degradation (c.a. factor 10). >>>> Do you see a consistent degradation on the performance, or it only >>>> happen when for during the beginning of the test? >>>> >>> AFAIK the degradation is consistent. >>> >>>>> BTW with ccw I/O there is >>>>> (architecturally) no IOMMU, so the whole address translation makes no >>>>> sense in the context of virtio-ccw. >>>> I suspect we can do optimization in qemu side. >>>> >>>> E.g send memtable entry via IOTLB API when vIOMMU is not enabled. >>>> >>>> If this makes sense, I can draft patch to see if there's any difference. >>> Frankly I would prefer to avoid IOVAs on the descriptor ring (and the >>> then necessary translation) for virtio-ccw altogether. But Michael >>> voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices >>> that could be used with guests running in protected mode. I don't share >>> his opinion, but that's an ongoing discussion. >>> >>> Should we end up having to do translation from IOVA in vhost, we are >>> very interested in that translation being fast and efficient. >>> >>> In that sense we would be very happy to test any optimization that aim >>> into that direction. >>> >>> Thank you very much for your input! >> >> Using IOTLB API on platform without IOMMU support is not intended. 
Please >> try the attached patch to see if it helps. >> >> Thanks >> >> >>> Regards, >>> Halil >>> >>>> Thanks >>>> >>>> >>>>> Halil Pasic (2): >>>>> mm: move force_dma_unencrypted() to mem_encrypt.h >>>>> virtio: let virtio use DMA API when guest RAM is protected >>>>> >>>>> drivers/virtio/virtio_ring.c | 3 +++ >>>>> include/linux/dma-direct.h | 9 --------- >>>>> include/linux/mem_encrypt.h | 10 ++++++++++ >>>>> 3 files changed, 13 insertions(+), 9 deletions(-) >>>>> >>>>> >>>>> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2 >> >From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001 >> From: Jason Wang <jasowang@redhat.com> >> Date: Mon, 24 Feb 2020 12:00:10 +0800 >> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly >> >> When transport does not support IOMMU, we should clear IOMMU_PLATFORM >> even if the device and vhost claims to support that. This help to >> avoid the performance overhead caused by unnecessary IOTLB miss/update >> transactions on such platform. 
>> >> Signed-off-by: Jason Wang <jasowang@redhat.com> >> --- >> hw/virtio/virtio-bus.c | 6 +++--- >> 1 file changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c >> index d6332d45c3..2741b9fdd2 100644 >> --- a/hw/virtio/virtio-bus.c >> +++ b/hw/virtio/virtio-bus.c >> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) >> VirtioBusState *bus = VIRTIO_BUS(qbus); >> VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus); >> VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev); >> - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); >> Error *local_err = NULL; >> >> DPRINTF("%s: plug device.\n", qbus->name); >> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) >> return; >> } >> >> - if (klass->get_dma_as != NULL && has_iommu) { >> - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); >> + if (false && klass->get_dma_as != NULL && >> + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) { >> vdev->dma_as = klass->get_dma_as(qbus->parent); >> } else { >> + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); >> vdev->dma_as = &address_space_memory; >> } >> } > > This seems to clear it unconditionally. I guess it's just a debugging > patch, the real one will come later? My bad, here's the correct one. Thanks > >> -- >> 2.19.1 >> [-- Attachment #2: 0001-virtio-turn-on-IOMMU_PLATFORM-properly.patch --] [-- Type: text/x-patch, Size: 1701 bytes --] From b8a8b582f46bb86c7a745b272db7b744779e5cc7 Mon Sep 17 00:00:00 2001 From: Jason Wang <jasowang@redhat.com> Date: Mon, 24 Feb 2020 12:00:10 +0800 Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly When transport does not support IOMMU, we should clear IOMMU_PLATFORM even if the device and vhost claims to support that. This help to avoid the performance overhead caused by unnecessary IOTLB miss/update transactions on such platform. 
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/virtio-bus.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index d6332d45c3..4be64e193e 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
     VirtioBusState *bus = VIRTIO_BUS(qbus);
     VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
     VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
-    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
     Error *local_err = NULL;
 
     DPRINTF("%s: plug device.\n", qbus->name);
@@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
         return;
     }
 
-    if (klass->get_dma_as != NULL && has_iommu) {
-        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
+    if (klass->get_dma_as != NULL &&
+        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
         vdev->dma_as = klass->get_dma_as(qbus->parent);
     } else {
+        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
         vdev->dma_as = &address_space_memory;
     }
 }
-- 
2.19.1
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-24 6:45 ` Jason Wang @ 2020-02-24 7:48 ` Michael S. Tsirkin 2020-02-24 9:26 ` Jason Wang 0 siblings, 1 reply; 58+ messages in thread From: Michael S. Tsirkin @ 2020-02-24 7:48 UTC (permalink / raw) To: Jason Wang Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Mon, Feb 24, 2020 at 02:45:03PM +0800, Jason Wang wrote: > > On 2020/2/24 下午2:06, Michael S. Tsirkin wrote: > > On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote: > > > On 2020/2/21 下午10:56, Halil Pasic wrote: > > > > On Fri, 21 Feb 2020 14:22:26 +0800 > > > > Jason Wang <jasowang@redhat.com> wrote: > > > > > > > > > On 2020/2/21 上午12:06, Halil Pasic wrote: > > > > > > Currently if one intends to run a memory protection enabled VM with > > > > > > virtio devices and linux as the guest OS, one needs to specify the > > > > > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest > > > > > > linux use the DMA API, which in turn handles the memory > > > > > > encryption/protection stuff if the guest decides to turn itself into > > > > > > a protected one. This however makes no sense due to multiple reasons: > > > > > > * The device is not changed by the fact that the guest RAM is > > > > > > protected. The so called IOMMU bypass quirk is not affected. > > > > > > * This usage is not congruent with standardised semantics of > > > > > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason > > > > > > for using DMA API in virtio (orthogonal with respect to what is > > > > > > expressed by VIRTIO_F_IOMMU_PLATFORM). 
> > > > > > > > > > > > This series aims to decouple 'have to use DMA API because my (guest) RAM > > > > > > is protected' and 'have to use DMA API because the device told me > > > > > > VIRTIO_F_IOMMU_PLATFORM'. > > > > > > > > > > > > Please find more detailed explanations about the conceptual aspects in > > > > > > the individual patches. There is however also a very practical problem > > > > > > that is addressed by this series. > > > > > > > > > > > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side > > > > > > effect The vhost code assumes it the addresses on the virtio descriptor > > > > > > ring are not guest physical addresses but iova's, and insists on doing a > > > > > > translation of these regardless of what transport is used (e.g. whether > > > > > > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b > > > > > > "vhost: new device IOTLB API".) On s390 this results in severe > > > > > > performance degradation (c.a. factor 10). > > > > > Do you see a consistent degradation on the performance, or it only > > > > > happen when for during the beginning of the test? > > > > > > > > > AFAIK the degradation is consistent. > > > > > > > > > > BTW with ccw I/O there is > > > > > > (architecturally) no IOMMU, so the whole address translation makes no > > > > > > sense in the context of virtio-ccw. > > > > > I suspect we can do optimization in qemu side. > > > > > > > > > > E.g send memtable entry via IOTLB API when vIOMMU is not enabled. > > > > > > > > > > If this makes sense, I can draft patch to see if there's any difference. > > > > Frankly I would prefer to avoid IOVAs on the descriptor ring (and the > > > > then necessary translation) for virtio-ccw altogether. But Michael > > > > voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices > > > > that could be used with guests running in protected mode. I don't share > > > > his opinion, but that's an ongoing discussion. 
> > > > > > > > Should we end up having to do translation from IOVA in vhost, we are > > > > very interested in that translation being fast and efficient. > > > > > > > > In that sense we would be very happy to test any optimization that aim > > > > into that direction. > > > > > > > > Thank you very much for your input! > > > > > > Using IOTLB API on platform without IOMMU support is not intended. Please > > > try the attached patch to see if it helps. > > > > > > Thanks > > > > > > > > > > Regards, > > > > Halil > > > > > > > > > Thanks > > > > > > > > > > > > > > > > Halil Pasic (2): > > > > > > mm: move force_dma_unencrypted() to mem_encrypt.h > > > > > > virtio: let virtio use DMA API when guest RAM is protected > > > > > > > > > > > > drivers/virtio/virtio_ring.c | 3 +++ > > > > > > include/linux/dma-direct.h | 9 --------- > > > > > > include/linux/mem_encrypt.h | 10 ++++++++++ > > > > > > 3 files changed, 13 insertions(+), 9 deletions(-) > > > > > > > > > > > > > > > > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2 > > > >From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001 > > > From: Jason Wang <jasowang@redhat.com> > > > Date: Mon, 24 Feb 2020 12:00:10 +0800 > > > Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly > > > > > > When transport does not support IOMMU, we should clear IOMMU_PLATFORM > > > even if the device and vhost claims to support that. This help to > > > avoid the performance overhead caused by unnecessary IOTLB miss/update > > > transactions on such platform. 
> > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > > --- > > > hw/virtio/virtio-bus.c | 6 +++--- > > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c > > > index d6332d45c3..2741b9fdd2 100644 > > > --- a/hw/virtio/virtio-bus.c > > > +++ b/hw/virtio/virtio-bus.c > > > @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) > > > VirtioBusState *bus = VIRTIO_BUS(qbus); > > > VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus); > > > VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev); > > > - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > > > Error *local_err = NULL; > > > DPRINTF("%s: plug device.\n", qbus->name); > > > @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) > > > return; > > > } > > > - if (klass->get_dma_as != NULL && has_iommu) { > > > - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); > > > + if (false && klass->get_dma_as != NULL && > > > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) { > > > vdev->dma_as = klass->get_dma_as(qbus->parent); > > > } else { > > > + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); > > > vdev->dma_as = &address_space_memory; > > > } > > > } > > > > This seems to clear it unconditionally. I guess it's just a debugging > > patch, the real one will come later? > > > My bad, here's the correct one. > > Thanks > > > > > > > -- > > > 2.19.1 > > > > >From b8a8b582f46bb86c7a745b272db7b744779e5cc7 Mon Sep 17 00:00:00 2001 > From: Jason Wang <jasowang@redhat.com> > Date: Mon, 24 Feb 2020 12:00:10 +0800 > Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly > > When transport does not support IOMMU, we should clear IOMMU_PLATFORM > even if the device and vhost claims to support that. This help to > avoid the performance overhead caused by unnecessary IOTLB miss/update > transactions on such platform. 
> > Signed-off-by: Jason Wang <jasowang@redhat.com> > --- > hw/virtio/virtio-bus.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c > index d6332d45c3..4be64e193e 100644 > --- a/hw/virtio/virtio-bus.c > +++ b/hw/virtio/virtio-bus.c > @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) > VirtioBusState *bus = VIRTIO_BUS(qbus); > VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus); > VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev); > - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > Error *local_err = NULL; > > DPRINTF("%s: plug device.\n", qbus->name); > @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) > return; > } > > - if (klass->get_dma_as != NULL && has_iommu) { > - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); So it looks like this line is unnecessary, but it's an unrelated cleanup, right? > + if (klass->get_dma_as != NULL && > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) { > vdev->dma_as = klass->get_dma_as(qbus->parent); > } else { > + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); Of course any change like that will have to affect migration compat, etc. Can't we clear the bit when we are sending the features to vhost instead? > vdev->dma_as = &address_space_memory; > } > } > -- > 2.19.1 >
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-24 7:48 ` Michael S. Tsirkin @ 2020-02-24 9:26 ` Jason Wang 2020-02-24 13:40 ` Michael S. Tsirkin 2020-02-24 13:56 ` Halil Pasic 0 siblings, 2 replies; 58+ messages in thread From: Jason Wang @ 2020-02-24 9:26 UTC (permalink / raw) To: Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig [-- Attachment #1: Type: text/plain, Size: 8606 bytes --] On 2020/2/24 下午3:48, Michael S. Tsirkin wrote: > On Mon, Feb 24, 2020 at 02:45:03PM +0800, Jason Wang wrote: >> On 2020/2/24 下午2:06, Michael S. Tsirkin wrote: >>> On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote: >>>> On 2020/2/21 下午10:56, Halil Pasic wrote: >>>>> On Fri, 21 Feb 2020 14:22:26 +0800 >>>>> Jason Wang <jasowang@redhat.com> wrote: >>>>> >>>>>> On 2020/2/21 上午12:06, Halil Pasic wrote: >>>>>>> Currently if one intends to run a memory protection enabled VM with >>>>>>> virtio devices and linux as the guest OS, one needs to specify the >>>>>>> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest >>>>>>> linux use the DMA API, which in turn handles the memory >>>>>>> encryption/protection stuff if the guest decides to turn itself into >>>>>>> a protected one. This however makes no sense due to multiple reasons: >>>>>>> * The device is not changed by the fact that the guest RAM is >>>>>>> protected. The so called IOMMU bypass quirk is not affected. >>>>>>> * This usage is not congruent with standardised semantics of >>>>>>> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason >>>>>>> for using DMA API in virtio (orthogonal with respect to what is >>>>>>> expressed by VIRTIO_F_IOMMU_PLATFORM). 
>>>>>>> >>>>>>> This series aims to decouple 'have to use DMA API because my (guest) RAM >>>>>>> is protected' and 'have to use DMA API because the device told me >>>>>>> VIRTIO_F_IOMMU_PLATFORM'. >>>>>>> >>>>>>> Please find more detailed explanations about the conceptual aspects in >>>>>>> the individual patches. There is however also a very practical problem >>>>>>> that is addressed by this series. >>>>>>> >>>>>>> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side >>>>>>> effect The vhost code assumes it the addresses on the virtio descriptor >>>>>>> ring are not guest physical addresses but iova's, and insists on doing a >>>>>>> translation of these regardless of what transport is used (e.g. whether >>>>>>> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b >>>>>>> "vhost: new device IOTLB API".) On s390 this results in severe >>>>>>> performance degradation (c.a. factor 10). >>>>>> Do you see a consistent degradation on the performance, or it only >>>>>> happen when for during the beginning of the test? >>>>>> >>>>> AFAIK the degradation is consistent. >>>>> >>>>>>> BTW with ccw I/O there is >>>>>>> (architecturally) no IOMMU, so the whole address translation makes no >>>>>>> sense in the context of virtio-ccw. >>>>>> I suspect we can do optimization in qemu side. >>>>>> >>>>>> E.g send memtable entry via IOTLB API when vIOMMU is not enabled. >>>>>> >>>>>> If this makes sense, I can draft patch to see if there's any difference. >>>>> Frankly I would prefer to avoid IOVAs on the descriptor ring (and the >>>>> then necessary translation) for virtio-ccw altogether. But Michael >>>>> voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices >>>>> that could be used with guests running in protected mode. I don't share >>>>> his opinion, but that's an ongoing discussion. 
>>>>> >>>>> Should we end up having to do translation from IOVA in vhost, we are >>>>> very interested in that translation being fast and efficient. >>>>> >>>>> In that sense we would be very happy to test any optimization that aim >>>>> into that direction. >>>>> >>>>> Thank you very much for your input! >>>> Using IOTLB API on platform without IOMMU support is not intended. Please >>>> try the attached patch to see if it helps. >>>> >>>> Thanks >>>> >>>> >>>>> Regards, >>>>> Halil >>>>> >>>>>> Thanks >>>>>> >>>>>> >>>>>>> Halil Pasic (2): >>>>>>> mm: move force_dma_unencrypted() to mem_encrypt.h >>>>>>> virtio: let virtio use DMA API when guest RAM is protected >>>>>>> >>>>>>> drivers/virtio/virtio_ring.c | 3 +++ >>>>>>> include/linux/dma-direct.h | 9 --------- >>>>>>> include/linux/mem_encrypt.h | 10 ++++++++++ >>>>>>> 3 files changed, 13 insertions(+), 9 deletions(-) >>>>>>> >>>>>>> >>>>>>> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2 >>>> >From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001 >>>> From: Jason Wang <jasowang@redhat.com> >>>> Date: Mon, 24 Feb 2020 12:00:10 +0800 >>>> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly >>>> >>>> When transport does not support IOMMU, we should clear IOMMU_PLATFORM >>>> even if the device and vhost claims to support that. This help to >>>> avoid the performance overhead caused by unnecessary IOTLB miss/update >>>> transactions on such platform. 
>>>> >>>> Signed-off-by: Jason Wang <jasowang@redhat.com> >>>> --- >>>> hw/virtio/virtio-bus.c | 6 +++--- >>>> 1 file changed, 3 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c >>>> index d6332d45c3..2741b9fdd2 100644 >>>> --- a/hw/virtio/virtio-bus.c >>>> +++ b/hw/virtio/virtio-bus.c >>>> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) >>>> VirtioBusState *bus = VIRTIO_BUS(qbus); >>>> VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus); >>>> VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev); >>>> - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); >>>> Error *local_err = NULL; >>>> DPRINTF("%s: plug device.\n", qbus->name); >>>> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) >>>> return; >>>> } >>>> - if (klass->get_dma_as != NULL && has_iommu) { >>>> - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); >>>> + if (false && klass->get_dma_as != NULL && >>>> + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) { >>>> vdev->dma_as = klass->get_dma_as(qbus->parent); >>>> } else { >>>> + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); >>>> vdev->dma_as = &address_space_memory; >>>> } >>>> } >>> This seems to clear it unconditionally. I guess it's just a debugging >>> patch, the real one will come later? >> >> My bad, here's the correct one. >> >> Thanks >> >> >>>> -- >>>> 2.19.1 >>>> >> >From b8a8b582f46bb86c7a745b272db7b744779e5cc7 Mon Sep 17 00:00:00 2001 >> From: Jason Wang <jasowang@redhat.com> >> Date: Mon, 24 Feb 2020 12:00:10 +0800 >> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly >> >> When transport does not support IOMMU, we should clear IOMMU_PLATFORM >> even if the device and vhost claims to support that. This help to >> avoid the performance overhead caused by unnecessary IOTLB miss/update >> transactions on such platform. 
>> >> Signed-off-by: Jason Wang <jasowang@redhat.com> >> --- >> hw/virtio/virtio-bus.c | 6 +++--- >> 1 file changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c >> index d6332d45c3..4be64e193e 100644 >> --- a/hw/virtio/virtio-bus.c >> +++ b/hw/virtio/virtio-bus.c >> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) >> VirtioBusState *bus = VIRTIO_BUS(qbus); >> VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus); >> VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev); >> - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); >> Error *local_err = NULL; >> >> DPRINTF("%s: plug device.\n", qbus->name); >> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) >> return; >> } >> >> - if (klass->get_dma_as != NULL && has_iommu) { >> - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); > So it looks like this line is unnecessary, but it's an unrelated > cleanup, right? Yes. > >> + if (klass->get_dma_as != NULL && >> + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) { >> vdev->dma_as = klass->get_dma_as(qbus->parent); >> } else { >> + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); > > Of course any change like that will have to affect migration compat, etc. > Can't we clear the bit when we are sending the features to vhost > instead? That's better. How about attached? Thanks > > >> vdev->dma_as = &address_space_memory; >> } >> } >> -- >> 2.19.1 >> [-- Attachment #2: 0001-vhost-do-not-set-VIRTIO_F_IOMMU_PLATFORM-when-IOMMU-.patch --] [-- Type: text/x-patch, Size: 1641 bytes --] From 3177c5194c729f3056b84c67664c59b9b949bb76 Mon Sep 17 00:00:00 2001 From: Jason Wang <jasowang@redhat.com> Date: Mon, 24 Feb 2020 17:24:14 +0800 Subject: [PATCH] vhost: do not set VIRTIO_F_IOMMU_PLATFORM when IOMMU is not used We enable device IOTLB unconditionally when VIRTIO_F_IOMMU_PLATFORM is negotiated. 
This leads to unnecessary IOTLB miss/update transactions when no IOMMU is used. This patch fixes this. Signed-off-by: Jason Wang <jasowang@redhat.com> --- hw/net/virtio-net.c | 3 +++ hw/virtio/vhost.c | 4 +--- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index 3627bb1717..0d50e8bd34 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -879,6 +879,9 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features) virtio_net_apply_guest_offloads(n); } + if (vdev->dma_as == &address_space_memory) + features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM); + for (i = 0; i < n->max_queues; i++) { NetClientState *nc = qemu_get_subqueue(n->nic, i); diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 9edfadc81d..711b1136f6 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -288,9 +288,7 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size) static int vhost_dev_has_iommu(struct vhost_dev *dev) { - VirtIODevice *vdev = dev->vdev; - - return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); + return virtio_has_feature(dev->acked_features, VIRTIO_F_IOMMU_PLATFORM); } static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr, -- 2.19.1 [-- Attachment #3: Type: text/plain, Size: 156 bytes --] _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-24 9:26 ` Jason Wang @ 2020-02-24 13:40 ` Michael S. Tsirkin 2020-02-25 3:38 ` Jason Wang 2020-02-24 13:56 ` Halil Pasic 1 sibling, 1 reply; 58+ messages in thread From: Michael S. Tsirkin @ 2020-02-24 13:40 UTC (permalink / raw) To: Jason Wang Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Mon, Feb 24, 2020 at 05:26:20PM +0800, Jason Wang wrote: > > On 2020/2/24 下午3:48, Michael S. Tsirkin wrote: > > On Mon, Feb 24, 2020 at 02:45:03PM +0800, Jason Wang wrote: > > > On 2020/2/24 下午2:06, Michael S. Tsirkin wrote: > > > > On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote: > > > > > On 2020/2/21 下午10:56, Halil Pasic wrote: > > > > > > On Fri, 21 Feb 2020 14:22:26 +0800 > > > > > > Jason Wang <jasowang@redhat.com> wrote: > > > > > > > > > > > > > On 2020/2/21 上午12:06, Halil Pasic wrote: > > > > > > > > Currently if one intends to run a memory protection enabled VM with > > > > > > > > virtio devices and linux as the guest OS, one needs to specify the > > > > > > > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest > > > > > > > > linux use the DMA API, which in turn handles the memory > > > > > > > > encryption/protection stuff if the guest decides to turn itself into > > > > > > > > a protected one. This however makes no sense due to multiple reasons: > > > > > > > > * The device is not changed by the fact that the guest RAM is > > > > > > > > protected. The so called IOMMU bypass quirk is not affected. > > > > > > > > * This usage is not congruent with standardised semantics of > > > > > > > > VIRTIO_F_IOMMU_PLATFORM. 
Guest memory protected is an orthogonal reason > > > > > > > > for using DMA API in virtio (orthogonal with respect to what is > > > > > > > > expressed by VIRTIO_F_IOMMU_PLATFORM). > > > > > > > > > > > > > > > > This series aims to decouple 'have to use DMA API because my (guest) RAM > > > > > > > > is protected' and 'have to use DMA API because the device told me > > > > > > > > VIRTIO_F_IOMMU_PLATFORM'. > > > > > > > > > > > > > > > > Please find more detailed explanations about the conceptual aspects in > > > > > > > > the individual patches. There is however also a very practical problem > > > > > > > > that is addressed by this series. > > > > > > > > > > > > > > > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side > > > > > > > > effect The vhost code assumes it the addresses on the virtio descriptor > > > > > > > > ring are not guest physical addresses but iova's, and insists on doing a > > > > > > > > translation of these regardless of what transport is used (e.g. whether > > > > > > > > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b > > > > > > > > "vhost: new device IOTLB API".) On s390 this results in severe > > > > > > > > performance degradation (c.a. factor 10). > > > > > > > Do you see a consistent degradation on the performance, or it only > > > > > > > happen when for during the beginning of the test? > > > > > > > > > > > > > AFAIK the degradation is consistent. > > > > > > > > > > > > > > BTW with ccw I/O there is > > > > > > > > (architecturally) no IOMMU, so the whole address translation makes no > > > > > > > > sense in the context of virtio-ccw. > > > > > > > I suspect we can do optimization in qemu side. > > > > > > > > > > > > > > E.g send memtable entry via IOTLB API when vIOMMU is not enabled. > > > > > > > > > > > > > > If this makes sense, I can draft patch to see if there's any difference. 
> > > > > > Frankly I would prefer to avoid IOVAs on the descriptor ring (and the > > > > > > then necessary translation) for virtio-ccw altogether. But Michael > > > > > > voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices > > > > > > that could be used with guests running in protected mode. I don't share > > > > > > his opinion, but that's an ongoing discussion. > > > > > > > > > > > > Should we end up having to do translation from IOVA in vhost, we are > > > > > > very interested in that translation being fast and efficient. > > > > > > > > > > > > In that sense we would be very happy to test any optimization that aim > > > > > > into that direction. > > > > > > > > > > > > Thank you very much for your input! > > > > > Using IOTLB API on platform without IOMMU support is not intended. Please > > > > > try the attached patch to see if it helps. > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > Regards, > > > > > > Halil > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > > > Halil Pasic (2): > > > > > > > > mm: move force_dma_unencrypted() to mem_encrypt.h > > > > > > > > virtio: let virtio use DMA API when guest RAM is protected > > > > > > > > > > > > > > > > drivers/virtio/virtio_ring.c | 3 +++ > > > > > > > > include/linux/dma-direct.h | 9 --------- > > > > > > > > include/linux/mem_encrypt.h | 10 ++++++++++ > > > > > > > > 3 files changed, 13 insertions(+), 9 deletions(-) > > > > > > > > > > > > > > > > > > > > > > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2 > > > > > >From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001 > > > > > From: Jason Wang <jasowang@redhat.com> > > > > > Date: Mon, 24 Feb 2020 12:00:10 +0800 > > > > > Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly > > > > > > > > > > When transport does not support IOMMU, we should clear IOMMU_PLATFORM > > > > > even if the device and vhost claims to support that. 
This help to > > > > > avoid the performance overhead caused by unnecessary IOTLB miss/update > > > > > transactions on such platform. > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > > > > --- > > > > > hw/virtio/virtio-bus.c | 6 +++--- > > > > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > > > > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c > > > > > index d6332d45c3..2741b9fdd2 100644 > > > > > --- a/hw/virtio/virtio-bus.c > > > > > +++ b/hw/virtio/virtio-bus.c > > > > > @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) > > > > > VirtioBusState *bus = VIRTIO_BUS(qbus); > > > > > VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus); > > > > > VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev); > > > > > - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > > > > > Error *local_err = NULL; > > > > > DPRINTF("%s: plug device.\n", qbus->name); > > > > > @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) > > > > > return; > > > > > } > > > > > - if (klass->get_dma_as != NULL && has_iommu) { > > > > > - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); > > > > > + if (false && klass->get_dma_as != NULL && > > > > > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) { > > > > > vdev->dma_as = klass->get_dma_as(qbus->parent); > > > > > } else { > > > > > + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); > > > > > vdev->dma_as = &address_space_memory; > > > > > } > > > > > } > > > > This seems to clear it unconditionally. I guess it's just a debugging > > > > patch, the real one will come later? > > > > > > My bad, here's the correct one. 
> > > > > > Thanks > > > > > > > > > > > -- > > > > > 2.19.1 > > > > > > > > >From b8a8b582f46bb86c7a745b272db7b744779e5cc7 Mon Sep 17 00:00:00 2001 > > > From: Jason Wang <jasowang@redhat.com> > > > Date: Mon, 24 Feb 2020 12:00:10 +0800 > > > Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly > > > > > > When transport does not support IOMMU, we should clear IOMMU_PLATFORM > > > even if the device and vhost claims to support that. This help to > > > avoid the performance overhead caused by unnecessary IOTLB miss/update > > > transactions on such platform. > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > > --- > > > hw/virtio/virtio-bus.c | 6 +++--- > > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c > > > index d6332d45c3..4be64e193e 100644 > > > --- a/hw/virtio/virtio-bus.c > > > +++ b/hw/virtio/virtio-bus.c > > > @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) > > > VirtioBusState *bus = VIRTIO_BUS(qbus); > > > VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus); > > > VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev); > > > - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > > > Error *local_err = NULL; > > > DPRINTF("%s: plug device.\n", qbus->name); > > > @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp) > > > return; > > > } > > > - if (klass->get_dma_as != NULL && has_iommu) { > > > - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); > > So it looks like this line is unnecessary, but it's an unrelated > > cleanup, right? > > > Yes. 
> > > > > > > + if (klass->get_dma_as != NULL && > > > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) { > > > vdev->dma_as = klass->get_dma_as(qbus->parent); > > > } else { > > > + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM); > > > > Of course any change like that will have to affect migration compat, etc. > > Can't we clear the bit when we are sending the features to vhost > > instead? > > > That's better. > > How about attached? > > Thanks > > > > > > > > > vdev->dma_as = &address_space_memory; > > > } > > > } > > > -- > > > 2.19.1 > > > > >From 3177c5194c729f3056b84c67664c59b9b949bb76 Mon Sep 17 00:00:00 2001 > From: Jason Wang <jasowang@redhat.com> > Date: Mon, 24 Feb 2020 17:24:14 +0800 > Subject: [PATCH] vhost: do not set VIRTIO_F_IOMMU_PLATFORM when IOMMU is not > used > > We enable device IOTLB unconditionally when VIRTIO_F_IOMMU_PLATFORM is > negotiated. This lead unnecessary IOTLB miss/update transactions when > IOMMU is used. This patch fixes this. > > Signed-off-by: Jason Wang <jasowang@redhat.com> > --- > hw/net/virtio-net.c | 3 +++ > hw/virtio/vhost.c | 4 +--- > 2 files changed, 4 insertions(+), 3 deletions(-) > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > index 3627bb1717..0d50e8bd34 100644 > --- a/hw/net/virtio-net.c > +++ b/hw/net/virtio-net.c > @@ -879,6 +879,9 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features) > virtio_net_apply_guest_offloads(n); > } > > + if (vdev->dma_as == &address_space_memory) > + features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM); > + > for (i = 0; i < n->max_queues; i++) { > NetClientState *nc = qemu_get_subqueue(n->nic, i); This pokes at acked features. I think they are also guest visible ... 
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > index 9edfadc81d..711b1136f6 100644 > --- a/hw/virtio/vhost.c > +++ b/hw/virtio/vhost.c > @@ -288,9 +288,7 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size) > > static int vhost_dev_has_iommu(struct vhost_dev *dev) > { > - VirtIODevice *vdev = dev->vdev; > - > - return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > + return virtio_has_feature(dev->acked_features, VIRTIO_F_IOMMU_PLATFORM); > } > > static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr, > -- > 2.19.1 > _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-24 13:40 ` Michael S. Tsirkin @ 2020-02-25 3:38 ` Jason Wang 0 siblings, 0 replies; 58+ messages in thread From: Jason Wang @ 2020-02-25 3:38 UTC (permalink / raw) To: Michael S. Tsirkin Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Halil Pasic, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On 2020/2/24 下午9:40, Michael S. Tsirkin wrote: >> Subject: [PATCH] vhost: do not set VIRTIO_F_IOMMU_PLATFORM when IOMMU is not >> used >> >> We enable device IOTLB unconditionally when VIRTIO_F_IOMMU_PLATFORM is >> negotiated. This lead unnecessary IOTLB miss/update transactions when >> IOMMU is used. This patch fixes this. >> >> Signed-off-by: Jason Wang<jasowang@redhat.com> >> --- >> hw/net/virtio-net.c | 3 +++ >> hw/virtio/vhost.c | 4 +--- >> 2 files changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c >> index 3627bb1717..0d50e8bd34 100644 >> --- a/hw/net/virtio-net.c >> +++ b/hw/net/virtio-net.c >> @@ -879,6 +879,9 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features) >> virtio_net_apply_guest_offloads(n); >> } >> >> + if (vdev->dma_as == &address_space_memory) >> + features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM); >> + >> for (i = 0; i < n->max_queues; i++) { >> NetClientState *nc = qemu_get_subqueue(n->nic, i); > This pokes at acked features. I think they are also > guest visible ... It's the acked features of vhost device, so I guess not? 
E.g. virtio_set_features_nocheck() does:

    val &= vdev->host_features;
    if (k->set_features) {
        k->set_features(vdev, val);
    }
    vdev->guest_features = val;

Thanks

> _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-24 9:26 ` Jason Wang 2020-02-24 13:40 ` Michael S. Tsirkin @ 2020-02-24 13:56 ` Halil Pasic 2020-02-25 3:30 ` Jason Wang 1 sibling, 1 reply; 58+ messages in thread From: Halil Pasic @ 2020-02-24 13:56 UTC (permalink / raw) To: Jason Wang Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On Mon, 24 Feb 2020 17:26:20 +0800 Jason Wang <jasowang@redhat.com> wrote: > That's better. > > How about attached? > > Thanks Thanks Jason! It does avoid the translation overhead in vhost. Tested-by: Halil Pasic <pasic@linux.ibm.com> Regarding the code, you fence it in virtio-net.c, but AFAIU this feature has relevance for other vhost devices as well. E.g. what about vhost user? Would it be the responsibility of each virtio device to fence this on its own? I'm also a bit confused about the semantics of the vhost feature bit F_ACCESS_PLATFORM. What we have specified on virtio level is: """ This feature indicates that the device can be used on a platform where device access to data in memory is limited and/or translated. E.g. this is the case if the device can be located behind an IOMMU that translates bus addresses from the device into physical addresses in memory, if the device can be limited to only access certain memory addresses or if special commands such as a cache flush can be needed to synchronise data in memory with the device. Whether accesses are actually limited or translated is described by platform-specific means. If this feature bit is set to 0, then the device has same access to memory addresses supplied to it as the driver has. 
In particular, the device will always use physical addresses matching addresses used by the driver (typically meaning physical addresses used by the CPU) and not translated further, and can access any address supplied to it by the driver. When clear, this overrides any platform-specific description of whether device access is limited or translated in any way, e.g. whether an IOMMU may be present. """ I read this as saying the addresses may be IOVAs which require IOMMU translation or GPAs which don't. On the vhost level however, it seems that F_IOMMU_PLATFORM means that vhost has to do the translation (via the IOTLB API). Do I understand this correctly? If yes, I believe we should document this properly. BTW I'm still not 100% clear on the purpose and semantics of the F_ACCESS_PLATFORM feature bit. But that is a different problem. Regards, Halil _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM 2020-02-24 13:56 ` Halil Pasic @ 2020-02-25 3:30 ` Jason Wang 0 siblings, 0 replies; 58+ messages in thread From: Jason Wang @ 2020-02-25 3:30 UTC (permalink / raw) To: Halil Pasic Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Cornelia Huck, Ram Pai, linux-kernel, virtualization, Christian Borntraeger, iommu, David Gibson, Michael Mueller, Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig On 2020/2/24 下午9:56, Halil Pasic wrote: > On Mon, 24 Feb 2020 17:26:20 +0800 > Jason Wang <jasowang@redhat.com> wrote: > >> That's better. >> >> How about attached? >> >> Thanks > Thanks Jason! It does avoid the translation overhead in vhost. > > Tested-by: Halil Pasic <pasic@linux.ibm.com> > > Regarding the code, you fence it in virtio-net.c, but AFAIU this feature > has relevance for other vhost devices as well. E.g. what about vhost > user? Would it be the responsibility of each virtio device to fence this > on its own? Yes, it looks to me it's better to do that in virtio_set_features_nocheck() > > I'm also a bit confused about the semantics of the vhost feature bit > F_ACCESS_PLATFORM. What we have specified on virtio level is: > """ > This feature indicates that the device can be used on a platform where > device access to data in memory is limited and/or translated. E.g. this > is the case if the device can be located behind an IOMMU that translates > bus addresses from the device into physical addresses in memory, if the > device can be limited to only access certain memory addresses or if > special commands such as a cache flush can be needed to synchronise data > in memory with the device. Whether accesses are actually limited or > translated is described by platform-specific means. If this feature bit > is set to 0, then the device has same access to memory addresses > supplied to it as the driver has. 
In particular, the device will always > use physical addresses matching addresses used by the driver (typically > meaning physical addresses used by the CPU) and not translated further, > and can access any address supplied to it by the driver. When clear, > this overrides any platform-specific description of whether device > access is limited or translated in any way, e.g. whether an IOMMU may be > present. > """ > > I read this like the addresses may be IOVAs which require > IOMMU translation or GPAs which don't. > > On the vhost level however, it seems that F_IOMMU_PLATFORM means that > vhost has to do the translation (via IOTLB API). Yes. > > Do I understand this correctly? If yes, I believe we should document > this properly. Good point. I think it was probably wrong to tie F_IOMMU_PLATFORM to the IOTLB API. Technically the IOTLB can work with a GPA->HVA mapping. I originally used a dedicated feature bit (you can see that from the commit log), but for some reason Michael tweaked it into a virtio feature bit. I guess that was because at the time there was no way to specify e.g. a backend capability, but now we have VHOST_GET_BACKEND_FEATURES. For now, it is probably too late to fix that other than by documenting it, or we can add support for enabling the IOTLB via new backend features. > > BTW I'm still not 100% on the purpose and semantics of the > F_ACCESS_PLATFORM feature bit. But that is a different problem. Yes, I agree that we should decouple the features that do not belong to the device (protected, encrypted, swiotlb etc.) from F_IOMMU_PLATFORM. But Michael and others have their points as well. Thanks > > Regards, > Halil > _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 58+ messages in thread