iommu.lists.linux-foundation.org archive mirror
* [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM
@ 2020-02-20 16:06 Halil Pasic
  2020-02-20 16:06 ` [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h Halil Pasic
                   ` (5 more replies)
  0 siblings, 6 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-20 16:06 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Marek Szyprowski, Robin Murphy,
	Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, Michael Mueller,
	Viktor Mihajlovski, David Gibson

Currently, if one intends to run a memory-protection-enabled VM with
virtio devices and Linux as the guest OS, one needs to specify the
VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
Linux use the DMA API, which in turn handles the memory
encryption/protection stuff if the guest decides to turn itself into
a protected one. This, however, makes no sense for multiple reasons:
* The device is not changed by the fact that the guest RAM is
protected. The so-called IOMMU bypass quirk is not affected.
* This usage is not congruent with the standardised semantics of
VIRTIO_F_IOMMU_PLATFORM. Guest memory protection is an orthogonal reason
for using the DMA API in virtio (orthogonal with respect to what is
expressed by VIRTIO_F_IOMMU_PLATFORM).
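
For reference, the kind of per-device configuration referred to above
looks like this on the QEMU command line (illustrative only; the device
types and backend names are placeholders):

  qemu-system-s390x ... \
    -device virtio-net-ccw,netdev=n0,iommu_platform=on \
    -device virtio-blk-ccw,drive=d0,iommu_platform=on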

This series aims to decouple 'have to use DMA API because my (guest) RAM
is protected' and 'have to use DMA API because the device told me
VIRTIO_F_IOMMU_PLATFORM'.
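
In code terms, the guest-side decision this series results in boils down
to the following simplified sketch (based on patch 2; the real function
is vring_use_dma_api() in drivers/virtio/virtio_ring.c and has a few
more cases):

  static bool vring_use_dma_api(struct virtio_device *vdev)
  {
          /* The device negotiated VIRTIO_F_IOMMU_PLATFORM: use the DMA API. */
          if (!virtio_has_iommu_quirk(vdev))
                  return true;

          /* New in this series: guest RAM is protected, so the DMA API
           * must handle the memory protection regardless of the device. */
          if (force_dma_unencrypted(&vdev->dev))
                  return true;

          /* Otherwise, we are left to guess. */
          return false;
  }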

Please find more detailed explanations about the conceptual aspects in
the individual patches. There is however also a very practical problem
that is addressed by this series. 

For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
effect: the vhost code assumes that the addresses on the virtio descriptor
ring are not guest physical addresses but IOVAs, and insists on doing a
translation of these regardless of what transport is used (e.g. whether
we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
"vhost: new device IOTLB API".) On s390 this results in severe
performance degradation (ca. a factor of 10). BTW with CCW I/O there is
(architecturally) no IOMMU, so the whole address translation makes no
sense in the context of virtio-ccw.

Halil Pasic (2):
  mm: move force_dma_unencrypted() to mem_encrypt.h
  virtio: let virtio use DMA API when guest RAM is protected

 drivers/virtio/virtio_ring.c |  3 +++
 include/linux/dma-direct.h   |  9 ---------
 include/linux/mem_encrypt.h  | 10 ++++++++++
 3 files changed, 13 insertions(+), 9 deletions(-)


base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
-- 
2.17.1


* [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM Halil Pasic
@ 2020-02-20 16:06 ` Halil Pasic
  2020-02-20 16:11   ` Christoph Hellwig
  2020-02-20 16:06 ` [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected Halil Pasic
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 58+ messages in thread
From: Halil Pasic @ 2020-02-20 16:06 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Marek Szyprowski, Robin Murphy,
	Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, Michael Mueller,
	Viktor Mihajlovski, David Gibson

Currently force_dma_unencrypted() is only used by the direct
implementation of the DMA API, and thus resides in dma-direct.h. But
there is nothing dma-direct specific about it: if one were -- for
whatever reason -- to implement custom DMA ops that have to work in the
encrypted/protected scenarios dma-direct currently deals with, one would
need exactly this kind of information.

More importantly, virtio has to use the DMA API (so that the memory
encryption (x86) or protection (power, s390) is handled) under the very
same circumstances in which force_dma_unencrypted() returns true. Furthermore,
in these cases the reason to go via the DMA API is distinct from
the reason indicated by VIRTIO_F_IOMMU_PLATFORM: we need to
use the DMA API independently of the device's properties with regard to
access to memory. I.e. the addresses in the descriptors are still guest
physical addresses, the device may still be implemented by an SMP CPU,
and thus the device shall use those without any further translation. See
[1].

Let's move force_dma_unencrypted() so that virtio, or other
implementations of DMA ops, can make the right decisions.

[1] https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-4100006
(In the spec VIRTIO_F_IOMMU_PLATFORM is called
VIRTIO_F_ACCESS_PLATFORM).

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Ram Pai <linuxram@us.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
---
 include/linux/dma-direct.h  |  9 ---------
 include/linux/mem_encrypt.h | 10 ++++++++++
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 24b8684aa21d..590b15d881b0 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -26,15 +26,6 @@ static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr)
 }
 #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */
 
-#ifdef CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED
-bool force_dma_unencrypted(struct device *dev);
-#else
-static inline bool force_dma_unencrypted(struct device *dev)
-{
-	return false;
-}
-#endif /* CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED */
-
 /*
  * If memory encryption is supported, phys_to_dma will set the memory encryption
  * bit in the DMA address, and dma_to_phys will clear it.  The raw __phys_to_dma
diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
index 5c4a18a91f89..64a48c4b01a2 100644
--- a/include/linux/mem_encrypt.h
+++ b/include/linux/mem_encrypt.h
@@ -22,6 +22,16 @@ static inline bool mem_encrypt_active(void) { return false; }
 
 #endif	/* CONFIG_ARCH_HAS_MEM_ENCRYPT */
 
+struct device;
+#ifdef CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED
+bool force_dma_unencrypted(struct device *dev);
+#else
+static inline bool force_dma_unencrypted(struct device *dev)
+{
+	return false;
+}
+#endif /* CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED */
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 /*
  * The __sme_set() and __sme_clr() macros are useful for adding or removing
-- 
2.17.1
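
To illustrate the "custom DMA ops" argument from the commit message: with
the declaration in mem_encrypt.h, a hypothetical DMA ops implementation
(not part of this series; my_dma_alloc() and its simplistic error
handling are purely illustrative) could consult the helper like this:

  #include <linux/gfp.h>
  #include <linux/mem_encrypt.h>
  #include <linux/set_memory.h>

  /* Hypothetical allocation path of custom DMA ops: it needs exactly
   * the information force_dma_unencrypted() provides. */
  static void *my_dma_alloc(struct device *dev, size_t size,
                            dma_addr_t *dma_handle, gfp_t gfp,
                            unsigned long attrs)
  {
          struct page *page = alloc_pages(gfp, get_order(size));

          if (!page)
                  return NULL;
          *dma_handle = page_to_phys(page);
          /* Share (decrypt) the buffer with the host if required. */
          if (force_dma_unencrypted(dev))
                  set_memory_decrypted((unsigned long)page_address(page),
                                       1 << get_order(size));
          return page_address(page);
  }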


* [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM Halil Pasic
  2020-02-20 16:06 ` [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h Halil Pasic
@ 2020-02-20 16:06 ` Halil Pasic
  2020-02-20 16:13   ` Christoph Hellwig
  2020-02-20 20:55   ` Michael S. Tsirkin
  2020-02-20 20:48 ` [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Michael S. Tsirkin
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-20 16:06 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Marek Szyprowski, Robin Murphy,
	Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, Michael Mueller,
	Viktor Mihajlovski, David Gibson

Currently the advanced guest memory protection technologies (AMD SEV,
powerpc secure guest technology and s390 Protected VMs) abuse the
VIRTIO_F_IOMMU_PLATFORM flag to make the virtio core use the DMA API, which
is in turn necessary to make I/O work with guest memory protection.

But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
different beast: with virtio devices whose implementation runs on an SMP
CPU we are still fine with doing all the usual optimizations, it is just
that we need to make sure that the memory protection mechanism does not
get in the way. VIRTIO_F_ACCESS_PLATFORM mandates more work on the
side of the guest (and possibly the host side as well) than we actually
need.

An additional benefit of teaching the guest to make the right decision
(and use the DMA API) on its own is removing the need to mandate special
VM configuration for guests that may run with protection. This is
especially interesting for s390, as VIRTIO_F_IOMMU_PLATFORM pushes all
the virtio control structures into the first 2G of guest memory:
something we don't necessarily want to do by default.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Ram Pai <linuxram@us.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
---
 drivers/virtio/virtio_ring.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 867c7ebd3f10..fafc8f924955 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
 	if (!virtio_has_iommu_quirk(vdev))
 		return true;
 
+	if (force_dma_unencrypted(&vdev->dev))
+		return true;
+
 	/* Otherwise, we are left to guess. */
 	/*
 	 * In theory, it's possible to have a buggy QEMU-supposed
-- 
2.17.1
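
For context: what force_dma_unencrypted() evaluates to is architecture
specific. On s390, for instance, the implementation is essentially a
check for running as a protected guest, along these lines (shown for
illustration):

  /* s390 (illustrative): true when running as a Protected VM guest */
  bool force_dma_unencrypted(struct device *dev)
  {
          return is_prot_virt_guest();
  }

On x86 it similarly returns true when AMD SEV is active, and on powerpc
when running as a secure guest.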


* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-20 16:06 ` [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h Halil Pasic
@ 2020-02-20 16:11   ` Christoph Hellwig
  2020-02-20 16:23     ` Christian Borntraeger
  0 siblings, 1 reply; 58+ messages in thread
From: Christoph Hellwig @ 2020-02-20 16:11 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy,
	Christoph Hellwig

On Thu, Feb 20, 2020 at 05:06:05PM +0100, Halil Pasic wrote:
> Currently force_dma_unencrypted() is only used by the direct
> implementation of the DMA API, and thus resides in dma-direct.h. But
> there is nothing dma-direct specific about it: if one were -- for
> whatever reason -- to implement custom DMA ops that have to work in the
> encrypted/protected scenarios dma-direct currently deals with, one would
> need exactly this kind of information.

I really don't think it has business being anywhere else, and your completely
bogus second patch just proves the point.

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-20 16:06 ` [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected Halil Pasic
@ 2020-02-20 16:13   ` Christoph Hellwig
  2020-02-21  2:59     ` David Gibson
  2020-02-21 14:33     ` Halil Pasic
  2020-02-20 20:55   ` Michael S. Tsirkin
  1 sibling, 2 replies; 58+ messages in thread
From: Christoph Hellwig @ 2020-02-20 16:13 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy,
	Christoph Hellwig

On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 867c7ebd3f10..fafc8f924955 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>  	if (!virtio_has_iommu_quirk(vdev))
>  		return true;
>  
> +	if (force_dma_unencrypted(&vdev->dev))
> +		return true;

Hell no.  This is a detail of the platform DMA direct implementation.
Drivers have no business looking at this flag, and virtio finally needs
to be fixed to use the DMA API properly for everything but legacy devices.

No amount of desperate hacks is going to fix that fundamental problem,
and I'm starting to get really sick and tired of all the crap patches
published in this area.

* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-20 16:11   ` Christoph Hellwig
@ 2020-02-20 16:23     ` Christian Borntraeger
  2020-02-20 16:31       ` Christoph Hellwig
  0 siblings, 1 reply; 58+ messages in thread
From: Christian Borntraeger @ 2020-02-20 16:23 UTC (permalink / raw)
  To: Christoph Hellwig, Halil Pasic
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization, iommu,
	David Gibson, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy



On 20.02.20 17:11, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 05:06:05PM +0100, Halil Pasic wrote:
>> Currently force_dma_unencrypted() is only used by the direct
>> implementation of the DMA API, and thus resides in dma-direct.h. But
>> there is nothing dma-direct specific about it: if one were -- for
>> whatever reason -- to implement custom DMA ops that have to work in the
>> encrypted/protected scenarios dma-direct currently deals with, one would
>> need exactly this kind of information.
> 
> I really don't think it has business being anywhere else, and your completely
> bogus second patch just proves the point.

From a user's perspective it makes absolutely perfect sense to use the
bounce buffers when they are NEEDED.
Forcing the user to specify iommu_platform just because you need bounce buffers
really feels wrong. And obviously we have a severe performance issue
because of the indirections.

Now: I understand that you want to get this fixed differently, but maybe you
could help to outline how this could be fixed properly.

Christian


* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-20 16:23     ` Christian Borntraeger
@ 2020-02-20 16:31       ` Christoph Hellwig
  2020-02-20 17:00         ` Christian Borntraeger
  2020-02-21  3:27         ` David Gibson
  0 siblings, 2 replies; 58+ messages in thread
From: Christoph Hellwig @ 2020-02-20 16:31 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, iommu, David Gibson, Michael Mueller, Lendacky,
	Thomas, Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> From a user's perspective it makes absolutely perfect sense to use the
> bounce buffers when they are NEEDED. 
> Forcing the user to specify iommu_platform just because you need bounce buffers
> really feels wrong. And obviously we have a severe performance issue
> because of the indirections.

The point is that the user should not have to specify iommu_platform.
We need to make sure any new hypervisor (especially one that might require
bounce buffering) always sets it, as not setting it was a rather bogus legacy hack
that isn't extensible for cases that, for example, require bounce buffering.

* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-20 16:31       ` Christoph Hellwig
@ 2020-02-20 17:00         ` Christian Borntraeger
  2020-02-21  3:27         ` David Gibson
  1 sibling, 0 replies; 58+ messages in thread
From: Christian Borntraeger @ 2020-02-20 17:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, iommu, David Gibson, Michael Mueller, Lendacky,
	Thomas, Viktor Mihajlovski, Robin Murphy



On 20.02.20 17:31, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
>> From a user's perspective it makes absolutely perfect sense to use the
>> bounce buffers when they are NEEDED. 
>> Forcing the user to specify iommu_platform just because you need bounce buffers
>> really feels wrong. And obviously we have a severe performance issue
>> because of the indirections.
> 
> The point is that the user should not have to specify iommu_platform.

I (and Halil) agree on that. From a user's perspective we want to
have the right thing in the guest without any command option. The iommu_platform
thing was indeed a first step to make things work.

I might be misreading Halil's patch, but isn't one effect of this patch set
that things WILL work without iommu_platform?


> We need to make sure any new hypervisor (especially one that might require
> bounce buffering) always sets it, as not setting it was a rather bogus legacy hack
> that isn't extensible for cases that, for example, require bounce buffering.


* Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM
  2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM Halil Pasic
  2020-02-20 16:06 ` [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h Halil Pasic
  2020-02-20 16:06 ` [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected Halil Pasic
@ 2020-02-20 20:48 ` Michael S. Tsirkin
  2020-02-20 21:29 ` Michael S. Tsirkin
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2020-02-20 20:48 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> Currently, if one intends to run a memory-protection-enabled VM with
> virtio devices and Linux as the guest OS, one needs to specify the
> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> Linux use the DMA API, which in turn handles the memory
> encryption/protection stuff if the guest decides to turn itself into
> a protected one. This, however, makes no sense for multiple reasons:
> * The device is not changed by the fact that the guest RAM is
> protected. The so-called IOMMU bypass quirk is not affected.
> * This usage is not congruent with the standardised semantics of
> VIRTIO_F_IOMMU_PLATFORM. Guest memory protection is an orthogonal reason
> for using the DMA API in virtio (orthogonal with respect to what is
> expressed by VIRTIO_F_IOMMU_PLATFORM).
> 
> This series aims to decouple 'have to use DMA API because my (guest) RAM
> is protected' and 'have to use DMA API because the device told me
> VIRTIO_F_IOMMU_PLATFORM'.
> 
> Please find more detailed explanations about the conceptual aspects in
> the individual patches. There is however also a very practical problem
> that is addressed by this series. 
> 
> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> effect: the vhost code assumes that the addresses on the virtio descriptor
> ring are not guest physical addresses but IOVAs, and insists on doing a
> translation of these regardless of what transport is used (e.g. whether
> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> "vhost: new device IOTLB API".) On s390 this results in severe
> performance degradation (ca. a factor of 10). BTW with CCW I/O there is
> (architecturally) no IOMMU, so the whole address translation makes no
> sense in the context of virtio-ccw.

That's just a QEMU thing. It uses the same flag for VIRTIO_F_ACCESS_PLATFORM
and vhost IOTLB. QEMU can separate them, no need to change Linux.


> Halil Pasic (2):
>   mm: move force_dma_unencrypted() to mem_encrypt.h
>   virtio: let virtio use DMA API when guest RAM is protected
> 
>  drivers/virtio/virtio_ring.c |  3 +++
>  include/linux/dma-direct.h   |  9 ---------
>  include/linux/mem_encrypt.h  | 10 ++++++++++
>  3 files changed, 13 insertions(+), 9 deletions(-)
> 
> 
> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> -- 
> 2.17.1


* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-20 16:06 ` [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected Halil Pasic
  2020-02-20 16:13   ` Christoph Hellwig
@ 2020-02-20 20:55   ` Michael S. Tsirkin
  2020-02-21  1:17     ` Ram Pai
  2020-02-21 13:12     ` Halil Pasic
  1 sibling, 2 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2020-02-20 20:55 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> Currently the advanced guest memory protection technologies (AMD SEV,
> powerpc secure guest technology and s390 Protected VMs) abuse the
> VIRTIO_F_IOMMU_PLATFORM flag to make the virtio core use the DMA API, which
> is in turn necessary to make I/O work with guest memory protection.
> 
> But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
> different beast: with virtio devices whose implementation runs on an SMP
> CPU we are still fine with doing all the usual optimizations, it is just
> that we need to make sure that the memory protection mechanism does not
> get in the way. VIRTIO_F_ACCESS_PLATFORM mandates more work on the
> side of the guest (and possibly the host side as well) than we actually
> need.
> 
> An additional benefit of teaching the guest to make the right decision
> (and use the DMA API) on its own is removing the need to mandate special
> VM configuration for guests that may run with protection. This is
> especially interesting for s390, as VIRTIO_F_IOMMU_PLATFORM pushes all
> the virtio control structures into the first 2G of guest memory:
> something we don't necessarily want to do by default.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Tested-by: Ram Pai <linuxram@us.ibm.com>
> Tested-by: Michael Mueller <mimu@linux.ibm.com>

This might work for you but it's fragile, since without
VIRTIO_F_ACCESS_PLATFORM the hypervisor assumes it gets
GPAs, not DMA addresses.



IOW this looks like another iteration of:

	virtio: Support encrypted memory on powerpc secure guests

which I was under the impression was abandoned as unnecessary.


To summarize, the necessary conditions for a hack along these lines
(using the DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that:

  - secure guest mode is enabled - so we know that, since we don't share
    most memory, regular virtio code won't
    work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM
  - the DMA API is giving us addresses that are actually also physical
    addresses
  - the hypervisor is buggy and didn't enable VIRTIO_F_ACCESS_PLATFORM
 
I don't see how this patch does this.
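
For concreteness, the guard those three conditions describe would have
to look roughly like the following sketch. mem_encrypt_active() and
virtio_has_feature() exist; dma_addr_is_phys_addr() and use_dma_api are
hypothetical, because there is no ready-made predicate for the second
condition, which is part of the problem:

  /* Hypothetical sketch of the conditions above; not proposed code. */
  if (mem_encrypt_active() &&                       /* secure guest mode */
      dma_addr_is_phys_addr(&vdev->dev) &&          /* hypothetical test */
      !virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
          use_dma_api = true;                       /* only then is it safe */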


> ---
>  drivers/virtio/virtio_ring.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 867c7ebd3f10..fafc8f924955 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>  	if (!virtio_has_iommu_quirk(vdev))
>  		return true;
>  
> +	if (force_dma_unencrypted(&vdev->dev))
> +		return true;
> +
>  	/* Otherwise, we are left to guess. */
>  	/*
>  	 * In theory, it's possible to have a buggy QEMU-supposed
> -- 
> 2.17.1


* Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM
  2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM Halil Pasic
                   ` (2 preceding siblings ...)
  2020-02-20 20:48 ` [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM Michael S. Tsirkin
@ 2020-02-20 21:29 ` Michael S. Tsirkin
  2020-02-21 13:37   ` Halil Pasic
  2020-02-20 21:33 ` Michael S. Tsirkin
  2020-02-21  6:22 ` Jason Wang
  5 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2020-02-20 21:29 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> * This usage is not congruent with the standardised semantics of
> VIRTIO_F_IOMMU_PLATFORM. Guest memory protection is an orthogonal reason
> for using the DMA API in virtio (orthogonal with respect to what is
> expressed by VIRTIO_F_IOMMU_PLATFORM).

Quoting the spec:

  \item[VIRTIO_F_ACCESS_PLATFORM(33)] This feature indicates that
  the device can be used on a platform where device access to data
  in memory is limited and/or translated. E.g. this is the case if the device can be located
  behind an IOMMU that translates bus addresses from the device into physical
  addresses in memory, if the device can be limited to only access
  certain memory addresses or if special commands such as
  a cache flush can be needed to synchronise data in memory with
  the device. Whether accesses are actually limited or translated
  is described by platform-specific means.
  If this feature bit is set to 0, then the device
  has same access to memory addresses supplied to it as the
  driver has.
  In particular, the device will always use physical addresses
  matching addresses used by the driver (typically meaning
  physical addresses used by the CPU)
  and not translated further, and can access any address supplied to it by
  the driver. When clear, this overrides any platform-specific description of
  whether device access is limited or translated in any way, e.g.
  whether an IOMMU may be present.

since the device can't access encrypted memory,
this seems to match your case reasonably well.

-- 
MST


* Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM
  2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM Halil Pasic
                   ` (3 preceding siblings ...)
  2020-02-20 21:29 ` Michael S. Tsirkin
@ 2020-02-20 21:33 ` Michael S. Tsirkin
  2020-02-21 13:49   ` Halil Pasic
  2020-02-21 16:41   ` Christoph Hellwig
  2020-02-21  6:22 ` Jason Wang
  5 siblings, 2 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2020-02-20 21:33 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> effect: the vhost code assumes that the addresses on the virtio descriptor
> ring are not guest physical addresses but IOVAs, and insists on doing a
> translation of these regardless of what transport is used (e.g. whether
> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> "vhost: new device IOTLB API".) On s390 this results in severe
> performance degradation (ca. a factor of 10). BTW with CCW I/O there is
> (architecturally) no IOMMU, so the whole address translation makes no
> sense in the context of virtio-ccw.

So it sounds like a host issue: the emulation of s390 is unnecessarily complicated.
Working around it in the guest looks wrong ...

> Halil Pasic (2):
>   mm: move force_dma_unencrypted() to mem_encrypt.h
>   virtio: let virtio use DMA API when guest RAM is protected
> 
>  drivers/virtio/virtio_ring.c |  3 +++
>  include/linux/dma-direct.h   |  9 ---------
>  include/linux/mem_encrypt.h  | 10 ++++++++++
>  3 files changed, 13 insertions(+), 9 deletions(-)
> 
> 
> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> -- 
> 2.17.1


* RE: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-20 20:55   ` Michael S. Tsirkin
@ 2020-02-21  1:17     ` Ram Pai
  2020-02-21  3:29       ` David Gibson
  2020-02-21 13:12     ` Halil Pasic
  1 sibling, 1 reply; 58+ messages in thread
From: Ram Pai @ 2020-02-21  1:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Thu, Feb 20, 2020 at 03:55:14PM -0500, Michael S. Tsirkin wrote:
> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> > Currently the advanced guest memory protection technologies (AMD SEV,
> > powerpc secure guest technology and s390 Protected VMs) abuse the
> > VIRTIO_F_IOMMU_PLATFORM flag to make the virtio core use the DMA API, which
> > is in turn necessary to make I/O work with guest memory protection.
> > 
> > But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
> > different beast: with virtio devices whose implementation runs on an SMP
> > CPU we are still fine with doing all the usual optimizations, it is just
> > that we need to make sure that the memory protection mechanism does not
> > get in the way. VIRTIO_F_ACCESS_PLATFORM mandates more work on the
> > side of the guest (and possibly the host side as well) than we actually
> > need.
> > 
> > An additional benefit of teaching the guest to make the right decision
> > (and use the DMA API) on its own is removing the need to mandate special
> > VM configuration for guests that may run with protection. This is
> > especially interesting for s390, as VIRTIO_F_IOMMU_PLATFORM pushes all
> > the virtio control structures into the first 2G of guest memory:
> > something we don't necessarily want to do by default.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > Tested-by: Ram Pai <linuxram@us.ibm.com>
> > Tested-by: Michael Mueller <mimu@linux.ibm.com>
> 
> This might work for you but it's fragile, since without
> VIRTIO_F_ACCESS_PLATFORM the hypervisor assumes it gets
> GPAs, not DMA addresses.
> 
> 
> 
> IOW this looks like another iteration of:
> 
> 	virtio: Support encrypted memory on powerpc secure guests
> 
> which I was under the impression was abandoned as unnecessary.

It has been abandoned on powerpc. We enabled the VIRTIO_F_ACCESS_PLATFORM
flag by default on powerpc.

We would like to enable secure guests on powerpc without this flag
enabled as well, but past experience has taught us that it's not an easy
path. However, if Halil makes some inroads on this path for s390, we
would like to support him.


RP


* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-20 16:13   ` Christoph Hellwig
@ 2020-02-21  2:59     ` David Gibson
  2020-02-21  3:41       ` Jason Wang
                         ` (2 more replies)
  2020-02-21 14:33     ` Halil Pasic
  1 sibling, 3 replies; 58+ messages in thread
From: David Gibson @ 2020-02-21  2:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, Christian Borntraeger, iommu, Michael Mueller,
	Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy



On Thu, Feb 20, 2020 at 05:13:09PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 867c7ebd3f10..fafc8f924955 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  	if (!virtio_has_iommu_quirk(vdev))
> >  		return true;
> >  
> > +	if (force_dma_unencrypted(&vdev->dev))
> > +		return true;
> 
> Hell no.  This is a detail of the platform DMA direct implementation.
> Drivers have no business looking at this flag, and virtio finally needs
> to be fixed to use the DMA API properly for everything but legacy devices.

So, this patch definitely isn't right as it stands, but I'm struggling
to understand what it is you're saying is the right way.

By "legacy devices" I assume you mean pre-virtio-1.0 devices, that
lack the F_VERSION_1 feature flag.  Is that right?  Because I don't
see how being a legacy device or not relates to use of the DMA API.

I *think* what you are suggesting here is that virtio devices that
have !F_IOMMU_PLATFORM should have their dma_ops set up so that the
DMA API treats IOVA==PA, which will satisfy what the device expects.
Then the virtio driver can use the DMA API the same way for both
F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices.

But if that works for !F_IOMMU_PLATFORM + F_VERSION_1 devices,
then AFAICT it will work equally well for legacy devices.

Using the DMA API for *everything* in virtio, legacy or not, seems
like a reasonable approach to me.  But, AFAICT, that does require the
DMA layer to have some kind of explicit call to turn on this
behaviour, which the virtio driver would call during initialisation.
I don't think we can do it 100% within the DMA layer, because only the
driver can reasonably know when a device has this weird non-standard
DMA behaviour.
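
Concretely, such IOVA==PA dma_ops might look roughly like the following
minimal sketch (illustrative only; unmap, alloc/free and the remaining
callbacks are omitted):

  static dma_addr_t identity_map_page(struct device *dev, struct page *page,
                                      unsigned long offset, size_t size,
                                      enum dma_data_direction dir,
                                      unsigned long attrs)
  {
          return page_to_phys(page) + offset;   /* IOVA == PA */
  }

  static const struct dma_map_ops identity_dma_ops = {
          .map_page = identity_map_page,
          /* .unmap_page, .alloc, .free, ... would be similarly trivial */
  };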

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-20 16:31       ` Christoph Hellwig
  2020-02-20 17:00         ` Christian Borntraeger
@ 2020-02-21  3:27         ` David Gibson
  2020-02-21 13:06           ` Halil Pasic
  1 sibling, 1 reply; 58+ messages in thread
From: David Gibson @ 2020-02-21  3:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, Christian Borntraeger, iommu, Michael Mueller,
	Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy



On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > From a user's perspective it makes absolutely perfect sense to use the
> > bounce buffers when they are NEEDED. 
> > Forcing the user to specify iommu_platform just because you need bounce buffers
> > really feels wrong. And obviously we have a severe performance issue
> > because of the indirections.
> 
> The point is that the user should not have to specify iommu_platform.
> We need to make sure any new hypervisor (especially one that might require
> bounce buffering) always sets it,

So, I have draft qemu patches which enable iommu_platform by default.
But that's really because of other problems with !iommu_platform, not
anything to do with bounce buffering or secure VMs.

The thing is that the hypervisor *doesn't* require bounce buffering.
In the POWER (and maybe s390 as well) models for Secure VMs, it's the
*guest*'s choice to enter secure mode, so the hypervisor has no reason
to know whether the guest needs bounce buffering.  As far as the
hypervisor and qemu are concerned that's a guest internal detail, it
just expects to get addresses it can access whether those are GPAs
(iommu_platform=off) or IOVAs (iommu_platform=on).

> as not setting it was a rather bogus legacy hack

It was certainly a bad idea, but it was a bad idea that went into a
public spec and has been widely deployed for many years.  We can't
just pretend it didn't happen and move on.

Turning iommu_platform=on by default breaks old guests, some of which
we still care about.  We can't (automatically) do it only for guests
that need bounce buffering, because the hypervisor doesn't know that
ahead of time.

> that isn't extensible for cases that, for example, require bounce buffering.

In fact bounce buffering isn't really the issue from the hypervisor
(or spec's) point of view.  It's the fact that not all of guest memory
is accessible to the hypervisor.  Bounce buffering is just one way the
guest might deal with that.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21  1:17     ` Ram Pai
@ 2020-02-21  3:29       ` David Gibson
  0 siblings, 0 replies; 58+ messages in thread
From: David Gibson @ 2020-02-21  3:29 UTC (permalink / raw)
  To: Ram Pai
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig



On Thu, Feb 20, 2020 at 05:17:48PM -0800, Ram Pai wrote:
> On Thu, Feb 20, 2020 at 03:55:14PM -0500, Michael S. Tsirkin wrote:
> > On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> > > Currently the advanced guest memory protection technologies (AMD SEV,
> > > powerpc secure guest technology and s390 Protected VMs) abuse the
> > > VIRTIO_F_IOMMU_PLATFORM flag to make the virtio core use the DMA API, which
> > > is in turn necessary to make I/O work with guest memory protection.
> > > 
> > > But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
> > > different beast: with virtio devices whose implementation runs on an SMP
> > > CPU we are still fine with doing all the usual optimizations, it is just
> > > that we need to make sure that the memory protection mechanism does not
> > > get in the way. VIRTIO_F_ACCESS_PLATFORM mandates more work on the
> > > side of the guest (and possibly the host side as well) than we actually
> > > need.
> > > 
> > > An additional benefit of teaching the guest to make the right decision
> > > (and use the DMA API) on its own is removing the need to mandate special
> > > VM configuration for guests that may run with protection. This is
> > > especially interesting for s390, as VIRTIO_F_IOMMU_PLATFORM pushes all
> > > the virtio control structures into the first 2G of guest memory:
> > > something we don't necessarily want to do by default.
> > > 
> > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > Tested-by: Ram Pai <linuxram@us.ibm.com>
> > > Tested-by: Michael Mueller <mimu@linux.ibm.com>
> > 
> > This might work for you but it's fragile, since without
> > VIRTIO_F_ACCESS_PLATFORM the hypervisor assumes it gets
> > GPAs, not DMA addresses.
> > 
> > 
> > 
> > IOW this looks like another iteration of:
> > 
> > 	virtio: Support encrypted memory on powerpc secure guests
> > 
> > which I was under the impression was abandoned as unnecessary.
> 
> It has been abandoned on powerpc. We enabled the VIRTIO_F_ACCESS_PLATFORM
> flag by default on powerpc.

Uh... we haven't yet, though we're working on it.

> We would like to enable secure guests on powerpc without this flag
> enabled as well, but past experience has taught us that it's not an easy
> path. However, if Halil makes some inroads on this path for s390, we
> would like to support him.
> 
> 
> RP
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21  2:59     ` David Gibson
@ 2020-02-21  3:41       ` Jason Wang
  2020-02-21 13:31         ` Halil Pasic
  2020-02-21 13:27       ` Halil Pasic
  2020-02-21 16:36       ` Christoph Hellwig
  2 siblings, 1 reply; 58+ messages in thread
From: Jason Wang @ 2020-02-21  3:41 UTC (permalink / raw)
  To: David Gibson, Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy


On 2020/2/21 10:59 AM, David Gibson wrote:
> On Thu, Feb 20, 2020 at 05:13:09PM +0100, Christoph Hellwig wrote:
>> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>> index 867c7ebd3f10..fafc8f924955 100644
>>> --- a/drivers/virtio/virtio_ring.c
>>> +++ b/drivers/virtio/virtio_ring.c
>>> @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>>>   	if (!virtio_has_iommu_quirk(vdev))
>>>   		return true;
>>>   
>>> +	if (force_dma_unencrypted(&vdev->dev))
>>> +		return true;
>> Hell no.  This is a detail of the platform DMA direct implementation.
>> Drivers have no business looking at this flag, and virtio finally needs
>> to be fixed to use the DMA API properly for everything but legacy devices.
> So, this patch definitely isn't right as it stands, but I'm struggling
> to understand what it is you're saying is the right way.
>
> By "legacy devices" I assume you mean pre-virtio-1.0 devices, that
> lack the F_VERSION_1 feature flag.  Is that right?  Because I don't
> see how being a legacy device or not relates to use of the DMA API.
>
> I *think* what you are suggesting here is that virtio devices that
> have !F_IOMMU_PLATFORM should have their dma_ops set up so that the
> DMA API treats IOVA==PA, which will satisfy what the device expects.


Can this work for swiotlb?

Thanks


> Then the virtio driver can use the DMA API the same way for both
> F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices.
>
> But if that works for !F_IOMMU_PLATFORM + F_VERSION_1 devices,
> then AFAICT it will work equally well for legacy devices.
>
> Using the DMA API for *everything* in virtio, legacy or not, seems
> like a reasonable approach to me.  But, AFAICT, that does require the
> DMA layer to have some kind of explicit call to turn on this
> behaviour, which the virtio driver would call during initialisation.
> I don't think we can do it 100% within the DMA layer, because only the
> driver can reasonably know when a device has this weird non-standard
> DMA behaviour.
>


* Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM
  2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM Halil Pasic
                   ` (4 preceding siblings ...)
  2020-02-20 21:33 ` Michael S. Tsirkin
@ 2020-02-21  6:22 ` Jason Wang
  2020-02-21 14:56   ` Halil Pasic
  5 siblings, 1 reply; 58+ messages in thread
From: Jason Wang @ 2020-02-21  6:22 UTC (permalink / raw)
  To: Halil Pasic, Michael S. Tsirkin, Marek Szyprowski, Robin Murphy,
	Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Christian Borntraeger,
	iommu, Michael Mueller, Viktor Mihajlovski, David Gibson


On 2020/2/21 12:06 AM, Halil Pasic wrote:
> Currently, if one intends to run a memory-protection-enabled VM with
> virtio devices and Linux as the guest OS, one needs to specify the
> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> Linux use the DMA API, which in turn handles the memory
> encryption/protection stuff if the guest decides to turn itself into
> a protected one. This, however, makes no sense for multiple reasons:
> * The device is not changed by the fact that the guest RAM is
> protected. The so-called IOMMU bypass quirk is not affected.
> * This usage is not congruent with the standardised semantics of
> VIRTIO_F_IOMMU_PLATFORM. Guest memory protection is an orthogonal reason
> for using the DMA API in virtio (orthogonal with respect to what is
> expressed by VIRTIO_F_IOMMU_PLATFORM).
>
> This series aims to decouple 'have to use DMA API because my (guest) RAM
> is protected' and 'have to use DMA API because the device told me
> VIRTIO_F_IOMMU_PLATFORM'.
>
> Please find more detailed explanations about the conceptual aspects in
> the individual patches. There is however also a very practical problem
> that is addressed by this series.
>
> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> effect: the vhost code assumes that the addresses on the virtio descriptor
> ring are not guest physical addresses but IOVAs, and insists on doing a
> translation of these regardless of what transport is used (e.g. whether
> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> "vhost: new device IOTLB API".) On s390 this results in severe
> performance degradation (ca. a factor of 10).


Do you see a consistent performance degradation, or does it only
happen during the beginning of the test?


> BTW with CCW I/O there is
> (architecturally) no IOMMU, so the whole address translation makes no
> sense in the context of virtio-ccw.


I suspect we can optimize this on the QEMU side.

E.g. send memtable entries via the IOTLB API when vIOMMU is not enabled.

If this makes sense, I can draft a patch to see if there's any difference.

Thanks


>
> Halil Pasic (2):
>    mm: move force_dma_unencrypted() to mem_encrypt.h
>    virtio: let virtio use DMA API when guest RAM is protected
>
>   drivers/virtio/virtio_ring.c |  3 +++
>   include/linux/dma-direct.h   |  9 ---------
>   include/linux/mem_encrypt.h  | 10 ++++++++++
>   3 files changed, 13 insertions(+), 9 deletions(-)
>
>
> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2


* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-21  3:27         ` David Gibson
@ 2020-02-21 13:06           ` Halil Pasic
  2020-02-21 15:48             ` Michael S. Tsirkin
  0 siblings, 1 reply; 58+ messages in thread
From: Halil Pasic @ 2020-02-21 13:06 UTC (permalink / raw)
  To: David Gibson
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig



On Fri, 21 Feb 2020 14:27:27 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > > From a user's perspective it makes absolutely perfect sense to use the
> > > bounce buffers when they are NEEDED. 
> > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > really feels wrong. And obviously we have a severe performance issue
> > > because of the indirections.
> > 
> > The point is that the user should not have to specify iommu_platform.
> > We need to make sure any new hypervisor (especially one that might require
> > bounce buffering) always sets it,
> 
> So, I have draft qemu patches which enable iommu_platform by default.
> But that's really because of other problems with !iommu_platform, not
> anything to do with bounce buffering or secure VMs.
> 
> The thing is that the hypervisor *doesn't* require bounce buffering.
> In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> *guest*'s choice to enter secure mode, so the hypervisor has no reason
> to know whether the guest needs bounce buffering.  As far as the
> hypervisor and qemu are concerned that's a guest internal detail, it
> just expects to get addresses it can access whether those are GPAs
> (iommu_platform=off) or IOVAs (iommu_platform=on).

I very much agree!

> 
> > as not setting it was a rather bogus legacy hack
> 
> It was certainly a bad idea, but it was a bad idea that went into a
> public spec and has been widely deployed for many years.  We can't
> just pretend it didn't happen and move on.
> 
> Turning iommu_platform=on by default breaks old guests, some of which
> we still care about.  We can't (automatically) do it only for guests
> that need bounce buffering, because the hypervisor doesn't know that
> ahead of time.

Turning iommu_platform=on for virtio-ccw makes no sense whatsoever,
because for CCW I/O there is no such thing as an IOMMU and the addresses
are always physical addresses.

> 
> > that isn't extensible for cases that, for example, require bounce buffering.
> 
> In fact bounce buffering isn't really the issue from the hypervisor
> (or spec's) point of view.  It's the fact that not all of guest memory
> is accessible to the hypervisor.  Bounce buffering is just one way the
> guest might deal with that.
> 

Agreed.

Regards,
Halil





* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-20 20:55   ` Michael S. Tsirkin
  2020-02-21  1:17     ` Ram Pai
@ 2020-02-21 13:12     ` Halil Pasic
  2020-02-21 15:39       ` Tom Lendacky
  2020-02-21 15:56       ` Michael S. Tsirkin
  1 sibling, 2 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-21 13:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Thu, 20 Feb 2020 15:55:14 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> > Currently the advanced guest memory protection technologies (AMD SEV,
> > powerpc secure guest technology and s390 Protected VMs) abuse the
> > VIRTIO_F_IOMMU_PLATFORM flag to make the virtio core use the DMA API, which
> > is in turn necessary to make I/O work with guest memory protection.
> > 
> > But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
> > different beast: with virtio devices whose implementation runs on an SMP
> > CPU we are still fine with doing all the usual optimizations, it is just
> > that we need to make sure that the memory protection mechanism does not
> > get in the way. VIRTIO_F_ACCESS_PLATFORM mandates more work on the
> > side of the guest (and possibly the host side as well) than we actually
> > need.
> > 
> > An additional benefit of teaching the guest to make the right decision
> > (and use the DMA API) on its own is removing the need to mandate special
> > VM configuration for guests that may run with protection. This is
> > especially interesting for s390, as VIRTIO_F_IOMMU_PLATFORM pushes all
> > the virtio control structures into the first 2G of guest memory:
> > something we don't necessarily want to do by default.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > Tested-by: Ram Pai <linuxram@us.ibm.com>
> > Tested-by: Michael Mueller <mimu@linux.ibm.com>
> 
> This might work for you but it's fragile, since without
> VIRTIO_F_ACCESS_PLATFORM the hypervisor assumes it gets
> GPAs, not DMA addresses.
> 

Thanks for your constructive approach. I do want the hypervisor to
assume it gets GPAs. My train of thought was that the guys that need
to use IOVAs that are not GPAs when force_dma_unencrypted() returns true
will have to specify VIRTIO_F_ACCESS_PLATFORM (at the device) anyway, because
otherwise it won't work. But I see your point: in case of a
misconfiguration, and provided the DMA API returns IOVAs, one could end
up trying to touch the wrong memory locations. But this should be similar to
what would happen if DMA ops are not used and memory is not made accessible.

> 
> 
> IOW this looks like another iteration of:
> 
> 	virtio: Support encrypted memory on powerpc secure guests
> 
> which I was under the impression was abandoned as unnecessary.

Unnecessary for powerpc because they do normal PCI. In the context of
CCW there are only guest physical addresses (CCW I/O has no concept of
IOMMU or IOVAs).

> 
> 
> To summarize, the necessary conditions for a hack along these lines
> (using the DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that:
> 
>   - secure guest mode is enabled - so we know that, since we don't share
>     most memory, regular virtio code won't
>     work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM

force_dma_unencrypted(&vdev->dev) is IMHO exactly about this.

>   - DMA API is giving us addresses that are actually also physical
>     addresses

In the case of s390 this is given. I talked with the power people before
posting this, and they assured me they are willing to deal with
this. I was hoping to talk about this with the AMD SEV people here (hence
the cc).

>   - Hypervisor is buggy and didn't enable VIRTIO_F_ACCESS_PLATFORM
> 

I don't get this point. The argument about the hypervisor being buggy is
a bit hard to follow for me. If the hypervisor is buggy, we have already
lost most of the time anyway, haven't we?
 
> I don't see how this patch does this.

I do get your point. I don't know of a good way to check that DMA API
is giving us addresses that are actually physical addresses, and the
situation you describe definitely has some risk to it.
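
One crude debug-time probe (just a sketch, assuming some struct device
*dev and length len; it would misfire whenever bounce buffering is
legitimately in use) could be:

	void *buf = kmalloc(len, GFP_KERNEL);
	dma_addr_t h = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

	/* holds only for identity mappings (IOVA == PA) with no
	 * bounce buffer substituted */
	WARN_ON(h != virt_to_phys(buf));
	dma_unmap_single(dev, h, len, DMA_TO_DEVICE);
	kfree(buf);

which is part of why I don't consider it a good way.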

Let me comment on other ideas that came up. I would be very happy to go
with the best one. Thank you very much.

Regards,
Halil

> 
> 
> > ---
> >  drivers/virtio/virtio_ring.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 867c7ebd3f10..fafc8f924955 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  	if (!virtio_has_iommu_quirk(vdev))
> >  		return true;
> >  
> > +	if (force_dma_unencrypted(&vdev->dev))
> > +		return true;
> > +
> >  	/* Otherwise, we are left to guess. */
> >  	/*
> >  	 * In theory, it's possible to have a buggy QEMU-supposed
> > -- 
> > 2.17.1
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21  2:59     ` David Gibson
  2020-02-21  3:41       ` Jason Wang
@ 2020-02-21 13:27       ` Halil Pasic
  2020-02-21 16:36       ` Christoph Hellwig
  2 siblings, 0 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-21 13:27 UTC (permalink / raw)
  To: David Gibson
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig


On Fri, 21 Feb 2020 13:59:15 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Feb 20, 2020 at 05:13:09PM +0100, Christoph Hellwig wrote:
> > On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index 867c7ebd3f10..fafc8f924955 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  	if (!virtio_has_iommu_quirk(vdev))
> > >  		return true;
> > >  
> > > +	if (force_dma_unencrypted(&vdev->dev))
> > > +		return true;
> > 
> > Hell no.  This is a detail of the platform DMA direct implementation.
> > Drivers have no business looking at this flag, and virtio finally needs
> > to be fixed to use the DMA API properly for everything but legacy devices.
> 
> So, this patch definitely isn't right as it stands, but I'm struggling
> to understand what it is you're saying is the right way.
> 
> By "legacy devices" I assume you mean pre-virtio-1.0 devices, that
> lack the F_VERSION_1 feature flag.  Is that right?  Because I don't
> see how being a legacy device or not relates to use of the DMA API.
> 
> I *think* what you are suggesting here is that virtio devices that
> have !F_IOMMU_PLATFORM should have their dma_ops set up so that the
> DMA API treats IOVA==PA, which will satisfy what the device expects.
> Then the virtio driver can use the DMA API the same way for both
> F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices.

I've considered this idea, and as a matter of fact it would be my
preferred solution, if we could use GFP_DMA when allocating coherent
memory through the DMA API. For CCW devices we *must* have some of the
device-accessible memory 31-bit addressable (in physical address space),
because the s390 architecture *mandates it*. I had patches
(https://lkml.org/lkml/2019/9/23/313) that do that, but Christoph was
not open to the idea.

Always pushing all the stuff allocated by virtio devices into the DMA
zone is not a good option for us. We did that for protected guests,
because that was what Christoph was willing to accept, but we are not
happy with it.

I'm willing to go down that route, but I really need the capability to
tell the DMA API 'do this allocation from ZONE_DMA'. I don't care how.
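
To make it concrete, the desired usage would be something like the
below (just a sketch, assuming GFP_DMA were honored by the coherent
allocator; 'cdev' stands in for some CCW device here):

	dma_addr_t dma_handle;

	/* coherent memory for virtio-ccw control blocks, guaranteed
	 * to be 31-bit addressable */
	void *cbuf = dma_alloc_coherent(&cdev->dev, size, &dma_handle,
					GFP_KERNEL | GFP_DMA);

i.e. nothing new on the consumer side, just the gfp flag being honored.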


> 
> But if that works for !F_IOMMU_PLATFORM + F_VERSION_1 devices,
> then AFAICT it will work equally well for legacy devices.

I agree.

> 
> Using the DMA API for *everything* in virtio, legacy or not, seems
> like a reasonable approach to me.  But, AFAICT, that does require the
> DMA layer to have some kind of explicit call to turn on this
> behaviour, which the virtio driver would call during initialization.
> I don't think we can do it 100% within the DMA layer, because only the
> driver can reasonably know when a device has this weird non-standard
> DMA behaviour.
> 

IMHO one could make the DMA API work with different DMA ops. The
virtio transport or the virtio core can probably intervene. But then we
(virtio-ccw) would need the iommu=off DMA ops to do the bounce buffering
iff force_dma_unencrypted() holds.
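
A minimal sketch of what I mean by intervening (names invented here for
illustration; a real implementation would also need unmap, sync and
coherent allocation handlers):

	static dma_addr_t identity_map_page(struct device *dev,
			struct page *page, unsigned long offset,
			size_t size, enum dma_data_direction dir,
			unsigned long attrs)
	{
		/* the device expects GPAs, so hand out IOVA == PA */
		return page_to_phys(page) + offset;
	}

	static const struct dma_map_ops identity_dma_ops = {
		.map_page	= identity_map_page,
	};

	/* the transport could then do, per device: */
	set_dma_ops(&vdev->dev, &identity_dma_ops);

With force_dma_unencrypted() the ops plugged in would instead have to
bounce through shared pages.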

Regards,
Halil

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21  3:41       ` Jason Wang
@ 2020-02-21 13:31         ` Halil Pasic
  0 siblings, 0 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-21 13:31 UTC (permalink / raw)
  To: Jason Wang
  Cc: linux-s390, Viktor Mihajlovski, Janosch Frank,
	Michael S. Tsirkin, Cornelia Huck, Ram Pai, linux-kernel,
	virtualization, Christian Borntraeger, iommu, Michael Mueller,
	Lendacky, Thomas, David Gibson, Robin Murphy, Christoph Hellwig

On Fri, 21 Feb 2020 11:41:57 +0800
Jason Wang <jasowang@redhat.com> wrote:

> > I *think* what you are suggesting here is that virtio devices that
> > have !F_IOMMU_PLATFORM should have their dma_ops set up so that the
> > DMA API treats IOVA==PA, which will satisfy what the device expects.  
> 
> 
> Can this work for swiotlb?

It works on s390. I guess it would be the responsibility of whoever
provides the dma ops for the virtio device to ensure that with
!F_IOMMU_PLATFORM the addresses are GPA's, as *mandated* by the VIRTIO
specification.

Regards,
Halil


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM
  2020-02-20 21:29 ` Michael S. Tsirkin
@ 2020-02-21 13:37   ` Halil Pasic
  0 siblings, 0 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-21 13:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Thu, 20 Feb 2020 16:29:50 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> > * This usage is not congruent with  standardised semantics of
> > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > for using DMA API in virtio (orthogonal with respect to what is
> > expressed by VIRTIO_F_IOMMU_PLATFORM). 
> 
> Quoting the spec:
> 
>   \item[VIRTIO_F_ACCESS_PLATFORM(33)] This feature indicates that
>   the device can be used on a platform where device access to data
>   in memory is limited and/or translated. E.g. this is the case if the device can be located
>   behind an IOMMU that translates bus addresses from the device into physical
>   addresses in memory, if the device can be limited to only access
>   certain memory addresses or if special commands such as
>   a cache flush can be needed to synchronise data in memory with
>   the device. Whether accesses are actually limited or translated
>   is described by platform-specific means.
>   If this feature bit is set to 0, then the device
>   has same access to memory addresses supplied to it as the
>   driver has.
>   In particular, the device will always use physical addresses
>   matching addresses used by the driver (typically meaning
>   physical addresses used by the CPU)
>   and not translated further, and can access any address supplied to it by
>   the driver. When clear, this overrides any platform-specific description of
>   whether device access is limited or translated in any way, e.g.
>   whether an IOMMU may be present.
> 
> since device can't access encrypted memory,
> this seems to match your case reasonably well.
> 

As David already explained, the device does not have to access encrypted
memory. In fact, we don't have memory encryption but memory protection on
s390. All the device *should* ever see is non-protected memory (memory
that was previously shared by the guest).

Our protected guests start as non-protected ones, and may or may not
turn protected during their lifespan. From the device perspective,
really, nothing changes. I believe David explained this much better than
I did.

Regards,
Halil


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM
  2020-02-20 21:33 ` Michael S. Tsirkin
@ 2020-02-21 13:49   ` Halil Pasic
  2020-02-21 16:41   ` Christoph Hellwig
  1 sibling, 0 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-21 13:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Thu, 20 Feb 2020 16:33:35 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > effect The vhost code assumes it the addresses on the virtio descriptor
> > ring are not guest physical addresses but iova's, and insists on doing a
> > translation of these regardless of what transport is used (e.g. whether
> > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> > "vhost: new device IOTLB API".) On s390 this results in severe
> > performance degradation (c.a. factor 10). BTW with ccw I/O there is
> > (architecturally) no IOMMU, so the whole address translation makes no
> > sense in the context of virtio-ccw.
> 
> So it sounds like a host issue: the emulation of s390 unnecessarily complicated.
> Working around it by the guest looks wrong ...

While I do think that forcing IOMMU_PLATFORM on devices that do not
want or need it, just to trigger DMA API usage in the guest, is
conceptually wrong, I do agree that we might have a host issue. Namely,
unlike PCI, CCW does not have an IOMMU, and the spec clearly states that
"Whether accesses are actually limited or translated is described by
platform-specific means." With CCW devices we don't have address
translation by an IOMMU, and in that sense vhost is probably wrong to
try to do the translation. That is why I mentioned the patch, and that
the translation is done regardless of the transport/platform.

Regards,
Halil


> 
> > Halil Pasic (2):
> >   mm: move force_dma_unencrypted() to mem_encrypt.h
> >   virtio: let virtio use DMA API when guest RAM is protected
> > 
> >  drivers/virtio/virtio_ring.c |  3 +++
> >  include/linux/dma-direct.h   |  9 ---------
> >  include/linux/mem_encrypt.h  | 10 ++++++++++
> >  3 files changed, 13 insertions(+), 9 deletions(-)
> > 
> > 
> > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> > -- 
> > 2.17.1
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-20 16:13   ` Christoph Hellwig
  2020-02-21  2:59     ` David Gibson
@ 2020-02-21 14:33     ` Halil Pasic
  2020-02-21 16:39       ` Christoph Hellwig
  2020-02-22 19:07       ` Michael S. Tsirkin
  1 sibling, 2 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-21 14:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy

On Thu, 20 Feb 2020 17:13:09 +0100
Christoph Hellwig <hch@lst.de> wrote:

> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 867c7ebd3f10..fafc8f924955 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  	if (!virtio_has_iommu_quirk(vdev))
> >  		return true;
> >  
> > +	if (force_dma_unencrypted(&vdev->dev))
> > +		return true;
> 
> Hell no.  This is a detail of the platform DMA direct implementation.

I beg to differ. If it was a detail of the DMA direct implementation, it
should have/would have been private to kernel/dma/direct.c. 

A look at 

$ git grep -e force_dma_unencrypted
arch/powerpc/include/asm/mem_encrypt.h:static inline bool force_dma_unencrypted(struct device *dev)
arch/s390/mm/init.c:bool force_dma_unencrypted(struct device *dev)
arch/x86/mm/mem_encrypt.c:bool force_dma_unencrypted(struct device *dev)
include/linux/dma-direct.h:bool force_dma_unencrypted(struct device *dev);
include/linux/dma-direct.h:static inline bool force_dma_unencrypted(struct device *dev)
kernel/dma/direct.c:    if (force_dma_unencrypted(dev))
kernel/dma/direct.c:    if (force_dma_unencrypted(dev))
kernel/dma/direct.c:        !force_dma_unencrypted(dev)) {
kernel/dma/direct.c:    if (force_dma_unencrypted(dev))
kernel/dma/direct.c:    if (force_dma_unencrypted(dev))
kernel/dma/direct.c:        !force_dma_unencrypted(dev)) {
kernel/dma/direct.c:    if (force_dma_unencrypted(dev))

tells you that force_dma_unencrypted() is *consumed* by dma-direct, but
*provided* by the memory encryption or memory management code.
I.e. platform code tells the dma (direct) code what decisions to make
under certain circumstances.
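
For instance, the s390 provider boils down to something like this (a
from-memory sketch of the hook in arch/s390/mm/init.c, not a verbatim
quote):

	bool force_dma_unencrypted(struct device *dev)
	{
		/* a protected virtualization guest needs shared pages */
		return is_prot_virt_guest();
	}

i.e. it is purely a platform property, with no dma-direct internals in
sight.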

Consider what we would have to do to make PCI devices do I/O through
pages that were shared when the guest is running in a protected VM. The
s390_pci_dma_ops would also need to know whether to 'force dma unencrypted'
or not, and it's the exact same logic. I doubt simply using DMA direct
for zPCI would do, because we still have to do all the Z-specific IOMMU
management.

> Drivers have no business looking at this flag, and virtio finally needs
> to be fixed to use the DMA API properly for everything but legacy devices.

See the follow-on discussion with David Gibson. In short: I'm in favor
of always using the DMA API iff we keep conformance with the VIRTIO spec
and if it does not imply any degradations for s390 (virtio-ccw), or any
other regressions.

> 
> No amount of desparate hacks is going to fix that fundamental problem,
> and I'm starting to get really sick and tired of all the crap patches
> published in this area.

I don't think I deserve the offensive language.

AFAIU you have a positive attitude towards the idea that
!F_VIRTIO_PLATFORM implying 'no DMA API is used by virtio'
should be scrapped.

I would like to accomplish that without adverse effects to virtio-ccw
(because caring for virtio-ccw is a part of my job description).

Regards,
Halil


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM
  2020-02-21  6:22 ` Jason Wang
@ 2020-02-21 14:56   ` Halil Pasic
  2020-02-24  3:38     ` David Gibson
  2020-02-24  4:01     ` Jason Wang
  0 siblings, 2 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-21 14:56 UTC (permalink / raw)
  To: Jason Wang
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Christian Borntraeger,
	iommu, David Gibson, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Fri, 21 Feb 2020 14:22:26 +0800
Jason Wang <jasowang@redhat.com> wrote:

> 
> On 2020/2/21 12:06 AM, Halil Pasic wrote:
> > Currently if one intends to run a memory protection enabled VM with
> > virtio devices and linux as the guest OS, one needs to specify the
> > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> > linux use the DMA API, which in turn handles the memory
> > encryption/protection stuff if the guest decides to turn itself into
> > a protected one. This however makes no sense due to multiple reasons:
> > * The device is not changed by the fact that the guest RAM is
> > protected. The so called IOMMU bypass quirk is not affected.
> > * This usage is not congruent with  standardised semantics of
> > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > for using DMA API in virtio (orthogonal with respect to what is
> > expressed by VIRTIO_F_IOMMU_PLATFORM).
> >
> > This series aims to decouple 'have to use DMA API because my (guest) RAM
> > is protected' and 'have to use DMA API because the device told me
> > VIRTIO_F_IOMMU_PLATFORM'.
> >
> > Please find more detailed explanations about the conceptual aspects in
> > the individual patches. There is however also a very practical problem
> > that is addressed by this series.
> >
> > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > effect The vhost code assumes it the addresses on the virtio descriptor
> > ring are not guest physical addresses but iova's, and insists on doing a
> > translation of these regardless of what transport is used (e.g. whether
> > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> > "vhost: new device IOTLB API".) On s390 this results in severe
> > performance degradation (c.a. factor 10).
> 
> 
> Do you see a consistent degradation in performance, or does it only
> happen during the beginning of the test?
> 

AFAIK the degradation is consistent.

> 
> > BTW with ccw I/O there is
> > (architecturally) no IOMMU, so the whole address translation makes no
> > sense in the context of virtio-ccw.
> 
> 
> I suspect we can do an optimization on the qemu side.
> 
> E.g. send memtable entries via the IOTLB API when no vIOMMU is enabled.
> 
> If this makes sense, I can draft a patch to see if there's any difference.

Frankly, I would prefer to avoid IOVAs on the descriptor ring (and the
translation that then becomes necessary) for virtio-ccw altogether. But
Michael voiced his opinion that we should mandate F_IOMMU_PLATFORM for
devices that could be used with guests running in protected mode. I
don't share his opinion, but that's an ongoing discussion.

Should we end up having to do translation from IOVA in vhost, we are
very interested in that translation being fast and efficient.

In that sense we would be very happy to test any optimizations that aim
in that direction.

Thank you very much for your input!

Regards,
Halil

> 
> Thanks
> 
> 
> >
> > Halil Pasic (2):
> >    mm: move force_dma_unencrypted() to mem_encrypt.h
> >    virtio: let virtio use DMA API when guest RAM is protected
> >
> >   drivers/virtio/virtio_ring.c |  3 +++
> >   include/linux/dma-direct.h   |  9 ---------
> >   include/linux/mem_encrypt.h  | 10 ++++++++++
> >   3 files changed, 13 insertions(+), 9 deletions(-)
> >
> >
> > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21 13:12     ` Halil Pasic
@ 2020-02-21 15:39       ` Tom Lendacky
  2020-02-24  6:40         ` David Gibson
  2020-02-21 15:56       ` Michael S. Tsirkin
  1 sibling, 1 reply; 58+ messages in thread
From: Tom Lendacky @ 2020-02-21 15:39 UTC (permalink / raw)
  To: Halil Pasic, Michael S. Tsirkin
  Cc: linux-s390, Huang, Wei, Singh, Brijesh, Janosch Frank,
	Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On 2/21/20 7:12 AM, Halil Pasic wrote:
> On Thu, 20 Feb 2020 15:55:14 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
>> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
>>> Currently the advanced guest memory protection technologies (AMD SEV,
>>> powerpc secure guest technology and s390 Protected VMs) abuse the
>>> VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which
>>> is in turn necessary, to make IO work with guest memory protection.
>>>
>>> But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
>>> different beast: with virtio devices whose implementation runs on an SMP
>>> CPU we are still fine with doing all the usual optimizations, it is just
>>> that we need to make sure that the memory protection mechanism does not
>>> get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the
>>> side of the guest (and possibly the host side as well) than we actually
>>> need.
>>>
>>> An additional benefit of teaching the guest to make the right decision
>>> (and use DMA API) on its own is: removing the need to mandate special
>>> VM configuration for guests that may run with protection. This is
>>> especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all
>>> the virtio control structures into the first 2G of guest memory:
>>> something we don't necessarily want to do per-default.
>>>
>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>>> Tested-by: Ram Pai <linuxram@us.ibm.com>
>>> Tested-by: Michael Mueller <mimu@linux.ibm.com>
>>
>> This might work for you but it's fragile, since without
>> VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets
>> GPA's, not DMA addresses.
>>
> 
> Thanks for your constructive approach. I do want the hypervisor to
> assume it gets GPA's. My train of thought was that anyone who needs to
> use IOVA's that are not GPA's when force_dma_unencrypted() holds will
> have to specify VIRTIO_F_ACCESS_PLATFORM (at the device) anyway, because
> otherwise it won't work. But I see your point: in case of a
> mis-configuration, and provided the DMA API returns IOVA's, one could end
> up trying to touch the wrong memory locations. But this should be
> similar to what would happen if DMA ops are not used and memory is not
> made accessible.
> 
>>
>>
>> IOW this looks like another iteration of:
>>
>> 	virtio: Support encrypted memory on powerpc secure guests
>>
>> which I was under the impression was abandoned as unnecessary.
> 
> Unnecessary for powerpc because they do normal PCI. In the context of
> CCW there are only guest physical addresses (CCW I/O has no concept of
> IOMMU or IOVAs).
> 
>>
>>
>> To summarize, the necessary conditions for a hack along these lines
>> (using DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that:
>>
>>   - secure guest mode is enabled - so we know that since we don't share
>>     most memory regular virtio code won't
>>     work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM
> 
> force_dma_unencrypted(&vdev->dev) is IMHO exactly about this.
> 
>>   - DMA API is giving us addresses that are actually also physical
>>     addresses
> 
> In the case of s390 this is given. I talked with the power people before
> posting this, and they assured me they are willing to deal with
> this. I was hoping to talk about this with the AMD SEV people here (hence
> the cc).

Yes, physical addresses are fine for SEV - the key is that the DMA API is
used so that an address for unencrypted, or shared, memory is returned.
E.g. for a dma_alloc_coherent() call this is an allocation that has had
set_memory_decrypted() called or for a dma_map_page() call this is an
address from SWIOTLB, which was mapped shared during boot, where the data
will be bounce-buffered.
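
A minimal sketch of the two paths under SEV (assuming some suitable
struct device *dev and struct page *page; error handling omitted):

	/* coherent path: the allocation is made shared up front;
	 * under the covers this has done the equivalent of
	 * set_memory_decrypted() on the returned pages */
	dma_addr_t dma;
	void *va = dma_alloc_coherent(dev, size, &dma, GFP_KERNEL);

	/* streaming path: the data is bounced through SWIOTLB,
	 * which was mapped shared during boot */
	dma_addr_t h = dma_map_page(dev, page, 0, len, DMA_TO_DEVICE);
	/* ... device does I/O on h ... */
	dma_unmap_page(dev, h, len, DMA_TO_DEVICE);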

We don't currently support an emulated IOMMU in our SEV guest because that
would require a lot of support in the driver to make IOMMU data available
to the hypervisor (I/O page tables, etc.). We would need hardware support
to really make this work easily in the guest.

Thanks,
Tom

> 
>>   - Hypervisor is buggy and didn't enable VIRTIO_F_ACCESS_PLATFORM
>>
> 
> I don't get this point. The argument about the hypervisor being buggy is
> a bit hard to follow for me. If the hypervisor is buggy, we have already
> lost most of the time anyway, haven't we?
>  
>> I don't see how this patch does this.
> 
> I do get your point. I don't know of a good way to check that DMA API
> is giving us addresses that are actually physical addresses, and the
> situation you describe definitely has some risk to it.
> 
> Let me comment on other ideas that came up. I would be very happy to go
> with the best one. Thank you very much.
> 
> Regards,
> Halil
> 
>>
>>
>>> ---
>>>  drivers/virtio/virtio_ring.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>> index 867c7ebd3f10..fafc8f924955 100644
>>> --- a/drivers/virtio/virtio_ring.c
>>> +++ b/drivers/virtio/virtio_ring.c
>>> @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>>>  	if (!virtio_has_iommu_quirk(vdev))
>>>  		return true;
>>>  
>>> +	if (force_dma_unencrypted(&vdev->dev))
>>> +		return true;
>>> +
>>>  	/* Otherwise, we are left to guess. */
>>>  	/*
>>>  	 * In theory, it's possible to have a buggy QEMU-supposed
>>> -- 
>>> 2.17.1
>>
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-21 13:06           ` Halil Pasic
@ 2020-02-21 15:48             ` Michael S. Tsirkin
  2020-02-21 18:07               ` Halil Pasic
  0 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2020-02-21 15:48 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Viktor Mihajlovski, Janosch Frank, Lendacky, Thomas,
	Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, Michael Mueller, David Gibson,
	Robin Murphy, Christoph Hellwig

On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote:
> On Fri, 21 Feb 2020 14:27:27 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > > > From a user's perspective it makes absolutely perfect sense to use the
> > > > bounce buffers when they are NEEDED. 
> > > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > > really feels wrong. And obviously we have a severe performance issue
> > > > because of the indirections.
> > > 
> > > The point is that the user should not have to specify iommu_platform.
> > > We need to make sure any new hypervisor (especially one that might require
> > > bounce buffering) always sets it,
> > 
> > So, I have draft qemu patches which enable iommu_platform by default.
> > But that's really because of other problems with !iommu_platform, not
> > anything to do with bounce buffering or secure VMs.
> > 
> > The thing is that the hypervisor *doesn't* require bounce buffering.
> > In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> > *guest*'s choice to enter secure mode, so the hypervisor has no reason
> > to know whether the guest needs bounce buffering.  As far as the
> > hypervisor and qemu are concerned that's a guest internal detail, it
> > just expects to get addresses it can access whether those are GPAs
> > (iommu_platform=off) or IOVAs (iommu_platform=on).
> 
> I very much agree!
> 
> > 
> > > as was a rather bogus legacy hack
> > 
> > It was certainly a bad idea, but it was a bad idea that went into a
> > public spec and has been widely deployed for many years.  We can't
> > just pretend it didn't happen and move on.
> > 
> > Turning iommu_platform=on by default breaks old guests, some of which
> > we still care about.  We can't (automatically) do it only for guests
> > that need bounce buffering, because the hypervisor doesn't know that
> > ahead of time.
> 
> Turning iommu_platform=on for virtio-ccw makes no sense whatsoever,
> because for CCW I/O there is no such thing as an IOMMU and the addresses
> are always physical addresses.

Fix the name then. The spec calls it ACCESS_PLATFORM now, which
makes much more sense.

> > 
> > > that isn't extensibe for cases that for example require bounce buffering.
> > 
> > In fact bounce buffering isn't really the issue from the hypervisor
> > (or spec's) point of view.  It's the fact that not all of guest memory
> > is accessible to the hypervisor.  Bounce buffering is just one way the
> > guest might deal with that.
> > 
> 
> Agreed.
> 
> Regards,
> Halil
> 
> 
> 



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21 13:12     ` Halil Pasic
  2020-02-21 15:39       ` Tom Lendacky
@ 2020-02-21 15:56       ` Michael S. Tsirkin
  2020-02-21 16:35         ` Christoph Hellwig
  2020-02-21 18:03         ` Halil Pasic
  1 sibling, 2 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2020-02-21 15:56 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Fri, Feb 21, 2020 at 02:12:30PM +0100, Halil Pasic wrote:
> On Thu, 20 Feb 2020 15:55:14 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> > > Currently the advanced guest memory protection technologies (AMD SEV,
> > > powerpc secure guest technology and s390 Protected VMs) abuse the
> > > VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which
> > > is in turn necessary, to make IO work with guest memory protection.
> > > 
> > > But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
> > > different beast: with virtio devices whose implementation runs on an SMP
> > > CPU we are still fine with doing all the usual optimizations, it is just
> > > that we need to make sure that the memory protection mechanism does not
> > > get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the
> > > side of the guest (and possibly the host side as well) than we actually
> > > need.
> > > 
> > > An additional benefit of teaching the guest to make the right decision
> > > (and use DMA API) on its own is: removing the need to mandate special
> > > VM configuration for guests that may run with protection. This is
> > > especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all
> > > the virtio control structures into the first 2G of guest memory:
> > > something we don't necessarily want to do per-default.
> > > 
> > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > Tested-by: Ram Pai <linuxram@us.ibm.com>
> > > Tested-by: Michael Mueller <mimu@linux.ibm.com>
> > 
> > This might work for you but it's fragile, since without
> > VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets
> > GPA's, not DMA addresses.
> > 
> 
> Thanks for your constructive approach. I do want the hypervisor to
> assume it gets GPA's. My train of thought was that anyone who needs to
> use IOVA's that are not GPA's when force_dma_unencrypted() holds will
> have to specify VIRTIO_F_ACCESS_PLATFORM (at the device) anyway, because
> otherwise it won't work. But I see your point: in case of a
> mis-configuration, and provided the DMA API returns IOVA's, one could end
> up trying to touch the wrong memory locations. But this should be
> similar to what would happen if DMA ops are not used and memory is not
> made accessible.
> 
> > 
> > 
> > IOW this looks like another iteration of:
> > 
> > 	virtio: Support encrypted memory on powerpc secure guests
> > 
> > which I was under the impression was abandoned as unnecessary.
> 
> Unnecessary for powerpc because they do normal PCI. In the context of
> CCW there are only guest physical addresses (CCW I/O has no concept of
> IOMMU or IOVAs).
> 
> > 
> > 
> > To summarize, the necessary conditions for a hack along these lines
> > (using DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that:
> > 
> >   - secure guest mode is enabled - so we know that since we don't share
> >     most memory regular virtio code won't
> >     work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM
> 
> force_dma_unencrypted(&vdev->dev) is IMHO exactly about this.
> 
> >   - DMA API is giving us addresses that are actually also physical
> >     addresses
> 
> In the case of s390 this is given.
> I talked with the power people before
> posting this, and they assured me they are willing to deal with
> this. I was hoping to talk about this with the AMD SEV people here (hence
> the cc).

We'd need a part of DMA API that promises this though. Platform
maintainers aren't going to go out of their way to do the
right thing just for virtio, and I can't track all arches
to make sure they don't violate virtio requirements.

> 
> >   - Hypervisor is buggy and didn't enable VIRTIO_F_ACCESS_PLATFORM
> > 
> 
> I don't get this point. The argument about the hypervisor being buggy is
> a bit hard to follow for me. If the hypervisor is buggy, we have already
> lost most of the time anyway, haven't we?

If VIRTIO_F_ACCESS_PLATFORM is set then things just work.  If
VIRTIO_F_ACCESS_PLATFORM is clear, the device is supposed to have access
to all of memory.  You can argue in various ways but it's easier to just
declare a behaviour that violates this a bug. Which might still be worth
working around, for various reasons.


> > I don't see how this patch does this.
> 
> I do get your point. I don't know of a good way to check that DMA API
> is giving us addresses that are actually physical addresses, and the
> situation you describe definitely has some risk to it.

One way would be to extend the DMA API with such an API.

Another would be to make virtio always use DMA API
and hide the logic in there.
This second approach is not easy, in particular since DMA API adds
a bunch of overhead which we need to find ways to
measure and mitigate.
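
For the first approach, something along the lines of the below is what
I have in mind (name and shape purely illustrative, and it deliberately
ignores platforms with DMA offsets):

	/* hypothetical: true iff DMA mappings for this device are
	 * identity mappings, i.e. dma addr == physical addr */
	static inline bool dma_addr_is_phys_addr(struct device *dev)
	{
		/* no ops installed means dma-direct */
		return get_dma_ops(dev) == NULL;
	}

virtio could then gate the hack on secure guest mode *and* this check,
instead of guessing.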


> 
> Let me comment on other ideas that came up. I would be very happy to go
> with the best one. Thank you very much.
> 
> Regards,
> Halil
> 
> > 
> > 
> > > ---
> > >  drivers/virtio/virtio_ring.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index 867c7ebd3f10..fafc8f924955 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  	if (!virtio_has_iommu_quirk(vdev))
> > >  		return true;
> > >  
> > > +	if (force_dma_unencrypted(&vdev->dev))
> > > +		return true;
> > > +
> > >  	/* Otherwise, we are left to guess. */
> > >  	/*
> > >  	 * In theory, it's possible to have a buggy QEMU-supposed
> > > -- 
> > > 2.17.1
> > 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21 15:56       ` Michael S. Tsirkin
@ 2020-02-21 16:35         ` Christoph Hellwig
  2020-02-21 18:03         ` Halil Pasic
  1 sibling, 0 replies; 58+ messages in thread
From: Christoph Hellwig @ 2020-02-21 16:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, Christian Borntraeger, iommu, David Gibson,
	Michael Mueller, Viktor Mihajlovski, Robin Murphy,
	Christoph Hellwig

On Fri, Feb 21, 2020 at 10:56:45AM -0500, Michael S. Tsirkin wrote:
> > >   - DMA API is giving us addresses that are actually also physical
> > >     addresses
> > 
> > In the case of s390 this is given.
> > I talked with the power people before
> > posting this, and they assured me they are willing to deal with
> > this. I was hoping to talk about this with the AMD SEV people here (hence
> > the cc).
> 
> We'd need a part of DMA API that promises this though. Platform
> maintainers aren't going to go out of their way to do the
> right thing just for virtio, and I can't track all arches
> to make sure they don't violate virtio requirements.

There is no way the DMA API is going to promise you that.  All kinds
of platforms have offsets to DMA, and they are communicated through
interfaces like the device tree or ACPI.  But the good thing is that
your hypervisor is in control of that; neither the DMA API nor
virtio has any business interfering with that.  In fact, for hardware
virtio devices we might need such an offset for them to work on
lots of SOCs.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21  2:59     ` David Gibson
  2020-02-21  3:41       ` Jason Wang
  2020-02-21 13:27       ` Halil Pasic
@ 2020-02-21 16:36       ` Christoph Hellwig
  2020-02-24  6:50         ` David Gibson
  2020-02-24 18:59         ` Halil Pasic
  2 siblings, 2 replies; 58+ messages in thread
From: Christoph Hellwig @ 2020-02-21 16:36 UTC (permalink / raw)
  To: David Gibson
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, Christian Borntraeger, iommu, Michael Mueller,
	Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy,
	Christoph Hellwig

On Fri, Feb 21, 2020 at 01:59:15PM +1100, David Gibson wrote:
> > Hell no.  This is a detail of the platform DMA direct implementation.
> > Drivers have no business looking at this flag, and virtio finally needs
> > to be fixed to use the DMA API properly for everything but legacy devices.
> 
> So, this patch definitely isn't right as it stands, but I'm struggling
> to understand what it is you're saying is the right way.
> 
> By "legacy devices" I assume you mean pre-virtio-1.0 devices, that
> lack the F_VERSION_1 feature flag.  Is that right?  Because I don't
> see how being a legacy device or not relates to use of the DMA API.

No.   "legacy" is anything that does not set F_ACCESS_PLATFORM.

> I *think* what you are suggesting here is that virtio devices that
> have !F_IOMMU_PLATFORM should have their dma_ops set up so that the
> DMA API treats IOVA==PA, which will satisfy what the device expects.
> Then the virtio driver can use the DMA API the same way for both
> F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices.

No.  Those should just keep using the existing legacy non-dma ops
case and be done with it.  No changes to that and most certainly
no new features.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21 14:33     ` Halil Pasic
@ 2020-02-21 16:39       ` Christoph Hellwig
  2020-02-21 18:16         ` Halil Pasic
  2020-02-22 19:07       ` Michael S. Tsirkin
  1 sibling, 1 reply; 58+ messages in thread
From: Christoph Hellwig @ 2020-02-21 16:39 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy,
	Christoph Hellwig

On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote:
> > Hell no.  This is a detail of the platform DMA direct implementation.
> 
> I beg to differ. If it was a detail of the DMA direct implementation, it
> should have/would have been private to kernel/dma/direct.c. 

It can't be, given that platforms have to implement it.  It is an arch
hook for dma-direct.

> Consider what would we have to do to make PCI devices do I/O trough
> pages that were shared when the guest is running in a protected VM. The
> s390_pci_dma_ops would also need to know whether to 'force dma uencrypted'
> or not, and it's the exact same logic. I doubt simply using DMA direct
> for zPCI would do, because we still have to do all the Z specific IOMMU
> management.

And your IOMMU can't deal with the encryption bit?  In that case we
could think of allowing the IOMMU implementation to access it.  But the
point is that it is an internal detail of the DMA implementation and by
no means for drivers.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM
  2020-02-20 21:33 ` Michael S. Tsirkin
  2020-02-21 13:49   ` Halil Pasic
@ 2020-02-21 16:41   ` Christoph Hellwig
  2020-02-24  5:44     ` David Gibson
  1 sibling, 1 reply; 58+ messages in thread
From: Christoph Hellwig @ 2020-02-21 16:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, Christian Borntraeger, iommu, David Gibson,
	Michael Mueller, Viktor Mihajlovski, Robin Murphy,
	Christoph Hellwig

On Thu, Feb 20, 2020 at 04:33:35PM -0500, Michael S. Tsirkin wrote:
> So it sounds like a host issue: the emulation of s390 unnecessarily complicated.
> Working around it by the guest looks wrong ...

Yes.  If your host (and I don't care if you split hypervisor, ultravisor
and megavisor out in your implementation) wants to support a VM
architecture where the host can't access all guest memory, you need to
ensure the DMA API is used.  Extra points for simply always setting the
flag and thus future-proofing the scheme.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21 15:56       ` Michael S. Tsirkin
  2020-02-21 16:35         ` Christoph Hellwig
@ 2020-02-21 18:03         ` Halil Pasic
  1 sibling, 0 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-21 18:03 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Fri, 21 Feb 2020 10:56:45 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Fri, Feb 21, 2020 at 02:12:30PM +0100, Halil Pasic wrote:
> > On Thu, 20 Feb 2020 15:55:14 -0500
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
[..]
> > > To summarize, the necessary conditions for a hack along these lines
> > > (using DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that:
> > > 
> > >   - secure guest mode is enabled - so we know that since we don't share
> > >     most memory regular virtio code won't
> > >     work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM
> > 
> > force_dma_unencrypted(&vdev->dev) is IMHO exactly about this.
> > 
> > >   - DMA API is giving us addresses that are actually also physical
> > >     addresses
> > 
> > In the case of s390 this is given.
> > I talked with the power people before
> > posting this, and they assured me they are willing to deal with
> > this. I was hoping to talk about this with the AMD SEV people here (hence
> > the cc).
> 
> We'd need a part of DMA API that promises this though. Platform
> maintainers aren't going to go out of their way to do the
> right thing just for virtio, and I can't track all arches
> to make sure they don't violate virtio requirements.
> 

One would have to track only the arches that have
force_dma_unencrypted(), and generally speaking the arches should make
sure the DMA ops are suitable, whatever that means in the given context.

> > 
> > >   - Hypervisor is buggy and didn't enable VIRTIO_F_ACCESS_PLATFORM
> > > 
> > 
> > I don't get this point. The argument about the hypervisor being buggy is
> > a bit hard to follow for me. If the hypervisor is buggy, we have already
> > lost most of the time anyway, haven't we?
> 
> If VIRTIO_F_ACCESS_PLATFORM is set then things just work.  If
> VIRTIO_F_ACCESS_PLATFORM is clear, the device is supposed to have access
> to all of memory.  You can argue in various ways but it's easier to just
> declare a behaviour that violates this a bug. Which might still be worth
> working around, for various reasons.
> 


I don't agree. The spec explicitly states "If this feature bit is set
to 0, then the device has same access to memory addresses supplied to it
as the driver has." That is not just any guest memory, but the addresses
supplied to it. BTW this is how channel I/O works as well: the channel
program authorizes the device to use the memory locations specified by
the channel program, for as long as the channel program is being executed.
That's how channel I/O does isolation without an architected IOMMU.

Can you please show me the words in the specification that say the
device has to have access to the entire guest memory all the time?

Maybe I have to understand all the intentions behind
VIRTIO_F_ACCESS_PLATFORM better. I've read the spec several times, but I
still have the feeling, whenever we discuss it, that I didn't get it right.
IOMMUs and PCI-style DMA are unfortunately not my bread and butter.

I only know that the devices do not need any new device capability (I
assume VIRTIO_F_ACCESS_PLATFORM does express a device capability) to
work with protected virtualization. Unless one defines
VIRTIO_F_ACCESS_PLATFORM as the capability that the device won't poke
*arbitrary* guest memory.  From that perspective mandating the flag
feels wrong. (CCW devices are never allowed to poke arbitrary
memory.)

Yet, if VIRTIO_F_ACCESS_PLATFORM is a flag that every nice and modern
virtio device should have (i.e. a lack of it means kinda broken), then I
have the feeling virtio-ccw should probably evolve in the direction that
having VIRTIO_F_ACCESS_PLATFORM set does not hurt.

I have to think some more.

> 
> > > I don't see how this patch does this.
> > 
> > I do get your point. I don't know of a good way to check that DMA API
> > is giving us addresses that are actually physical addresses, and the
> > situation you describe definitely has some risk to it.
> 
> One way would be to extend the DMA API with such an API.

Seems Christoph does not like that idea.

> 
> Another would be to make virtio always use DMA API
> and hide the logic in there.

I thought Christoph wanted that, but I was wrong.

> This second approach is not easy, in particular since DMA API adds
> a bunch of overhead which we need to find ways to
> measure and mitigate.
> 

Agreed. From the s390 perspective, I think it ain't too bad if we get GFP_DMA.

Many thanks for your patience!
Halil

[..]


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-21 15:48             ` Michael S. Tsirkin
@ 2020-02-21 18:07               ` Halil Pasic
  2020-02-24  3:33                 ` David Gibson
  0 siblings, 1 reply; 58+ messages in thread
From: Halil Pasic @ 2020-02-21 18:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Viktor Mihajlovski, Janosch Frank, Lendacky, Thomas,
	Jason Wang, Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, Michael Mueller, David Gibson,
	Robin Murphy, Christoph Hellwig

On Fri, 21 Feb 2020 10:48:15 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote:
> > On Fri, 21 Feb 2020 14:27:27 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> > 
> > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > > > > From a user's perspective it makes absolutely perfect sense to use the
> > > > > bounce buffers when they are NEEDED. 
> > > > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > > > really feels wrong. And obviously we have a severe performance issue
> > > > > because of the indirections.
> > > > 
> > > > The point is that the user should not have to specify iommu_platform.
> > > > We need to make sure any new hypervisor (especially one that might require
> > > > bounce buffering) always sets it,
> > > 
> > > So, I have draft qemu patches which enable iommu_platform by default.
> > > But that's really because of other problems with !iommu_platform, not
> > > anything to do with bounce buffering or secure VMs.
> > > 
> > > The thing is that the hypervisor *doesn't* require bounce buffering.
> > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> > > *guest*'s choice to enter secure mode, so the hypervisor has no reason
> > > to know whether the guest needs bounce buffering.  As far as the
> > > hypervisor and qemu are concerned that's a guest internal detail, it
> > > just expects to get addresses it can access whether those are GPAs
> > > (iommu_platform=off) or IOVAs (iommu_platform=on).
> > 
> > I very much agree!
> > 
> > > 
> > > > as was a rather bogus legacy hack
> > > 
> > > It was certainly a bad idea, but it was a bad idea that went into a
> > > public spec and has been widely deployed for many years.  We can't
> > > just pretend it didn't happen and move on.
> > > 
> > > Turning iommu_platform=on by default breaks old guests, some of which
> > > we still care about.  We can't (automatically) do it only for guests
> > > that need bounce buffering, because the hypervisor doesn't know that
> > > ahead of time.
> > 
> > Turning iommu_platform=on for virtio-ccw makes no sense whatsoever,
> > because for CCW I/O there is no such thing as an IOMMU and the addresses
> > are always physical addresses.
> 
> Fix the name then. The spec calls it ACCESS_PLATFORM now, which
> makes much more sense.

I don't quite get it. Sorry. Maybe I will revisit this later.

Regards,
Halil

> 
> > > 
> > > > that isn't extensibe for cases that for example require bounce buffering.
> > > 
> > > In fact bounce buffering isn't really the issue from the hypervisor
> > > (or spec's) point of view.  It's the fact that not all of guest memory
> > > is accessible to the hypervisor.  Bounce buffering is just one way the
> > > guest might deal with that.
> > > 
> > 
> > Agreed.
> > 
> > Regards,
> > Halil
> > 
> > 
> > 
> 
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21 16:39       ` Christoph Hellwig
@ 2020-02-21 18:16         ` Halil Pasic
  0 siblings, 0 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-21 18:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy

On Fri, 21 Feb 2020 17:39:38 +0100
Christoph Hellwig <hch@lst.de> wrote:

> On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote:
> > > Hell no.  This is a detail of the platform DMA direct implementation.
> > 
> > I beg to differ. If it was a detail of the DMA direct implementation, it
> > should have/would have been private to kernel/dma/direct.c. 
> 
> It can't be, given that platforms have to implement it.  It is an arch
> hook for dma-direct.
> 
> > Consider what we would have to do to make PCI devices do I/O through
> > pages that were shared when the guest is running in a protected VM. The
> > s390_pci_dma_ops would also need to know whether to 'force dma unencrypted'
> > or not, and it's the exact same logic. I doubt simply using DMA direct
> > for zPCI would do, because we still have to do all the Z-specific IOMMU
> > management.
> 
> And your IOMMU can't deal with the encryption bit?

There is no encrypt bit, and our memory is not encrypted, but protected.
That means, e.g., when a buggy/malicious hypervisor tries to read a
protected page it won't get ciphertext, but a slap on its fingers. In
order to make memory accessible to the hypervisor (or another guest, or
a real device) the guest must make a so-called ultravisor call (talk to
the firmware) and share the respective page.
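
On s390 this sharing is what sits behind the set_memory_decrypted()
hook the DMA code uses. Roughly (a from-memory sketch of
arch/s390/mm/init.c, not a verbatim quote):

	int set_memory_decrypted(unsigned long addr, int numpages)
	{
		int i;

		/* make the pages shared with the hypervisor */
		for (i = 0; i < numpages; ++i) {
			uv_set_shared(addr);
			addr += PAGE_SIZE;
		}
		return 0;
	}

with set_memory_encrypted() doing the inverse via uv_remove_shared().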

We tapped into the memory encryption infrastructure, because both
protect the guest memory from the host (just by different means), and
because it made no sense to build up something in parallel when most of
the stuff we need was already there. But most unfortunately the names
are deceiving when it comes to s390 protected virtualization and its
guest I/O enablement.


>  In that case we
> could think of allowing the IOMMU implementation to access it.  But the
> point is that it is an internal detail of the DMA implementation and by
> no means for drivers.

From the perspective that any driver doing anything remotely DMAish,
that is, having some external entity (possibly a hypervisor, possibly a
channel subsystem, possibly a DMA controller) touch the memory, should
go through the DMA API first, to make sure the DMAish goes well, your
argument makes perfect sense. But from that perspective
!F_ACCESS_PLATFORM is also DMAish. And the virtio spec mandates that
!F_ACCESS_PLATFORM implies GPA's.

For virtio-ccw I want GPA's and not IOVA's on s390; for virtio-pci,
which we also support in general but not with protected virtualization,
well, it's a different story.

With protected virtualization however I must make sure all I/O goes
through shared pages. We use swiotlb for that. But then the old
infrastructure won't cut it. Yet we still need GPA's on the ring (with
the extra requirement that the page must be shared).

The DMA API is a nice fit there because we can allocate DMA coherent
memory (such that what comes back from our DMA ops is a GPA), so we have
shared memory that the hypervisor and the guest are allowed to look at
concurrently, and for the buffers that are going to be put on the vring,
we can use the streaming API, which uses bounce buffers. The returned
IOVA (in DMA API speak) is a GPA of the bounce buffer, and the guest is
not allowed to peek until it unmaps, so everything is cozy. But for that
to work, we all (AMD SEV, POWER, and s390) must go through the DMA API,
because the old infrastructure in virtio core simply won't cut it. And
it has nothing to do with the device. David explained it very well.
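
As a concrete sketch of that pattern in terms of the generic DMA API
(these are the standard kernel calls; dev, buf and the sizes are made
up for illustration):

	dma_addr_t ring_gpa, gpa;
	void *ring;

	/* coherent allocation: memory that is shared up front; on s390
	 * the returned dma handle is simply a GPA */
	ring = dma_alloc_coherent(dev, ring_size, &ring_gpa, GFP_KERNEL);

	/* streaming mapping for a buffer the device will fill: the
	 * handle is the GPA of a shared swiotlb bounce slot */
	gpa = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
	/* ... put gpa on the vring, kick the device, wait ... */
	dma_unmap_single(dev, gpa, len, DMA_FROM_DEVICE);
	/* only now has the data been bounced back into buf */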

My series is about controlling virtio core's usage of the DMA API. I
believe I did it in a way that doesn't hurt any arch at the moment.

Maybe the conflict can be resolved if the transport gets a say in
whether to use the DMA API or not. In the end the VIRTIO spec does say
that "Whether accesses are actually limited or translated is described
by platform-specific means."
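
For reference, with patch 2/2 applied the deciding hook in virtio core
looks roughly like this (condensed sketch of vring_use_dma_api() in
drivers/virtio/virtio_ring.c):

	static bool vring_use_dma_api(struct virtio_device *vdev)
	{
		/* device negotiated ACCESS_PLATFORM: must use DMA API */
		if (!virtio_has_iommu_quirk(vdev))
			return true;

		/* patch 2/2: protected/encrypted guest RAM also forces it */
		if (force_dma_unencrypted(&vdev->dev))
			return true;

		/* Xen guests always need it, whatever the device says */
		if (xen_domain())
			return true;

		return false;
	}

A transport hook, as suggested above, would add one more input to this
decision.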

Regards,
Halil



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21 14:33     ` Halil Pasic
  2020-02-21 16:39       ` Christoph Hellwig
@ 2020-02-22 19:07       ` Michael S. Tsirkin
  2020-02-24 17:16         ` Christoph Hellwig
  1 sibling, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2020-02-22 19:07 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote:
> AFAIU you have a positive attitude towards the idea that
> '!F_ACCESS_PLATFORM implies no DMA API is used by virtio'
> should be scrapped.
> 
> I would like to accomplish that without adverse effects on virtio-ccw
> (because caring for virtio-ccw is part of my job description).
> 
> Regards,
> Halil

It is possible, in theory. IIRC the main challenge is that the DMA API
has the overhead of indirect function calls even when all it
does is return the PA back without changes. So that will lead to
a measurable performance degradation. That might be fixable,
possibly using some kind of logic along the lines of
	if (iova is pa)
		return pa
	else
		indirect call

and for unmapping, we might need an API that says "unmap
is a nop, safe to skip" so we don't have to maintain the
dma address until unmap.
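
Sketched in C, with a hypothetical predicate (no such API exists
today), that fast path could look like:

	/* dma_addr_is_phys_addr() is made up: it would return true when
	 * the DMA API is known to be an identity mapping for this device */
	static dma_addr_t map_buf(struct device *dev, void *buf,
				  size_t len, enum dma_data_direction dir)
	{
		if (dma_addr_is_phys_addr(dev))
			return virt_to_phys(buf);	/* iova == pa */
		return dma_map_single(dev, buf, len, dir); /* indirect call */
	}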


-- 
MST


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-21 18:07               ` Halil Pasic
@ 2020-02-24  3:33                 ` David Gibson
  2020-02-24 18:49                   ` Halil Pasic
  0 siblings, 1 reply; 58+ messages in thread
From: David Gibson @ 2020-02-24  3:33 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig



On Fri, Feb 21, 2020 at 07:07:02PM +0100, Halil Pasic wrote:
> On Fri, 21 Feb 2020 10:48:15 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote:
> > > On Fri, 21 Feb 2020 14:27:27 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > 
> > > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> > > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > > > > > From a user's perspective it makes absolutely perfect sense to use the
> > > > > > bounce buffers when they are NEEDED. 
> > > > > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > > > > really feels wrong. And obviously we have a severe performance issue
> > > > > > because of the indirections.
> > > > > 
> > > > > The point is that the user should not have to specify iommu_platform.
> > > > > We need to make sure any new hypervisor (especially one that might require
> > > > > bounce buffering) always sets it,
> > > > 
> > > > So, I have draft qemu patches which enable iommu_platform by default.
> > > > But that's really because of other problems with !iommu_platform, not
> > > > anything to do with bounce buffering or secure VMs.
> > > > 
> > > > The thing is that the hypervisor *doesn't* require bounce buffering.
> > > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> > > > *guest*'s choice to enter secure mode, so the hypervisor has no reason
> > > > to know whether the guest needs bounce buffering.  As far as the
> > > > hypervisor and qemu are concerned that's a guest internal detail, it
> > > > just expects to get addresses it can access whether those are GPAs
> > > > (iommu_platform=off) or IOVAs (iommu_platform=on).
> > > 
> > > I very much agree!
> > > 
> > > > 
> > > > > as was a rather bogus legacy hack
> > > > 
> > > > It was certainly a bad idea, but it was a bad idea that went into a
> > > > public spec and has been widely deployed for many years.  We can't
> > > > just pretend it didn't happen and move on.
> > > > 
> > > > Turning iommu_platform=on by default breaks old guests, some of which
> > > > we still care about.  We can't (automatically) do it only for guests
> > > > that need bounce buffering, because the hypervisor doesn't know that
> > > > ahead of time.
> > > 
> > > Turning iommu_platform=on for virtio-ccw makes no sense whatsover,
> > > because for CCW I/O there is no such thing as IOMMU and the addresses
> > > are always physical addresses.
> > 
> > Fix the name then. The spec calls is ACCESS_PLATFORM now, which
> > makes much more sense.
> 
> I don't quite get it. Sorry. Maybe I will revisit this later.

Halil, I think I can clarify this.

The "iommu_platform" flag doesn't necessarily have anything to do with
an iommu, although it often will.  Basically it means "access guest
memory via the bus's normal DMA mechanism" rather than "access guest
memory using GPA, because you're the hypervisor and you can do that".

For the case of ccw, both mechanisms end up being the same thing,
since CCW's normal DMA *is* untranslated GPA access.

For this reason, the flag in the spec was renamed to ACCESS_PLATFORM,
but the flag in qemu still has the old name.

AIUI, Michael is saying you could trivially change the name in qemu
(obviously you'd need to alias the old name to the new one for
compatibility).


Actually, the fact that ccw has no translation makes things easier for
you: you don't really have any impediment to turning ACCESS_PLATFORM
on by default, since it doesn't make any real change to how things
work.

The remaining difficulty is that the virtio driver - since it can sit
on multiple buses - won't know this, and will reject the
ACCESS_PLATFORM flag, even though it could just do what it normally
does on ccw and it would work.

For that case, we could consider a hack in qemu where for virtio-ccw
devices *only* we allow the guest to nack the ACCESS_PLATFORM flag and
carry on anyway.  Normally we insist that the guest accept the
ACCESS_PLATFORM flag if offered, because on most platforms they
*don't* amount to the same thing.
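
A sketch of that hack, based on the existing feature check in qemu's
virtio_validate_features() (hw/virtio/virtio.c); the per-transport
ignore_access_platform_nack flag is made up for illustration:

	if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM) &&
	    !virtio_vdev_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM) &&
	    !vdev->ignore_access_platform_nack) { /* set by ccw only */
		return -EFAULT;
	}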

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM
  2020-02-21 14:56   ` Halil Pasic
@ 2020-02-24  3:38     ` David Gibson
  2020-02-24  4:01     ` Jason Wang
  1 sibling, 0 replies; 58+ messages in thread
From: David Gibson @ 2020-02-24  3:38 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig



On Fri, Feb 21, 2020 at 03:56:02PM +0100, Halil Pasic wrote:
> On Fri, 21 Feb 2020 14:22:26 +0800
> Jason Wang <jasowang@redhat.com> wrote:
> 
> > 
> > On 2020/2/21 12:06 AM, Halil Pasic wrote:
> > > Currently if one intends to run a memory protection enabled VM with
> > > virtio devices and linux as the guest OS, one needs to specify the
> > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> > > linux use the DMA API, which in turn handles the memory
> > > encryption/protection stuff if the guest decides to turn itself into
> > > a protected one. This however makes no sense due to multiple reasons:
> > > * The device is not changed by the fact that the guest RAM is
> > > protected. The so called IOMMU bypass quirk is not affected.
> > > * This usage is not congruent with  standardised semantics of
> > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > > for using DMA API in virtio (orthogonal with respect to what is
> > > expressed by VIRTIO_F_IOMMU_PLATFORM).
> > >
> > > This series aims to decouple 'have to use DMA API because my (guest) RAM
> > > is protected' and 'have to use DMA API because the device told me
> > > VIRTIO_F_IOMMU_PLATFORM'.
> > >
> > > Please find more detailed explanations about the conceptual aspects in
> > > the individual patches. There is however also a very practical problem
> > > that is addressed by this series.
> > >
> > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > > effect The vhost code assumes it the addresses on the virtio descriptor
> > > ring are not guest physical addresses but iova's, and insists on doing a
> > > translation of these regardless of what transport is used (e.g. whether
> > > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> > > "vhost: new device IOTLB API".) On s390 this results in severe
> > > performance degradation (c.a. factor 10).
> > 
> > 
> > Do you see a consistent degradation on the performance, or it only 
> > happen when for during the beginning of the test?
> > 
> 
> AFAIK the degradation is consistent.
> 
> > 
> > > BTW with ccw I/O there is
> > > (architecturally) no IOMMU, so the whole address translation makes no
> > > sense in the context of virtio-ccw.
> > 
> > 
> > I suspect we can do optimization in qemu side.
> > 
> > E.g send memtable entry via IOTLB API when vIOMMU is not enabled.
> > 
> > If this makes sense, I can draft patch to see if there's any difference.
> 
> Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
> then necessary translation) for virtio-ccw altogether. But Michael
> voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
> that could be used with guests running in protected mode. I don't share
> his opinion, but that's an ongoing discussion.

I'm a bit confused by this.  For the ccw specific case,
F_ACCESS_PLATFORM shouldn't have any impact: for you, IOVA == GPA so
everything is easy.

> Should we end up having to do translation from IOVA in vhost, we are
> very interested in that translation being fast and efficient.
> 
> In that sense we would be very happy to test any optimization that aim
> into that direction.
> 
> Thank you very much for your input!
> 
> Regards,
> Halil
> 
> > 
> > Thanks
> > 
> > 
> > >
> > > Halil Pasic (2):
> > >    mm: move force_dma_unencrypted() to mem_encrypt.h
> > >    virtio: let virtio use DMA API when guest RAM is protected
> > >
> > >   drivers/virtio/virtio_ring.c |  3 +++
> > >   include/linux/dma-direct.h   |  9 ---------
> > >   include/linux/mem_encrypt.h  | 10 ++++++++++
> > >   3 files changed, 13 insertions(+), 9 deletions(-)
> > >
> > >
> > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM
  2020-02-21 14:56   ` Halil Pasic
  2020-02-24  3:38     ` David Gibson
@ 2020-02-24  4:01     ` Jason Wang
  2020-02-24  6:06       ` Michael S. Tsirkin
  1 sibling, 1 reply; 58+ messages in thread
From: Jason Wang @ 2020-02-24  4:01 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Christian Borntraeger,
	iommu, David Gibson, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig



On 2020/2/21 10:56 PM, Halil Pasic wrote:
> On Fri, 21 Feb 2020 14:22:26 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> On 2020/2/21 12:06 AM, Halil Pasic wrote:
>>> Currently if one intends to run a memory protection enabled VM with
>>> virtio devices and linux as the guest OS, one needs to specify the
>>> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
>>> linux use the DMA API, which in turn handles the memory
>>> encryption/protection stuff if the guest decides to turn itself into
>>> a protected one. This however makes no sense due to multiple reasons:
>>> * The device is not changed by the fact that the guest RAM is
>>> protected. The so called IOMMU bypass quirk is not affected.
>>> * This usage is not congruent with  standardised semantics of
>>> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
>>> for using DMA API in virtio (orthogonal with respect to what is
>>> expressed by VIRTIO_F_IOMMU_PLATFORM).
>>>
>>> This series aims to decouple 'have to use DMA API because my (guest) RAM
>>> is protected' and 'have to use DMA API because the device told me
>>> VIRTIO_F_IOMMU_PLATFORM'.
>>>
>>> Please find more detailed explanations about the conceptual aspects in
>>> the individual patches. There is however also a very practical problem
>>> that is addressed by this series.
>>>
>>> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
>>> effect The vhost code assumes it the addresses on the virtio descriptor
>>> ring are not guest physical addresses but iova's, and insists on doing a
>>> translation of these regardless of what transport is used (e.g. whether
>>> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
>>> "vhost: new device IOTLB API".) On s390 this results in severe
>>> performance degradation (c.a. factor 10).
>>
>> Do you see a consistent degradation on the performance, or it only
>> happen when for during the beginning of the test?
>>
> AFAIK the degradation is consistent.
>
>>> BTW with ccw I/O there is
>>> (architecturally) no IOMMU, so the whole address translation makes no
>>> sense in the context of virtio-ccw.
>>
>> I suspect we can do optimization in qemu side.
>>
>> E.g send memtable entry via IOTLB API when vIOMMU is not enabled.
>>
>> If this makes sense, I can draft patch to see if there's any difference.
> Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
> then necessary translation) for virtio-ccw altogether. But Michael
> voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
> that could be used with guests running in protected mode. I don't share
> his opinion, but that's an ongoing discussion.
>
> Should we end up having to do translation from IOVA in vhost, we are
> very interested in that translation being fast and efficient.
>
> In that sense we would be very happy to test any optimization that aim
> into that direction.
>
> Thank you very much for your input!


Using the IOTLB API on a platform without IOMMU support is not
intended. Please try the attached patch to see if it helps.

Thanks


>
> Regards,
> Halil
>
>> Thanks
>>
>>
>>> Halil Pasic (2):
>>>     mm: move force_dma_unencrypted() to mem_encrypt.h
>>>     virtio: let virtio use DMA API when guest RAM is protected
>>>
>>>    drivers/virtio/virtio_ring.c |  3 +++
>>>    include/linux/dma-direct.h   |  9 ---------
>>>    include/linux/mem_encrypt.h  | 10 ++++++++++
>>>    3 files changed, 13 insertions(+), 9 deletions(-)
>>>
>>>
>>> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2

[-- Attachment #2: 0001-virtio-turn-on-IOMMU_PLATFORM-properly.patch --]
[-- Type: text/x-patch, Size: 1710 bytes --]

From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang@redhat.com>
Date: Mon, 24 Feb 2020 12:00:10 +0800
Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly

When transport does not support IOMMU, we should clear IOMMU_PLATFORM
even if the device and vhost claims to support that. This help to
avoid the performance overhead caused by unnecessary IOTLB miss/update
transactions on such platform.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/virtio-bus.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index d6332d45c3..2741b9fdd2 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
     VirtioBusState *bus = VIRTIO_BUS(qbus);
     VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
     VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
-    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
     Error *local_err = NULL;
 
     DPRINTF("%s: plug device.\n", qbus->name);
@@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
         return;
     }
 
-    if (klass->get_dma_as != NULL && has_iommu) {
-        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
+    if (false && klass->get_dma_as != NULL &&
+        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
         vdev->dma_as = klass->get_dma_as(qbus->parent);
     } else {
+        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
         vdev->dma_as = &address_space_memory;
     }
 }
-- 
2.19.1



^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM
  2020-02-21 16:41   ` Christoph Hellwig
@ 2020-02-24  5:44     ` David Gibson
  0 siblings, 0 replies; 58+ messages in thread
From: David Gibson @ 2020-02-24  5:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, Christian Borntraeger, iommu, Michael Mueller,
	Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy



On Fri, Feb 21, 2020 at 05:41:51PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 04:33:35PM -0500, Michael S. Tsirkin wrote:
> > So it sounds like a host issue: the emulation of s390 unnecessarily complicated.
> > Working around it by the guest looks wrong ...
> 
> Yes.  If your host (and I don't care if you split hypervisor,
> ultravisor and megavisor out in your implementation) wants to
> support a VM architecture where the host can't access all guest
> memory you need to ensure the DMA API is used.  Extra points for
> simply always setting the flag and thus future proofing the scheme.

Moving towards F_ACCESS_PLATFORM everywhere is a good idea (for other
reasons), but that doesn't make the problem as it exists right now go
away.

But, "you need to ensure the DMA API is used" makes no sense from the
host point of view.  The existence of the DMA API is an entirely guest
side, and Linux specific detail, the host can't make decisions based
on that.

For POWER - possibly s390 as well - the hypervisor has no way of
knowing at machine construction time whether it will be an old kernel
(or non Linux OS) which can't support F_ACCESS_PLATFORM, or a guest
which will enter secure mode and therefore requires F_ACCESS_PLATFORM
(according to you).  That's the fundamental problem here.

The normal virtio model of features that the guest can optionally
accept would work nicely here - except that it wouldn't work for the
case of hardware virtio devices, where the access limitations come
from the "host" (platform) side and therefore can't be disabled by that
host.

We really do have two cases here: 1) access restrictions originating
with the host/platform (e.g. hardware virtio) and 2) access
restrictions originating with the guest (e.g. secure VMs).  What we
need to do to deal with them is basically the same at the driver
level, but it has subtle and important differences at the platform
level.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM
  2020-02-24  4:01     ` Jason Wang
@ 2020-02-24  6:06       ` Michael S. Tsirkin
  2020-02-24  6:45         ` Jason Wang
  0 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2020-02-24  6:06 UTC (permalink / raw)
  To: Jason Wang
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote:
> 
> On 2020/2/21 10:56 PM, Halil Pasic wrote:
> > On Fri, 21 Feb 2020 14:22:26 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> > 
> > > On 2020/2/21 12:06 AM, Halil Pasic wrote:
> > > > Currently if one intends to run a memory protection enabled VM with
> > > > virtio devices and linux as the guest OS, one needs to specify the
> > > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> > > > linux use the DMA API, which in turn handles the memory
> > > > encryption/protection stuff if the guest decides to turn itself into
> > > > a protected one. This however makes no sense due to multiple reasons:
> > > > * The device is not changed by the fact that the guest RAM is
> > > > protected. The so called IOMMU bypass quirk is not affected.
> > > > * This usage is not congruent with  standardised semantics of
> > > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > > > for using DMA API in virtio (orthogonal with respect to what is
> > > > expressed by VIRTIO_F_IOMMU_PLATFORM).
> > > > 
> > > > This series aims to decouple 'have to use DMA API because my (guest) RAM
> > > > is protected' and 'have to use DMA API because the device told me
> > > > VIRTIO_F_IOMMU_PLATFORM'.
> > > > 
> > > > Please find more detailed explanations about the conceptual aspects in
> > > > the individual patches. There is however also a very practical problem
> > > > that is addressed by this series.
> > > > 
> > > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > > > effect The vhost code assumes it the addresses on the virtio descriptor
> > > > ring are not guest physical addresses but iova's, and insists on doing a
> > > > translation of these regardless of what transport is used (e.g. whether
> > > > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> > > > "vhost: new device IOTLB API".) On s390 this results in severe
> > > > performance degradation (c.a. factor 10).
> > > 
> > > Do you see a consistent degradation on the performance, or it only
> > > happen when for during the beginning of the test?
> > > 
> > AFAIK the degradation is consistent.
> > 
> > > > BTW with ccw I/O there is
> > > > (architecturally) no IOMMU, so the whole address translation makes no
> > > > sense in the context of virtio-ccw.
> > > 
> > > I suspect we can do optimization in qemu side.
> > > 
> > > E.g send memtable entry via IOTLB API when vIOMMU is not enabled.
> > > 
> > > If this makes sense, I can draft patch to see if there's any difference.
> > Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
> > then necessary translation) for virtio-ccw altogether. But Michael
> > voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
> > that could be used with guests running in protected mode. I don't share
> > his opinion, but that's an ongoing discussion.
> > 
> > Should we end up having to do translation from IOVA in vhost, we are
> > very interested in that translation being fast and efficient.
> > 
> > In that sense we would be very happy to test any optimization that aim
> > into that direction.
> > 
> > Thank you very much for your input!
> 
> 
> Using IOTLB API on platform without IOMMU support is not intended. Please
> try the attached patch to see if it helps.
> 
> Thanks
> 
> 
> > 
> > Regards,
> > Halil
> > 
> > > Thanks
> > > 
> > > 
> > > > Halil Pasic (2):
> > > >     mm: move force_dma_unencrypted() to mem_encrypt.h
> > > >     virtio: let virtio use DMA API when guest RAM is protected
> > > > 
> > > >    drivers/virtio/virtio_ring.c |  3 +++
> > > >    include/linux/dma-direct.h   |  9 ---------
> > > >    include/linux/mem_encrypt.h  | 10 ++++++++++
> > > >    3 files changed, 13 insertions(+), 9 deletions(-)
> > > > 
> > > > 
> > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2

> From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001
> From: Jason Wang <jasowang@redhat.com>
> Date: Mon, 24 Feb 2020 12:00:10 +0800
> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
> 
> When transport does not support IOMMU, we should clear IOMMU_PLATFORM
> even if the device and vhost claims to support that. This help to
> avoid the performance overhead caused by unnecessary IOTLB miss/update
> transactions on such platform.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  hw/virtio/virtio-bus.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index d6332d45c3..2741b9fdd2 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>      VirtioBusState *bus = VIRTIO_BUS(qbus);
>      VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
>      VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> -    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>      Error *local_err = NULL;
>  
>      DPRINTF("%s: plug device.\n", qbus->name);
> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>          return;
>      }
>  
> -    if (klass->get_dma_as != NULL && has_iommu) {
> -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> +    if (false && klass->get_dma_as != NULL &&
> +        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
>          vdev->dma_as = klass->get_dma_as(qbus->parent);
>      } else {
> +        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>          vdev->dma_as = &address_space_memory;
>      }
>  }


This seems to clear it unconditionally. I guess it's just a debugging
patch, the real one will come later?

> -- 
> 2.19.1
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21 15:39       ` Tom Lendacky
@ 2020-02-24  6:40         ` David Gibson
  0 siblings, 0 replies; 58+ messages in thread
From: David Gibson @ 2020-02-24  6:40 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: linux-s390, Huang, Wei, Singh, Brijesh, Janosch Frank,
	Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai,
	linux-kernel, virtualization, Halil Pasic, Christian Borntraeger,
	iommu, Michael Mueller, Viktor Mihajlovski, Robin Murphy,
	Christoph Hellwig



On Fri, Feb 21, 2020 at 09:39:38AM -0600, Tom Lendacky wrote:
> On 2/21/20 7:12 AM, Halil Pasic wrote:
> > On Thu, 20 Feb 2020 15:55:14 -0500
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > 
> >> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> >>> Currently the advanced guest memory protection technologies (AMD SEV,
> >>> powerpc secure guest technology and s390 Protected VMs) abuse the
> >>> VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which
> >>> is in turn necessary, to make IO work with guest memory protection.
> >>>
> >>> But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
> >>> different beast: with virtio devices whose implementation runs on an SMP
> >>> CPU we are still fine with doing all the usual optimizations, it is just
> >>> that we need to make sure that the memory protection mechanism does not
> >>> get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the
> >>> side of the guest (and possibly the host side as well) than we actually
> >>> need.
> >>>
> >>> An additional benefit of teaching the guest to make the right decision
> >>> (and use DMA API) on its own is: removing the need to mandate special
> >>> VM configuration for guests that may run with protection. This is
> >>> especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all
> >>> the virtio control structures into the first 2G of guest memory:
> >>> something we don't necessarily want to do per-default.
> >>>
> >>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >>> Tested-by: Ram Pai <linuxram@us.ibm.com>
> >>> Tested-by: Michael Mueller <mimu@linux.ibm.com>
> >>
> >> This might work for you but it's fragile, since without
> >> VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets
> >> GPA's, not DMA addresses.
> >>
> > 
> > Thanks for your constructive approach. I do want the hypervisor to
> > assume it gets GPA's. My train of thought was that the guys that need
> > to use IOVA's that are not GPA's when force_dma_unencrypted() will have
> > to specify VIRTIO_F_ACCESS_PLATFORM (at the device) anyway, because
> > otherwise it won't work. But I see your point: in case of a
> > mis-configuration and provided the DMA API returns IOVA's one could end
> > up trying to touch wrong memory locations. But this should be similar to
> > what would happen if DMA ops are not used, and memory is not made accessible.
> > 
> >>
> >>
> >> IOW this looks like another iteration of:
> >>
> >> 	virtio: Support encrypted memory on powerpc secure guests
> >>
> >> which I was under the impression was abandoned as unnecessary.
> > 
> > Unnecessary for powerpc because they do normal PCI. In the context of
> > CCW there are only guest physical addresses (CCW I/O has no concept of
> > IOMMU or IOVAs).
> > 
> >>
> >>
> >> To summarize, the necessary conditions for a hack along these lines
> >> (using DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that:
> >>
> >>   - secure guest mode is enabled - so we know that since we don't share
> >>     most memory regular virtio code won't
> >>     work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM
> > 
> > force_dma_unencrypted(&vdev->dev) is IMHO exactly about this.
> > 
> >>   - DMA API is giving us addresses that are actually also physical
> >>     addresses
> > 
> > In case of s390 this is given. I talked with the POWER people before
> > posting this, and they assured me they are willing to deal with
> > this. I was hoping to talk about this with the AMD SEV people here (hence
> > the cc).
> 
> Yes, physical addresses are fine for SEV - the key is that the DMA API is
> used so that an address for unencrypted, or shared, memory is returned.
> E.g. for a dma_alloc_coherent() call this is an allocation that has had
> set_memory_decrypted() called or for a dma_map_page() call this is an
> address from SWIOTLB, which was mapped shared during boot, where the data
> will be bounce-buffered.
> 
> We don't currently support an emulated IOMMU in our SEV guest because that
> would require a lot of support in the driver to make IOMMU data available
> to the hypervisor (I/O page tables, etc.). We would need hardware support
> to really make this work easily in the guest.
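
For reference, the SEV/SME decision described here is encapsulated on
x86 in force_dma_unencrypted(); a condensed sketch of the logic in
arch/x86/mm/mem_encrypt.c of this era:

	bool force_dma_unencrypted(struct device *dev)
	{
		/* SEV: all guest DMA must target unencrypted (shared) pages */
		if (sev_active())
			return true;

		/* SME: only if the device can't address the encryption bit */
		if (sme_active()) {
			u64 dma_enc_mask = DMA_BIT_MASK(__ffs64(sme_me_mask));

			if (min_not_zero(dev->coherent_dma_mask,
					 dev->bus_dma_limit) <= dma_enc_mask)
				return true;
		}

		return false;
	}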

A tangent here: note that on POWER our IOMMU is paravirtualized
(updated with hypercalls), it's also always enabled.  For that reason
we can and do combine vIOMMU translation with the need for bounce
buffering for secure guests.

(We generally statically configure the vIOMMU to have a huge window
which just maps GPAs 1-to-1, which means we can still use dma-direct,
but the vIOMMU is still there from the platform point of view)

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM
  2020-02-24  6:06       ` Michael S. Tsirkin
@ 2020-02-24  6:45         ` Jason Wang
  2020-02-24  7:48           ` Michael S. Tsirkin
  0 siblings, 1 reply; 58+ messages in thread
From: Jason Wang @ 2020-02-24  6:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig



On 2020/2/24 2:06 PM, Michael S. Tsirkin wrote:
> On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote:
>> On 2020/2/21 10:56 PM, Halil Pasic wrote:
>>> On Fri, 21 Feb 2020 14:22:26 +0800
>>> Jason Wang <jasowang@redhat.com> wrote:
>>>
>>>> On 2020/2/21 12:06 AM, Halil Pasic wrote:
>>>>> Currently if one intends to run a memory protection enabled VM with
>>>>> virtio devices and linux as the guest OS, one needs to specify the
>>>>> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
>>>>> linux use the DMA API, which in turn handles the memory
>>>>> encryption/protection stuff if the guest decides to turn itself into
>>>>> a protected one. This however makes no sense due to multiple reasons:
>>>>> * The device is not changed by the fact that the guest RAM is
>>>>> protected. The so called IOMMU bypass quirk is not affected.
>>>>> * This usage is not congruent with  standardised semantics of
>>>>> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
>>>>> for using DMA API in virtio (orthogonal with respect to what is
>>>>> expressed by VIRTIO_F_IOMMU_PLATFORM).
>>>>>
>>>>> This series aims to decouple 'have to use DMA API because my (guest) RAM
>>>>> is protected' and 'have to use DMA API because the device told me
>>>>> VIRTIO_F_IOMMU_PLATFORM'.
>>>>>
>>>>> Please find more detailed explanations about the conceptual aspects in
>>>>> the individual patches. There is however also a very practical problem
>>>>> that is addressed by this series.
>>>>>
>>>>> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
>>>>> effect The vhost code assumes it the addresses on the virtio descriptor
>>>>> ring are not guest physical addresses but iova's, and insists on doing a
>>>>> translation of these regardless of what transport is used (e.g. whether
>>>>> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
>>>>> "vhost: new device IOTLB API".) On s390 this results in severe
>>>>> performance degradation (c.a. factor 10).
>>>> Do you see a consistent degradation on the performance, or it only
>>>> happen when for during the beginning of the test?
>>>>
>>> AFAIK the degradation is consistent.
>>>
>>>>> BTW with ccw I/O there is
>>>>> (architecturally) no IOMMU, so the whole address translation makes no
>>>>> sense in the context of virtio-ccw.
>>>> I suspect we can do optimization in qemu side.
>>>>
>>>> E.g send memtable entry via IOTLB API when vIOMMU is not enabled.
>>>>
>>>> If this makes sense, I can draft patch to see if there's any difference.
>>> Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
>>> then necessary translation) for virtio-ccw altogether. But Michael
>>> voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
>>> that could be used with guests running in protected mode. I don't share
>>> his opinion, but that's an ongoing discussion.
>>>
>>> Should we end up having to do translation from IOVA in vhost, we are
>>> very interested in that translation being fast and efficient.
>>>
>>> In that sense we would be very happy to test any optimization that aim
>>> into that direction.
>>>
>>> Thank you very much for your input!
>>
>> Using IOTLB API on platform without IOMMU support is not intended. Please
>> try the attached patch to see if it helps.
>>
>> Thanks
>>
>>
>>> Regards,
>>> Halil
>>>
>>>> Thanks
>>>>
>>>>
>>>>> Halil Pasic (2):
>>>>>      mm: move force_dma_unencrypted() to mem_encrypt.h
>>>>>      virtio: let virtio use DMA API when guest RAM is protected
>>>>>
>>>>>     drivers/virtio/virtio_ring.c |  3 +++
>>>>>     include/linux/dma-direct.h   |  9 ---------
>>>>>     include/linux/mem_encrypt.h  | 10 ++++++++++
>>>>>     3 files changed, 13 insertions(+), 9 deletions(-)
>>>>>
>>>>>
>>>>> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
>> From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001
>> From: Jason Wang <jasowang@redhat.com>
>> Date: Mon, 24 Feb 2020 12:00:10 +0800
>> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
>>
>> When transport does not support IOMMU, we should clear IOMMU_PLATFORM
>> even if the device and vhost claims to support that. This help to
>> avoid the performance overhead caused by unnecessary IOTLB miss/update
>> transactions on such platform.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   hw/virtio/virtio-bus.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
>> index d6332d45c3..2741b9fdd2 100644
>> --- a/hw/virtio/virtio-bus.c
>> +++ b/hw/virtio/virtio-bus.c
>> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>>       VirtioBusState *bus = VIRTIO_BUS(qbus);
>>       VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
>>       VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
>> -    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>>       Error *local_err = NULL;
>>   
>>       DPRINTF("%s: plug device.\n", qbus->name);
>> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>>           return;
>>       }
>>   
>> -    if (klass->get_dma_as != NULL && has_iommu) {
>> -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>> +    if (false && klass->get_dma_as != NULL &&
>> +        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
>>           vdev->dma_as = klass->get_dma_as(qbus->parent);
>>       } else {
>> +        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>           vdev->dma_as = &address_space_memory;
>>       }
>>   }
>
> This seems to clear it unconditionally. I guess it's just a debugging
> patch, the real one will come later?


My bad, here's the correct one.

Thanks


>
>> -- 
>> 2.19.1
>>

[-- Attachment #2: 0001-virtio-turn-on-IOMMU_PLATFORM-properly.patch --]
[-- Type: text/x-patch, Size: 1701 bytes --]

From b8a8b582f46bb86c7a745b272db7b744779e5cc7 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang@redhat.com>
Date: Mon, 24 Feb 2020 12:00:10 +0800
Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly

When transport does not support IOMMU, we should clear IOMMU_PLATFORM
even if the device and vhost claims to support that. This help to
avoid the performance overhead caused by unnecessary IOTLB miss/update
transactions on such platform.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/virtio-bus.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index d6332d45c3..4be64e193e 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
     VirtioBusState *bus = VIRTIO_BUS(qbus);
     VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
     VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
-    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
     Error *local_err = NULL;
 
     DPRINTF("%s: plug device.\n", qbus->name);
@@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
         return;
     }
 
-    if (klass->get_dma_as != NULL && has_iommu) {
-        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
+    if (klass->get_dma_as != NULL &&
+        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
         vdev->dma_as = klass->get_dma_as(qbus->parent);
     } else {
+        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
         vdev->dma_as = &address_space_memory;
     }
 }
-- 
2.19.1



^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21 16:36       ` Christoph Hellwig
@ 2020-02-24  6:50         ` David Gibson
  2020-02-24 18:59         ` Halil Pasic
  1 sibling, 0 replies; 58+ messages in thread
From: David Gibson @ 2020-02-24  6:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, Christian Borntraeger, iommu, Michael Mueller,
	Lendacky, Thomas, Viktor Mihajlovski, Robin Murphy



On Fri, Feb 21, 2020 at 05:36:45PM +0100, Christoph Hellwig wrote:
> On Fri, Feb 21, 2020 at 01:59:15PM +1100, David Gibson wrote:
> > > Hell no.  This is a detail of the platform DMA direct implementation.
> > > Drivers have no business looking at this flag, and virtio finally needs
> > > to be fixed to use the DMA API properly for everything but legacy devices.
> > 
> > So, this patch definitely isn't right as it stands, but I'm struggling
> > to understand what it is you're saying is the right way.
> > 
> > By "legacy devices" I assume you mean pre-virtio-1.0 devices, that
> > lack the F_VERSION_1 feature flag.  Is that right?  Because I don't
> > see how being a legacy device or not relates to use of the DMA API.
> 
> No.   "legacy" is anything that does not set F_ACCESS_PLATFORM.

Hm, I see.

The trouble is I think we can only reasonably call things "legacy"
when essentially all currently in use OSes have support for the new,
better way of doing things.  That is, alas, not really the case for
F_ACCESS_PLATFORM.

> > I *think* what you are suggesting here is that virtio devices that
> > have !F_IOMMU_PLATFORM should have their dma_ops set up so that the
> > DMA API treats IOVA==PA, which will satisfy what the device expects.
> > Then the virtio driver can use the DMA API the same way for both
> > F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices.
> 
> No.  Those should just keep using the existing legacy non-dma ops
> case and be done with it.  No changes to that and most certainly
> no new features.
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM
  2020-02-24  6:45         ` Jason Wang
@ 2020-02-24  7:48           ` Michael S. Tsirkin
  2020-02-24  9:26             ` Jason Wang
  0 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2020-02-24  7:48 UTC (permalink / raw)
  To: Jason Wang
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Mon, Feb 24, 2020 at 02:45:03PM +0800, Jason Wang wrote:
> 
> On 2020/2/24 2:06 PM, Michael S. Tsirkin wrote:
> > On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote:
> > > On 2020/2/21 10:56 PM, Halil Pasic wrote:
> > > > On Fri, 21 Feb 2020 14:22:26 +0800
> > > > Jason Wang <jasowang@redhat.com> wrote:
> > > > 
> > > > > On 2020/2/21 12:06 AM, Halil Pasic wrote:
> > > > > > Currently if one intends to run a memory protection enabled VM with
> > > > > > virtio devices and linux as the guest OS, one needs to specify the
> > > > > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> > > > > > linux use the DMA API, which in turn handles the memory
> > > > > > encryption/protection stuff if the guest decides to turn itself into
> > > > > > a protected one. This however makes no sense due to multiple reasons:
> > > > > > * The device is not changed by the fact that the guest RAM is
> > > > > > protected. The so called IOMMU bypass quirk is not affected.
> > > > > > * This usage is not congruent with  standardised semantics of
> > > > > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > > > > > for using DMA API in virtio (orthogonal with respect to what is
> > > > > > expressed by VIRTIO_F_IOMMU_PLATFORM).
> > > > > > 
> > > > > > This series aims to decouple 'have to use DMA API because my (guest) RAM
> > > > > > is protected' and 'have to use DMA API because the device told me
> > > > > > VIRTIO_F_IOMMU_PLATFORM'.
> > > > > > 
> > > > > > Please find more detailed explanations about the conceptual aspects in
> > > > > > the individual patches. There is however also a very practical problem
> > > > > > that is addressed by this series.
> > > > > > 
> > > > > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > > > > > effect The vhost code assumes it the addresses on the virtio descriptor
> > > > > > ring are not guest physical addresses but iova's, and insists on doing a
> > > > > > translation of these regardless of what transport is used (e.g. whether
> > > > > > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> > > > > > "vhost: new device IOTLB API".) On s390 this results in severe
> > > > > > performance degradation (c.a. factor 10).
> > > > > Do you see a consistent degradation on the performance, or it only
> > > > > happen when for during the beginning of the test?
> > > > > 
> > > > AFAIK the degradation is consistent.
> > > > 
> > > > > > BTW with ccw I/O there is
> > > > > > (architecturally) no IOMMU, so the whole address translation makes no
> > > > > > sense in the context of virtio-ccw.
> > > > > I suspect we can do optimization in qemu side.
> > > > > 
> > > > > E.g send memtable entry via IOTLB API when vIOMMU is not enabled.
> > > > > 
> > > > > If this makes sense, I can draft patch to see if there's any difference.
> > > > Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
> > > > then necessary translation) for virtio-ccw altogether. But Michael
> > > > voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
> > > > that could be used with guests running in protected mode. I don't share
> > > > his opinion, but that's an ongoing discussion.
> > > > 
> > > > Should we end up having to do translation from IOVA in vhost, we are
> > > > very interested in that translation being fast and efficient.
> > > > 
> > > > In that sense we would be very happy to test any optimization that aim
> > > > into that direction.
> > > > 
> > > > Thank you very much for your input!
> > > 
> > > Using IOTLB API on platform without IOMMU support is not intended. Please
> > > try the attached patch to see if it helps.
> > > 
> > > Thanks
> > > 
> > > 
> > > > Regards,
> > > > Halil
> > > > 
> > > > > Thanks
> > > > > 
> > > > > 
> > > > > > Halil Pasic (2):
> > > > > >      mm: move force_dma_unencrypted() to mem_encrypt.h
> > > > > >      virtio: let virtio use DMA API when guest RAM is protected
> > > > > > 
> > > > > >     drivers/virtio/virtio_ring.c |  3 +++
> > > > > >     include/linux/dma-direct.h   |  9 ---------
> > > > > >     include/linux/mem_encrypt.h  | 10 ++++++++++
> > > > > >     3 files changed, 13 insertions(+), 9 deletions(-)
> > > > > > 
> > > > > > 
> > > > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> > From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001
> > > From: Jason Wang <jasowang@redhat.com>
> > > Date: Mon, 24 Feb 2020 12:00:10 +0800
> > > Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
> > > 
> > > When transport does not support IOMMU, we should clear IOMMU_PLATFORM
> > > even if the device and vhost claims to support that. This help to
> > > avoid the performance overhead caused by unnecessary IOTLB miss/update
> > > transactions on such platform.
> > > 
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >   hw/virtio/virtio-bus.c | 6 +++---
> > >   1 file changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > > index d6332d45c3..2741b9fdd2 100644
> > > --- a/hw/virtio/virtio-bus.c
> > > +++ b/hw/virtio/virtio-bus.c
> > > @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > >       VirtioBusState *bus = VIRTIO_BUS(qbus);
> > >       VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
> > >       VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > -    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > >       Error *local_err = NULL;
> > >       DPRINTF("%s: plug device.\n", qbus->name);
> > > @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > >           return;
> > >       }
> > > -    if (klass->get_dma_as != NULL && has_iommu) {
> > > -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > > +    if (false && klass->get_dma_as != NULL &&
> > > +        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> > >           vdev->dma_as = klass->get_dma_as(qbus->parent);
> > >       } else {
> > > +        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > >           vdev->dma_as = &address_space_memory;
> > >       }
> > >   }
> > 
> > This seems to clear it unconditionally. I guess it's just a debugging
> > patch, the real one will come later?
> 
> 
> My bad, here's the correct one.
> 
> Thanks
> 
> 
> > 
> > > -- 
> > > 2.19.1
> > > 

> From b8a8b582f46bb86c7a745b272db7b744779e5cc7 Mon Sep 17 00:00:00 2001
> From: Jason Wang <jasowang@redhat.com>
> Date: Mon, 24 Feb 2020 12:00:10 +0800
> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
> 
> When transport does not support IOMMU, we should clear IOMMU_PLATFORM
> even if the device and vhost claims to support that. This help to
> avoid the performance overhead caused by unnecessary IOTLB miss/update
> transactions on such platform.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  hw/virtio/virtio-bus.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index d6332d45c3..4be64e193e 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>      VirtioBusState *bus = VIRTIO_BUS(qbus);
>      VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
>      VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> -    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>      Error *local_err = NULL;
>  
>      DPRINTF("%s: plug device.\n", qbus->name);
> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>          return;
>      }
>  
> -    if (klass->get_dma_as != NULL && has_iommu) {
> -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);

So it looks like this line is unnecessary, but it's an unrelated
cleanup, right?

> +    if (klass->get_dma_as != NULL &&
> +        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
>          vdev->dma_as = klass->get_dma_as(qbus->parent);
>      } else {
> +        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);


Of course any change like that will have to deal with migration compat, etc.
Can't we clear the bit when we are sending the features to vhost
instead?
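
A sketch of that alternative (where exactly the masking would live in
the vhost setup path is left open; this is illustrative only):

	/* before handing acked features to the vhost backend, drop
	 * IOMMU_PLATFORM if the transport gave us no IOMMU address space */
	uint64_t features = vdev->guest_features;

	if (vdev->dma_as == &address_space_memory) {
		features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
	}
	/* ... pass features on to the vhost backend ... */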


>          vdev->dma_as = &address_space_memory;
>      }
>  }
> -- 
> 2.19.1
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM
  2020-02-24  7:48           ` Michael S. Tsirkin
@ 2020-02-24  9:26             ` Jason Wang
  2020-02-24 13:40               ` Michael S. Tsirkin
  2020-02-24 13:56               ` Halil Pasic
  0 siblings, 2 replies; 58+ messages in thread
From: Jason Wang @ 2020-02-24  9:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig



On 2020/2/24 3:48 PM, Michael S. Tsirkin wrote:
> On Mon, Feb 24, 2020 at 02:45:03PM +0800, Jason Wang wrote:
>> On 2020/2/24 2:06 PM, Michael S. Tsirkin wrote:
>>> On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote:
>>>> On 2020/2/21 10:56 PM, Halil Pasic wrote:
>>>>> On Fri, 21 Feb 2020 14:22:26 +0800
>>>>> Jason Wang <jasowang@redhat.com> wrote:
>>>>>
>>>>>> On 2020/2/21 上午12:06, Halil Pasic wrote:
>>>>>>> Currently if one intends to run a memory protection enabled VM with
>>>>>>> virtio devices and linux as the guest OS, one needs to specify the
>>>>>>> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
>>>>>>> linux use the DMA API, which in turn handles the memory
>>>>>>> encryption/protection stuff if the guest decides to turn itself into
>>>>>>> a protected one. This however makes no sense due to multiple reasons:
>>>>>>> * The device is not changed by the fact that the guest RAM is
>>>>>>> protected. The so called IOMMU bypass quirk is not affected.
>>>>>>> * This usage is not congruent with  standardised semantics of
>>>>>>> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
>>>>>>> for using DMA API in virtio (orthogonal with respect to what is
>>>>>>> expressed by VIRTIO_F_IOMMU_PLATFORM).
>>>>>>>
>>>>>>> This series aims to decouple 'have to use DMA API because my (guest) RAM
>>>>>>> is protected' and 'have to use DMA API because the device told me
>>>>>>> VIRTIO_F_IOMMU_PLATFORM'.
>>>>>>>
>>>>>>> Please find more detailed explanations about the conceptual aspects in
>>>>>>> the individual patches. There is however also a very practical problem
>>>>>>> that is addressed by this series.
>>>>>>>
>>>>>>> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
>>>>>>> effect The vhost code assumes it the addresses on the virtio descriptor
>>>>>>> ring are not guest physical addresses but iova's, and insists on doing a
>>>>>>> translation of these regardless of what transport is used (e.g. whether
>>>>>>> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
>>>>>>> "vhost: new device IOTLB API".) On s390 this results in severe
>>>>>>> performance degradation (c.a. factor 10).
>>>>>> Do you see a consistent degradation in performance, or does it only
>>>>>> happen during the beginning of the test?
>>>>>>
>>>>> AFAIK the degradation is consistent.
>>>>>
>>>>>>> BTW with ccw I/O there is
>>>>>>> (architecturally) no IOMMU, so the whole address translation makes no
>>>>>>> sense in the context of virtio-ccw.
>>>>>> I suspect we can do an optimization on the qemu side.
>>>>>>
>>>>>> E.g. send memtable entries via the IOTLB API when the vIOMMU is not enabled.
>>>>>>
>>>>>> If this makes sense, I can draft a patch to see if there's any difference.
>>>>> Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
>>>>> then necessary translation) for virtio-ccw altogether. But Michael
>>>>> voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
>>>>> that could be used with guests running in protected mode. I don't share
>>>>> his opinion, but that's an ongoing discussion.
>>>>>
>>>>> Should we end up having to do translation from IOVA in vhost, we are
>>>>> very interested in that translation being fast and efficient.
>>>>>
>>>>> In that sense we would be very happy to test any optimization that aims
>>>>> in that direction.
>>>>>
>>>>> Thank you very much for your input!
>>>> Using the IOTLB API on platforms without IOMMU support is not intended. Please
>>>> try the attached patch to see if it helps.
>>>>
>>>> Thanks
>>>>
>>>>
>>>>> Regards,
>>>>> Halil
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>> Halil Pasic (2):
>>>>>>>       mm: move force_dma_unencrypted() to mem_encrypt.h
>>>>>>>       virtio: let virtio use DMA API when guest RAM is protected
>>>>>>>
>>>>>>>      drivers/virtio/virtio_ring.c |  3 +++
>>>>>>>      include/linux/dma-direct.h   |  9 ---------
>>>>>>>      include/linux/mem_encrypt.h  | 10 ++++++++++
>>>>>>>      3 files changed, 13 insertions(+), 9 deletions(-)
>>>>>>>
>>>>>>>
>>>>>>> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
>>>> From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001
>>>> From: Jason Wang <jasowang@redhat.com>
>>>> Date: Mon, 24 Feb 2020 12:00:10 +0800
>>>> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
>>>>
>>>> When the transport does not support an IOMMU, we should clear
>>>> IOMMU_PLATFORM even if the device and vhost claim to support it. This
>>>> helps to avoid the performance overhead caused by unnecessary IOTLB
>>>> miss/update transactions on such platforms.
>>>>
>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> ---
>>>>    hw/virtio/virtio-bus.c | 6 +++---
>>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
>>>> index d6332d45c3..2741b9fdd2 100644
>>>> --- a/hw/virtio/virtio-bus.c
>>>> +++ b/hw/virtio/virtio-bus.c
>>>> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>>>>        VirtioBusState *bus = VIRTIO_BUS(qbus);
>>>>        VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
>>>>        VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
>>>> -    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>>>>        Error *local_err = NULL;
>>>>        DPRINTF("%s: plug device.\n", qbus->name);
>>>> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>>>>            return;
>>>>        }
>>>> -    if (klass->get_dma_as != NULL && has_iommu) {
>>>> -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>>> +    if (false && klass->get_dma_as != NULL &&
>>>> +        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
>>>>            vdev->dma_as = klass->get_dma_as(qbus->parent);
>>>>        } else {
>>>> +        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>>>            vdev->dma_as = &address_space_memory;
>>>>        }
>>>>    }
>>> This seems to clear it unconditionally. I guess it's just a debugging
>>> patch, the real one will come later?
>>
>> My bad, here's the correct one.
>>
>> Thanks
>>
>>
>>>> -- 
>>>> 2.19.1
>>>>
>> From b8a8b582f46bb86c7a745b272db7b744779e5cc7 Mon Sep 17 00:00:00 2001
>> From: Jason Wang <jasowang@redhat.com>
>> Date: Mon, 24 Feb 2020 12:00:10 +0800
>> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
>>
>> When the transport does not support an IOMMU, we should clear
>> IOMMU_PLATFORM even if the device and vhost claim to support it. This
>> helps to avoid the performance overhead caused by unnecessary IOTLB
>> miss/update transactions on such platforms.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   hw/virtio/virtio-bus.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
>> index d6332d45c3..4be64e193e 100644
>> --- a/hw/virtio/virtio-bus.c
>> +++ b/hw/virtio/virtio-bus.c
>> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>>       VirtioBusState *bus = VIRTIO_BUS(qbus);
>>       VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
>>       VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
>> -    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>>       Error *local_err = NULL;
>>   
>>       DPRINTF("%s: plug device.\n", qbus->name);
>> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>>           return;
>>       }
>>   
>> -    if (klass->get_dma_as != NULL && has_iommu) {
>> -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> So it looks like this line is unnecessary, but it's an unrelated
> cleanup, right?


Yes.


>
>> +    if (klass->get_dma_as != NULL &&
>> +        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
>>           vdev->dma_as = klass->get_dma_as(qbus->parent);
>>       } else {
>> +        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>
> Of course any change like that will affect migration compatibility, etc.
> Can't we clear the bit when we are sending the features to vhost
> instead?


That's better.

How about attached?

Thanks


>
>
>>           vdev->dma_as = &address_space_memory;
>>       }
>>   }
>> -- 
>> 2.19.1
>>

[-- Attachment #2: 0001-vhost-do-not-set-VIRTIO_F_IOMMU_PLATFORM-when-IOMMU-.patch --]
[-- Type: text/x-patch, Size: 1641 bytes --]

From 3177c5194c729f3056b84c67664c59b9b949bb76 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang@redhat.com>
Date: Mon, 24 Feb 2020 17:24:14 +0800
Subject: [PATCH] vhost: do not set VIRTIO_F_IOMMU_PLATFORM when IOMMU is not
 used

We enable the device IOTLB unconditionally when VIRTIO_F_IOMMU_PLATFORM is
negotiated. This leads to unnecessary IOTLB miss/update transactions when
no IOMMU is used. This patch fixes this.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/net/virtio-net.c | 3 +++
 hw/virtio/vhost.c   | 4 +---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 3627bb1717..0d50e8bd34 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -879,6 +879,9 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features)
         virtio_net_apply_guest_offloads(n);
     }
 
+    if (vdev->dma_as == &address_space_memory)
+        features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
+
     for (i = 0;  i < n->max_queues; i++) {
         NetClientState *nc = qemu_get_subqueue(n->nic, i);
 
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 9edfadc81d..711b1136f6 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -288,9 +288,7 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
 
 static int vhost_dev_has_iommu(struct vhost_dev *dev)
 {
-    VirtIODevice *vdev = dev->vdev;
-
-    return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
+    return virtio_has_feature(dev->acked_features, VIRTIO_F_IOMMU_PLATFORM);
 }
 
 static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
-- 
2.19.1
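
The essential point of the patch above deserves a standalone
illustration: vhost must key off the features the guest actually acked,
not the features the device merely offered. A minimal, self-contained
sketch of the distinction (a hypothetical miniature, not actual QEMU
code):

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define VIRTIO_F_IOMMU_PLATFORM 33

/* Hypothetical miniature of the device/driver feature handshake. */
struct fake_vhost_dev {
    uint64_t host_features;   /* what the device offers */
    uint64_t acked_features;  /* what the guest driver accepted */
};

static bool has_feature(uint64_t features, unsigned int fbit)
{
    return (features & (1ULL << fbit)) != 0;
}

int main(void)
{
    struct fake_vhost_dev dev = {
        /* The device offers IOMMU_PLATFORM, but this guest did not ack it. */
        .host_features  = 1ULL << VIRTIO_F_IOMMU_PLATFORM,
        .acked_features = 0,
    };

    /* Old check (offered features): would wrongly enable the IOTLB path. */
    printf("old: %d\n", has_feature(dev.host_features, VIRTIO_F_IOMMU_PLATFORM));
    /* New check (acked features): the IOTLB stays off, GPAs are used directly. */
    printf("new: %d\n", has_feature(dev.acked_features, VIRTIO_F_IOMMU_PLATFORM));
    return 0;
}

Keying off acked_features leaves the offered (guest-visible) feature set
untouched, which sidesteps the migration-compatibility problem with the
earlier virtio-bus approach.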



* Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM
  2020-02-24  9:26             ` Jason Wang
@ 2020-02-24 13:40               ` Michael S. Tsirkin
  2020-02-25  3:38                 ` Jason Wang
  2020-02-24 13:56               ` Halil Pasic
  1 sibling, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2020-02-24 13:40 UTC (permalink / raw)
  To: Jason Wang
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Mon, Feb 24, 2020 at 05:26:20PM +0800, Jason Wang wrote:
> 
> On 2020/2/24 3:48 PM, Michael S. Tsirkin wrote:
> > On Mon, Feb 24, 2020 at 02:45:03PM +0800, Jason Wang wrote:
> > > On 2020/2/24 2:06 PM, Michael S. Tsirkin wrote:
> > > > On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote:
> > > > > On 2020/2/21 10:56 PM, Halil Pasic wrote:
> > > > > > On Fri, 21 Feb 2020 14:22:26 +0800
> > > > > > Jason Wang <jasowang@redhat.com> wrote:
> > > > > > 
> > > > > > > On 2020/2/21 12:06 AM, Halil Pasic wrote:
> > > > > > > > Currently if one intends to run a memory protection enabled VM with
> > > > > > > > virtio devices and linux as the guest OS, one needs to specify the
> > > > > > > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> > > > > > > > linux use the DMA API, which in turn handles the memory
> > > > > > > > encryption/protection stuff if the guest decides to turn itself into
> > > > > > > > a protected one. This however makes no sense due to multiple reasons:
> > > > > > > > * The device is not changed by the fact that the guest RAM is
> > > > > > > > protected. The so called IOMMU bypass quirk is not affected.
> > > > > > > > * This usage is not congruent with  standardised semantics of
> > > > > > > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > > > > > > > for using DMA API in virtio (orthogonal with respect to what is
> > > > > > > > expressed by VIRTIO_F_IOMMU_PLATFORM).
> > > > > > > > 
> > > > > > > > This series aims to decouple 'have to use DMA API because my (guest) RAM
> > > > > > > > is protected' and 'have to use DMA API because the device told me
> > > > > > > > VIRTIO_F_IOMMU_PLATFORM'.
> > > > > > > > 
> > > > > > > > Please find more detailed explanations about the conceptual aspects in
> > > > > > > > the individual patches. There is however also a very practical problem
> > > > > > > > that is addressed by this series.
> > > > > > > > 
> > > > > > > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > > > > > > > effect The vhost code assumes it the addresses on the virtio descriptor
> > > > > > > > ring are not guest physical addresses but iova's, and insists on doing a
> > > > > > > > translation of these regardless of what transport is used (e.g. whether
> > > > > > > > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> > > > > > > > "vhost: new device IOTLB API".) On s390 this results in severe
> > > > > > > > performance degradation (c.a. factor 10).
> > > > > > > Do you see a consistent degradation in performance, or does it only
> > > > > > > happen during the beginning of the test?
> > > > > > > 
> > > > > > AFAIK the degradation is consistent.
> > > > > > 
> > > > > > > > BTW with ccw I/O there is
> > > > > > > > (architecturally) no IOMMU, so the whole address translation makes no
> > > > > > > > sense in the context of virtio-ccw.
> > > > > > > I suspect we can do an optimization on the qemu side.
> > > > > > > 
> > > > > > > E.g. send memtable entries via the IOTLB API when the vIOMMU is not enabled.
> > > > > > > 
> > > > > > > If this makes sense, I can draft a patch to see if there's any difference.
> > > > > > Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
> > > > > > then necessary translation) for virtio-ccw altogether. But Michael
> > > > > > voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
> > > > > > that could be used with guests running in protected mode. I don't share
> > > > > > his opinion, but that's an ongoing discussion.
> > > > > > 
> > > > > > Should we end up having to do translation from IOVA in vhost, we are
> > > > > > very interested in that translation being fast and efficient.
> > > > > > 
> > > > > > In that sense we would be very happy to test any optimization that aims
> > > > > > in that direction.
> > > > > > 
> > > > > > Thank you very much for your input!
> > > > > Using the IOTLB API on platforms without IOMMU support is not intended. Please
> > > > > try the attached patch to see if it helps.
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > 
> > > > > > Regards,
> > > > > > Halil
> > > > > > 
> > > > > > > Thanks
> > > > > > > 
> > > > > > > 
> > > > > > > > Halil Pasic (2):
> > > > > > > >       mm: move force_dma_unencrypted() to mem_encrypt.h
> > > > > > > >       virtio: let virtio use DMA API when guest RAM is protected
> > > > > > > > 
> > > > > > > >      drivers/virtio/virtio_ring.c |  3 +++
> > > > > > > >      include/linux/dma-direct.h   |  9 ---------
> > > > > > > >      include/linux/mem_encrypt.h  | 10 ++++++++++
> > > > > > > >      3 files changed, 13 insertions(+), 9 deletions(-)
> > > > > > > > 
> > > > > > > > 
> > > > > > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> > > > > From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001
> > > > > From: Jason Wang <jasowang@redhat.com>
> > > > > Date: Mon, 24 Feb 2020 12:00:10 +0800
> > > > > Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
> > > > > 
> > > > > When the transport does not support an IOMMU, we should clear
> > > > > IOMMU_PLATFORM even if the device and vhost claim to support it. This
> > > > > helps to avoid the performance overhead caused by unnecessary IOTLB
> > > > > miss/update transactions on such platforms.
> > > > > 
> > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > ---
> > > > >    hw/virtio/virtio-bus.c | 6 +++---
> > > > >    1 file changed, 3 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > > > > index d6332d45c3..2741b9fdd2 100644
> > > > > --- a/hw/virtio/virtio-bus.c
> > > > > +++ b/hw/virtio/virtio-bus.c
> > > > > @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > > > >        VirtioBusState *bus = VIRTIO_BUS(qbus);
> > > > >        VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
> > > > >        VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > > -    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > > > >        Error *local_err = NULL;
> > > > >        DPRINTF("%s: plug device.\n", qbus->name);
> > > > > @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > > > >            return;
> > > > >        }
> > > > > -    if (klass->get_dma_as != NULL && has_iommu) {
> > > > > -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > > > > +    if (false && klass->get_dma_as != NULL &&
> > > > > +        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> > > > >            vdev->dma_as = klass->get_dma_as(qbus->parent);
> > > > >        } else {
> > > > > +        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > > > >            vdev->dma_as = &address_space_memory;
> > > > >        }
> > > > >    }
> > > > This seems to clear it unconditionally. I guess it's just a debugging
> > > > patch, the real one will come later?
> > > 
> > > My bad, here's the correct one.
> > > 
> > > Thanks
> > > 
> > > 
> > > > > -- 
> > > > > 2.19.1
> > > > > 
> > > From b8a8b582f46bb86c7a745b272db7b744779e5cc7 Mon Sep 17 00:00:00 2001
> > > From: Jason Wang <jasowang@redhat.com>
> > > Date: Mon, 24 Feb 2020 12:00:10 +0800
> > > Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
> > > 
> > > When the transport does not support an IOMMU, we should clear
> > > IOMMU_PLATFORM even if the device and vhost claim to support it. This
> > > helps to avoid the performance overhead caused by unnecessary IOTLB
> > > miss/update transactions on such platforms.
> > > 
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >   hw/virtio/virtio-bus.c | 6 +++---
> > >   1 file changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > > index d6332d45c3..4be64e193e 100644
> > > --- a/hw/virtio/virtio-bus.c
> > > +++ b/hw/virtio/virtio-bus.c
> > > @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > >       VirtioBusState *bus = VIRTIO_BUS(qbus);
> > >       VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
> > >       VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > -    bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > >       Error *local_err = NULL;
> > >       DPRINTF("%s: plug device.\n", qbus->name);
> > > @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > >           return;
> > >       }
> > > -    if (klass->get_dma_as != NULL && has_iommu) {
> > > -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > So it looks like this line is unnecessary, but it's an unrelated
> > cleanup, right?
> 
> 
> Yes.
> 
> 
> > 
> > > +    if (klass->get_dma_as != NULL &&
> > > +        virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> > >           vdev->dma_as = klass->get_dma_as(qbus->parent);
> > >       } else {
> > > +        virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > 
> > Of course any change like that will affect migration compatibility, etc.
> > Can't we clear the bit when we are sending the features to vhost
> > instead?
> 
> 
> That's better.
> 
> How about attached?
> 
> Thanks
> 
> 
> > 
> > 
> > >           vdev->dma_as = &address_space_memory;
> > >       }
> > >   }
> > > -- 
> > > 2.19.1
> > > 

> From 3177c5194c729f3056b84c67664c59b9b949bb76 Mon Sep 17 00:00:00 2001
> From: Jason Wang <jasowang@redhat.com>
> Date: Mon, 24 Feb 2020 17:24:14 +0800
> Subject: [PATCH] vhost: do not set VIRTIO_F_IOMMU_PLATFORM when IOMMU is not
>  used
> 
> We enable the device IOTLB unconditionally when VIRTIO_F_IOMMU_PLATFORM is
> negotiated. This leads to unnecessary IOTLB miss/update transactions when
> no IOMMU is used. This patch fixes this.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  hw/net/virtio-net.c | 3 +++
>  hw/virtio/vhost.c   | 4 +---
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 3627bb1717..0d50e8bd34 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -879,6 +879,9 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features)
>          virtio_net_apply_guest_offloads(n);
>      }
>  
> +    if (vdev->dma_as == &address_space_memory)
> +        features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
> +
>      for (i = 0;  i < n->max_queues; i++) {
>          NetClientState *nc = qemu_get_subqueue(n->nic, i);

This pokes at acked features. I think they are also
guest visible ...

> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 9edfadc81d..711b1136f6 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -288,9 +288,7 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
>  
>  static int vhost_dev_has_iommu(struct vhost_dev *dev)
>  {
> -    VirtIODevice *vdev = dev->vdev;
> -
> -    return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> +    return virtio_has_feature(dev->acked_features, VIRTIO_F_IOMMU_PLATFORM);
>  }
>  
>  static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
> -- 
> 2.19.1
> 


* Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM
  2020-02-24  9:26             ` Jason Wang
  2020-02-24 13:40               ` Michael S. Tsirkin
@ 2020-02-24 13:56               ` Halil Pasic
  2020-02-25  3:30                 ` Jason Wang
  1 sibling, 1 reply; 58+ messages in thread
From: Halil Pasic @ 2020-02-24 13:56 UTC (permalink / raw)
  To: Jason Wang
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Christian Borntraeger,
	iommu, David Gibson, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig

On Mon, 24 Feb 2020 17:26:20 +0800
Jason Wang <jasowang@redhat.com> wrote:

> That's better.
> 
> How about attached?
> 
> Thanks

Thanks Jason! It does avoid the translation overhead in vhost.

Tested-by: Halil Pasic <pasic@linux.ibm.com>

Regarding the code, you fence it in virtio-net.c, but AFAIU this feature
has relevance for other vhost devices as well. E.g. what about
vhost-user? Would it be the responsibility of each virtio device to fence this
on its own?

I'm also a bit confused about the semantics of the vhost feature bit
F_ACCESS_PLATFORM. What we have specified on virtio level is:
"""
This feature indicates that the device can be used on a platform where
device access to data in memory is limited and/or translated. E.g. this
is the case if the device can be located behind an IOMMU that translates
bus addresses from the device into physical addresses in memory, if the
device can be limited to only access certain memory addresses or if
special commands such as a cache flush can be needed to synchronise data
in memory with the device. Whether accesses are actually limited or
translated is described by platform-specific means. If this feature bit
is set to 0, then the device has same access to memory addresses
supplied to it as the driver has. In particular, the device will always
use physical addresses matching addresses used by the driver (typically
meaning physical addresses used by the CPU) and not translated further,
and can access any address supplied to it by the driver. When clear,
this overrides any platform-specific description of whether device
access is limited or translated in any way, e.g. whether an IOMMU may be
present.
"""

I read this as: the addresses may be IOVAs, which require
IOMMU translation, or GPAs, which don't.

On the vhost level however, it seems that F_IOMMU_PLATFORM means that
vhost has to do the translation (via IOTLB API).

Do I understand this correctly? If yes, I believe we should document
this properly.

BTW I'm still not 100% clear on the purpose and semantics of the
F_ACCESS_PLATFORM feature bit. But that is a different problem.

Regards,
Halil
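
To make the two readings above concrete, a hypothetical sketch (not
actual vhost code) of how a backend interprets ring addresses with and
without the negotiated bit:

#include <stdint.h>

typedef uint64_t hwaddr;

/* Hypothetical backend state; iotlb_translate stands in for the real
 * vhost IOTLB machinery. */
struct backend {
    int access_platform;                    /* negotiated feature bit */
    hwaddr (*iotlb_translate)(hwaddr iova); /* IOVA -> host-usable addr */
};

static hwaddr resolve(struct backend *b, hwaddr ring_addr)
{
    if (b->access_platform) {
        /* Ring addresses are IOVAs and must be translated first. */
        return b->iotlb_translate(ring_addr);
    }
    /* Ring addresses are GPAs and can be used directly. */
    return ring_addr;
}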


* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-22 19:07       ` Michael S. Tsirkin
@ 2020-02-24 17:16         ` Christoph Hellwig
  2020-10-28 14:24           ` Alexander Graf via iommu
  0 siblings, 1 reply; 58+ messages in thread
From: Christoph Hellwig @ 2020-02-24 17:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, Christian Borntraeger, iommu, David Gibson,
	Michael Mueller, Viktor Mihajlovski, Robin Murphy,
	Christoph Hellwig

On Sat, Feb 22, 2020 at 02:07:58PM -0500, Michael S. Tsirkin wrote:
> On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote:
> > AFAIU you have a positive attitude towards the idea, that 
> > !F_VIRTIO_PLATFORM implies 'no DMA API is used by virtio' 
> > should be scrapped. 
> > 
> > I would like to accomplish that without adverse effects to virtio-ccw
> > (because caring for virtio-ccw is a part of job description).
> > 
> > Regards,
> > Halil
> 
> It is possible, in theory. IIRC the main challenge is that DMA API
> has overhead of indirect function calls even when all it
> does is return the PA back without changes.

That overhead is gone now, the DMA direct calls are direct calls these
days.
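
For readers who missed that development, a toy illustration (plain
userspace C, not kernel code) of the difference in question: an indirect
"ops"-style dispatch versus a direct call the compiler can inline, which
is what the dma-direct work removed from the common case:

#include <stdint.h>
#include <stdio.h>

typedef uint64_t dma_addr_t;

/* "dma-direct" case: identity-style mapping via a direct, inlinable call. */
static inline dma_addr_t direct_map(void *vaddr)
{
    return (dma_addr_t)(uintptr_t)vaddr;
}

/* "ops" case: the same mapping hidden behind a per-device function pointer. */
struct dma_ops {
    dma_addr_t (*map)(void *vaddr);
};

static dma_addr_t indirect_map(void *vaddr)
{
    return (dma_addr_t)(uintptr_t)vaddr;
}

int main(void)
{
    struct dma_ops ops = { .map = indirect_map };
    int buf;

    /* Same result either way; only the call overhead differs. */
    printf("direct:   %llx\n", (unsigned long long)direct_map(&buf));
    printf("indirect: %llx\n", (unsigned long long)ops.map(&buf));
    return 0;
}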

* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-24  3:33                 ` David Gibson
@ 2020-02-24 18:49                   ` Halil Pasic
  2020-02-25 18:08                     ` Cornelia Huck
  0 siblings, 1 reply; 58+ messages in thread
From: Halil Pasic @ 2020-02-24 18:49 UTC (permalink / raw)
  To: David Gibson
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig



On Mon, 24 Feb 2020 14:33:14 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, Feb 21, 2020 at 07:07:02PM +0100, Halil Pasic wrote:
> > On Fri, 21 Feb 2020 10:48:15 -0500
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > 
> > > On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote:
> > > > On Fri, 21 Feb 2020 14:27:27 +1100
> > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > > 
> > > > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> > > > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > > > > > > > From a user's perspective it makes absolutely perfect sense to use the
> > > > > > > bounce buffers when they are NEEDED. 
> > > > > > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > > > > > really feels wrong. And obviously we have a severe performance issue
> > > > > > > because of the indirections.
> > > > > > 
> > > > > > The point is that the user should not have to specify iommu_platform.
> > > > > > We need to make sure any new hypervisor (especially one that might require
> > > > > > bounce buffering) always sets it,
> > > > > 
> > > > > So, I have draft qemu patches which enable iommu_platform by default.
> > > > > But that's really because of other problems with !iommu_platform, not
> > > > > anything to do with bounce buffering or secure VMs.
> > > > > 
> > > > > The thing is that the hypervisor *doesn't* require bounce buffering.
> > > > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> > > > > *guest*'s choice to enter secure mode, so the hypervisor has no reason
> > > > > to know whether the guest needs bounce buffering.  As far as the
> > > > > hypervisor and qemu are concerned that's a guest internal detail, it
> > > > > just expects to get addresses it can access whether those are GPAs
> > > > > (iommu_platform=off) or IOVAs (iommu_platform=on).
> > > > 
> > > > I very much agree!
> > > > 
> > > > > 
> > > > > > as was a rather bogus legacy hack
> > > > > 
> > > > > It was certainly a bad idea, but it was a bad idea that went into a
> > > > > public spec and has been widely deployed for many years.  We can't
> > > > > just pretend it didn't happen and move on.
> > > > > 
> > > > > Turning iommu_platform=on by default breaks old guests, some of which
> > > > > we still care about.  We can't (automatically) do it only for guests
> > > > > that need bounce buffering, because the hypervisor doesn't know that
> > > > > ahead of time.
> > > > 
> > > > Turning iommu_platform=on for virtio-ccw makes no sense whatsoever,
> > > > because for CCW I/O there is no such thing as IOMMU and the addresses
> > > > are always physical addresses.
> > > 
> > > Fix the name then. The spec calls is ACCESS_PLATFORM now, which
> > > makes much more sense.
> > 
> > I don't quite get it. Sorry. Maybe I will revisit this later.
> 
> Halil, I think I can clarify this.
> 
> The "iommu_platform" flag doesn't necessarily have anything to do with
> an iommu, although it often will.  Basically it means "access guest
> memory via the bus's normal DMA mechanism" rather than "access guest
> memory using GPA, because you're the hypervisor and you can do that".
> 

Unfortunately, I don't think this is what is conveyed to the end users.
Let's see what we have documented:

Neither Qemu user documentation
(https://www.qemu.org/docs/master/qemu-doc.html) nor online help:
qemu-system-s390x -device virtio-net-ccw,?|grep iommu
  iommu_platform=<bool>  - on/off (default: false)
has any documentation on it.

But libvirt does have documentation on the knob that controls
iommu_platform for QEMU (when QEMU is managed by libvirt):
"""
Virtio-related options 

QEMU's virtio devices have some attributes related to the virtio
transport under the driver element: The iommu attribute enables the use
of emulated IOMMU by the device. The attribute ats controls the Address
Translation Service support for PCIe devices. This is needed to make use
of IOTLB support (see IOMMU device). Possible values are on or off.
Since 3.5.0 
"""
(https://libvirt.org/formatdomain.html#elementsVirtio)

Thus it seems the only available documentation says that it "enables the use
of emulated IOMMU by the device".

And for vhost-user we have
"""
When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has not been negotiated:

* Guest addresses map to the vhost memory region containing that guest
  address.

When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has been negotiated:

* Guest addresses are also called I/O virtual addresses (IOVAs).  They are
  translated to user addresses via the IOTLB.
"""
(docs/interop/vhost-user.rst)

> For the case of ccw, both mechanisms end up being the same thing,
> since CCW's normal DMA *is* untranslated GPA access.
> 

Nod.

> For this reason, the flag in the spec was renamed to ACCESS_PLATFORM,
> but the flag in qemu still has the old name.
> 

Yes, the name in the spec is more neutral.

> AIUI, Michael is saying you could trivially change the name in qemu
> (obviously you'd need to alias the old name to the new one for
> compatibility).
> 

I could, and then I could also ask the libvirt guys to change <driver
iommu='X'> to <driver access_platform='X'> or similar and to change
their documentation to something that is harder to comprehend. Although
I'm not sure they would like the idea.

> 
> Actually, the fact that ccw has no translation makes things easier for
> you: you don't really have any impediment to turning ACCESS_PLATFORM
> on by default, since it doesn't make any real change to how things
> work.

Yeah, in theory it should not, but in practice it currently does.
vhost will try to do the IOTLB dance (Jason has a patch that
should help with that), and we get the 'use dma api' side effects in the
guest (e.g. the virtqueue's data goes below 2G, plus some overhead).

> 
> The remaining difficulty is that the virtio driver - since it can sit
> on multiple buses - won't know this, and will reject the
> ACCESS_PLATFORM flag, even though it could just do what it normally
> does on ccw and it would work.

Right, ACCESS_PLATFORM is a funny feature: the device refuses to
work if the driver does not ack it.

> 
> For that case, we could consider a hack in qemu where for virtio-ccw
> devices *only* we allow the guest to nack the ACCESS_PLATFORM flag and
> carry on anyway.  Normally we insist that the guest accept the
> ACCESS_PLATFORM flag if offered, because on most platforms they
> *don't* amount to the same thing.

Jason found a nice way to differentiate between 'needs translation' and
'does not need translation'. But that patch still requires the ack by the
driver (and, as Michael has pointed out, we have to consider migration).

I'm afraid that F_IOMMU_PLATFORM means different things in different
contexts, and that this ain't sufficiently documented. I'm tempted to do
a proper write-up on this (let's hope my motivation and my time
will allow it). I would also very much like to have Conny's opinion on this.

Regards,
Halil
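
(For context: the guest-side decision alluded to above lives in
drivers/virtio/virtio_ring.c. A paraphrased sketch of its shape, with the
check this series proposes added; illustrative only, details may differ
from the tree at hand:)

/* Paraphrased from vring_use_dma_api(); not a literal copy. */
static bool vring_use_dma_api(struct virtio_device *vdev)
{
    if (!virtio_has_iommu_quirk(vdev))
        return true;  /* VIRTIO_F_IOMMU_PLATFORM negotiated: use the DMA API */

    /* Proposed by this series: protected guest RAM forces the DMA API
     * (and with it bounce buffering) regardless of the feature bit. */
    if (force_dma_unencrypted(&vdev->dev))
        return true;

    if (xen_domain())
        return true;  /* Xen grant mappings also need the DMA API */

    return false;     /* legacy default: the device uses GPAs directly */
}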








* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-21 16:36       ` Christoph Hellwig
  2020-02-24  6:50         ` David Gibson
@ 2020-02-24 18:59         ` Halil Pasic
  1 sibling, 0 replies; 58+ messages in thread
From: Halil Pasic @ 2020-02-24 18:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-s390, Viktor Mihajlovski, Janosch Frank,
	Michael S. Tsirkin, Jason Wang, Cornelia Huck, Ram Pai,
	linux-kernel, virtualization, Christian Borntraeger, iommu,
	Michael Mueller, Lendacky, Thomas, David Gibson, Robin Murphy

On Fri, 21 Feb 2020 17:36:45 +0100
Christoph Hellwig <hch@lst.de> wrote:

> > By "legacy devices" I assume you mean pre-virtio-1.0 devices, that
> > lack the F_VERSION_1 feature flag.  Is that right?  Because I don't
> > see how being a legacy device or not relates to use of the DMA API.  
> 
> No.   "legacy" is anything that does not set F_ACCESS_PLATFORM.

FYI in virtio-speak the term 'legacy devices' is already taken and it
ain't congruent with your intended semantics. Please look it up
https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-60001

But with that understood, your statement does provide insight into how
you see F_ACCESS_PLATFORM: i.e. as something that should be enabled in
general, except for legacy reasons. But I'm afraid Michael sees it
differently: i.e. something that should be enabled when necessary, and
otherwise not (because it is likely to cost performance).

Regards,
Halil


_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM
  2020-02-24 13:56               ` Halil Pasic
@ 2020-02-25  3:30                 ` Jason Wang
  0 siblings, 0 replies; 58+ messages in thread
From: Jason Wang @ 2020-02-25  3:30 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Christian Borntraeger,
	iommu, David Gibson, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig


On 2020/2/24 9:56 PM, Halil Pasic wrote:
> On Mon, 24 Feb 2020 17:26:20 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> That's better.
>>
>> How about attached?
>>
>> Thanks
> Thanks Jason! It does avoid the translation overhead in vhost.
>
> Tested-by: Halil Pasic <pasic@linux.ibm.com>
>
> Regarding the code, you fence it in virtio-net.c, but AFAIU this feature
> has relevance for other vhost devices as well. E.g. what about
> vhost-user? Would it be the responsibility of each virtio device to fence this
> on its own?


Yes, it looks to me that it's better to do that in virtio_set_features_nocheck().


>
> I'm also a bit confused about the semantics of the vhost feature bit
> F_ACCESS_PLATFORM. What we have specified on virtio level is:
> """
> This feature indicates that the device can be used on a platform where
> device access to data in memory is limited and/or translated. E.g. this
> is the case if the device can be located behind an IOMMU that translates
> bus addresses from the device into physical addresses in memory, if the
> device can be limited to only access certain memory addresses or if
> special commands such as a cache flush can be needed to synchronise data
> in memory with the device. Whether accesses are actually limited or
> translated is described by platform-specific means. If this feature bit
> is set to 0, then the device has same access to memory addresses
> supplied to it as the driver has. In particular, the device will always
> use physical addresses matching addresses used by the driver (typically
> meaning physical addresses used by the CPU) and not translated further,
> and can access any address supplied to it by the driver. When clear,
> this overrides any platform-specific description of whether device
> access is limited or translated in any way, e.g. whether an IOMMU may be
> present.
> """
>
> I read this as: the addresses may be IOVAs, which require
> IOMMU translation, or GPAs, which don't.
>
> On the vhost level however, it seems that F_IOMMU_PLATFORM means that
> vhost has to do the translation (via IOTLB API).


Yes.


>
> Do I understand this correctly? If yes, I believe we should document
> this properly.


Good point. I think it was probably wrong to tie F_IOMMU_PLATFORM to the
IOTLB API. Technically, the IOTLB can work with a GPA->HVA mapping. I
originally used a dedicated feature bit (you can see that from the commit
log), but for some reason Michael tweaked it into a virtio feature bit. I
guess that was probably because at that time there was no way to specify
e.g. a backend capability, but now we have VHOST_GET_BACKEND_FEATURES.

For now, it is probably too late to fix that other than by documenting it,
or we can add support for enabling the IOTLB via new backend features.


>
> BTW I'm still not 100% clear on the purpose and semantics of the
> F_ACCESS_PLATFORM feature bit. But that is a different problem.


Yes, I agree that we should decouple the features that do not belong
to the device (protected, encrypted, swiotlb, etc.) from F_IOMMU_PLATFORM.
But Michael and others have their points as well.

Thanks


>
> Regards,
> Halil
>

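For the curious: the backend-feature mechanism mentioned above can be
probed from userspace. A minimal sketch (assumes a vhost-net-capable
kernel and permission to open /dev/vhost-net; the exact bits reported
depend on the kernel):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

int main(void)
{
    int fd = open("/dev/vhost-net", O_RDWR);
    __u64 features = 0;

    if (fd < 0) {
        perror("open /dev/vhost-net");
        return 1;
    }
    /* Backend (vhost-internal) features, distinct from virtio features. */
    if (ioctl(fd, VHOST_GET_BACKEND_FEATURES, &features) < 0) {
        perror("VHOST_GET_BACKEND_FEATURES");
        close(fd);
        return 1;
    }
    printf("vhost backend features: 0x%llx\n", (unsigned long long)features);
    close(fd);
    return 0;
}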

* Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM
  2020-02-24 13:40               ` Michael S. Tsirkin
@ 2020-02-25  3:38                 ` Jason Wang
  0 siblings, 0 replies; 58+ messages in thread
From: Jason Wang @ 2020-02-25  3:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Cornelia Huck,
	Ram Pai, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, David Gibson, Michael Mueller,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig


On 2020/2/24 9:40 PM, Michael S. Tsirkin wrote:
>> Subject: [PATCH] vhost: do not set VIRTIO_F_IOMMU_PLATFORM when IOMMU is not
>>   used
>>
>> We enable the device IOTLB unconditionally when VIRTIO_F_IOMMU_PLATFORM is
>> negotiated. This leads to unnecessary IOTLB miss/update transactions when
>> no IOMMU is used. This patch fixes this.
>>
>> Signed-off-by: Jason Wang<jasowang@redhat.com>
>> ---
>>   hw/net/virtio-net.c | 3 +++
>>   hw/virtio/vhost.c   | 4 +---
>>   2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>> index 3627bb1717..0d50e8bd34 100644
>> --- a/hw/net/virtio-net.c
>> +++ b/hw/net/virtio-net.c
>> @@ -879,6 +879,9 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features)
>>           virtio_net_apply_guest_offloads(n);
>>       }
>>   
>> +    if (vdev->dma_as == &address_space_memory)
>> +        features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
>> +
>>       for (i = 0;  i < n->max_queues; i++) {
>>           NetClientState *nc = qemu_get_subqueue(n->nic, i);
> This pokes at acked features. I think they are also
> guest visible ...


It's the acked features of the vhost device, so I guess not?

E.g. virtio_set_features_nocheck() does:

     val &= vdev->host_features;
     if (k->set_features) {
         k->set_features(vdev, val);
     }
     vdev->guest_features = val;

Thanks

>


* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-24 18:49                   ` Halil Pasic
@ 2020-02-25 18:08                     ` Cornelia Huck
  2020-02-28  0:23                       ` David Gibson
  0 siblings, 1 reply; 58+ messages in thread
From: Cornelia Huck @ 2020-02-25 18:08 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Viktor Mihajlovski, Janosch Frank,
	Michael S. Tsirkin, Jason Wang, Ram Pai, linux-kernel,
	virtualization, Christian Borntraeger, iommu, Michael Mueller,
	Lendacky, Thomas, David Gibson, Robin Murphy, Christoph Hellwig



On Mon, 24 Feb 2020 19:49:53 +0100
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Mon, 24 Feb 2020 14:33:14 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Fri, Feb 21, 2020 at 07:07:02PM +0100, Halil Pasic wrote:  
> > > On Fri, 21 Feb 2020 10:48:15 -0500
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > >   
> > > > On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote:  
> > > > > On Fri, 21 Feb 2020 14:27:27 +1100
> > > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > > >   
> > > > > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:  
> > > > > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:  
> > > > > > > > > From a user's perspective it makes absolutely perfect sense to use the
> > > > > > > > bounce buffers when they are NEEDED. 
> > > > > > > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > > > > > > really feels wrong. And obviously we have a severe performance issue
> > > > > > > > because of the indirections.  
> > > > > > > 
> > > > > > > The point is that the user should not have to specify iommu_platform.
> > > > > > > We need to make sure any new hypervisor (especially one that might require
> > > > > > > bounce buffering) always sets it,  
> > > > > > 
> > > > > > So, I have draft qemu patches which enable iommu_platform by default.
> > > > > > But that's really because of other problems with !iommu_platform, not
> > > > > > anything to do with bounce buffering or secure VMs.
> > > > > > 
> > > > > > The thing is that the hypervisor *doesn't* require bounce buffering.
> > > > > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> > > > > > *guest*'s choice to enter secure mode, so the hypervisor has no reason
> > > > > > to know whether the guest needs bounce buffering.  As far as the
> > > > > > hypervisor and qemu are concerned that's a guest internal detail, it
> > > > > > just expects to get addresses it can access whether those are GPAs
> > > > > > (iommu_platform=off) or IOVAs (iommu_platform=on).  
> > > > > 
> > > > > I very much agree!
> > > > >   
> > > > > >   
> > > > > > > as was a rather bogus legacy hack  
> > > > > > 
> > > > > > It was certainly a bad idea, but it was a bad idea that went into a
> > > > > > public spec and has been widely deployed for many years.  We can't
> > > > > > just pretend it didn't happen and move on.
> > > > > > 
> > > > > > Turning iommu_platform=on by default breaks old guests, some of which
> > > > > > we still care about.  We can't (automatically) do it only for guests
> > > > > > that need bounce buffering, because the hypervisor doesn't know that
> > > > > > ahead of time.  

We could default to iommu_platform=on on s390 when the host has active
support for protected virtualization... but that's just another kind of
horrible, so let's just pretend I didn't suggest it.

> > > > > 
> > > > > Turning iommu_platform=on for virtio-ccw makes no sense whatsoever,
> > > > > because for CCW I/O there is no such thing as IOMMU and the addresses
> > > > > are always physical addresses.  
> > > > 
> > > > Fix the name then. The spec calls is ACCESS_PLATFORM now, which
> > > > makes much more sense.  
> > > 
> > > I don't quite get it. Sorry. Maybe I will revisit this later.  
> > 
> > Halil, I think I can clarify this.
> > 
> > The "iommu_platform" flag doesn't necessarily have anything to do with
> > an iommu, although it often will.  Basically it means "access guest
> > memory via the bus's normal DMA mechanism" rather than "access guest
> > memory using GPA, because you're the hypervisor and you can do that".
> >   
> 
> Unfortunately, I don't think this is what is conveyed to the end users.
> Let's see what we have documented:
> 
> Neither Qemu user documentation
> (https://www.qemu.org/docs/master/qemu-doc.html) nor online help:
> qemu-system-s390x -device virtio-net-ccw,?|grep iommu
>   iommu_platform=<bool>  - on/off (default: false)
> has any documentation on it.

Now, that's 'helpful' -- this certainly calls out for a bit of doc...

> 
> But libvirt does have documentation on the knob that controls
> iommu_platform for QEMU (when QEMU is managed by libvirt):
> """
> Virtio-related options 
> 
> QEMU's virtio devices have some attributes related to the virtio
> transport under the driver element: The iommu attribute enables the use
> of emulated IOMMU by the device. The attribute ats controls the Address
> Translation Service support for PCIe devices. This is needed to make use
> of IOTLB support (see IOMMU device). Possible values are on or off.
> Since 3.5.0 
> """
> (https://libvirt.org/formatdomain.html#elementsVirtio)
> 
> Thus it seems the only available documentation says that it "enables the use
> of emulated IOMMU by the device".
> 
> And for vhost-user we have
> """
> When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has not been negotiated:
> 
> * Guest addresses map to the vhost memory region containing that guest
>   address.
> 
> When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has been negotiated:
> 
> * Guest addresses are also called I/O virtual addresses (IOVAs).  They are
>   translated to user addresses via the IOTLB.
> """
> (docs/interop/vhost-user.rst)
> 
> > For the case of ccw, both mechanisms end up being the same thing,
> > since CCW's normal DMA *is* untranslated GPA access.
> >   
> 
> Nod.
> 
> > For this reason, the flag in the spec was renamed to ACCESS_PLATFORM,
> > but the flag in qemu still has the old name.
> >   
> 
> Yes, the name in the spec is more neutral.
> 
> > AIUI, Michael is saying you could trivially change the name in qemu
> > (obviously you'd need to alias the old name to the new one for
> > compatibility).
> >   
> 
> I could, and then I could also ask the libvirt guys to change <driver
> iommu='X'> to <driver access_platform='X'> or similar and to change  
> their documentation to something that is harder to comprehend. Although
> I'm not sure they would like the idea.

Hopefully, the documentation can be changed to something that is _not_
harder to comprehend :) (with a bit more text, I suppose.) Renaming to
something like access_platform seems like a good idea, even with the
required compat dance.

> 
> > 
> > Actually, the fact that ccw has no translation makes things easier for
> > you: you don't really have any impediment to turning ACCESS_PLATFORM
> > on by default, since it doesn't make any real change to how things
> > work.  
> 
> Yeah, in theory it should not, but in practice it currently does.
> vhost will try to do the IOTLB dance (Jason has a patch that
> should help with that), and we get the 'use dma api' side effects in the
> guest (e.g. the virtqueue's data goes below 2G, plus some overhead).

Nod.

> 
> > 
> > The remaining difficulty is that the virtio driver - since it can sit
> > on multiple buses - won't know this, and will reject the
> > ACCESS_PLATFORM flag, even though it could just do what it normally
> > does on ccw and it would work.  
> 
> Right, ACCESS_PLATFORM is a funny feature: the device refuses to
> work if the driver does not ack it.
> 
> > 
> > For that case, we could consider a hack in qemu where for virtio-ccw
> > devices *only* we allow the guest to nack the ACCESS_PLATFORM flag and
> > carry on anyway.  Normally we insist that the guest accept the
> > ACCESS_PLATFORM flag if offered, because on most platforms they
> > *don't* amount to the same thing.  
> 
> Jason found a nice way to differentiate between 'needs translation' and
> 'does not need translation'. But that patch still requires the ack by the
> driver (and, as Michael has pointed out, we have to consider migration).
> 
> I'm afraid that F_IOMMU_PLATFORM means different things in different
> contexts, and that this ain't sufficiently documented. I'm tempted to do
> a proper write-up on this (let's hope my motivation and my time
> will allow it). I would also very much like to have Conny's opinion on this.

More documentation is never a bad idea; but I'm afraid I don't have any
further insights at the moment.


* Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
  2020-02-25 18:08                     ` Cornelia Huck
@ 2020-02-28  0:23                       ` David Gibson
  0 siblings, 0 replies; 58+ messages in thread
From: David Gibson @ 2020-02-28  0:23 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: linux-s390, Janosch Frank, Michael S. Tsirkin, Jason Wang,
	Ram Pai, linux-kernel, virtualization, Halil Pasic,
	Christian Borntraeger, iommu, Michael Mueller, Lendacky, Thomas,
	Viktor Mihajlovski, Robin Murphy, Christoph Hellwig



On Tue, Feb 25, 2020 at 07:08:02PM +0100, Cornelia Huck wrote:
> On Mon, 24 Feb 2020 19:49:53 +0100
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Mon, 24 Feb 2020 14:33:14 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> > 
> > > On Fri, Feb 21, 2020 at 07:07:02PM +0100, Halil Pasic wrote:  
> > > > On Fri, 21 Feb 2020 10:48:15 -0500
> > > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > >   
> > > > > On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote:  
> > > > > > On Fri, 21 Feb 2020 14:27:27 +1100
> > > > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > > > >   
> > > > > > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:  
> > > > > > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:  
> > > > > > > > > From a user's perspective it makes absolutely perfect sense to use the
> > > > > > > > > bounce buffers when they are NEEDED. 
> > > > > > > > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > > > > > > > really feels wrong. And obviously we have a severe performance issue
> > > > > > > > > because of the indirections.  
> > > > > > > > 
> > > > > > > > The point is that the user should not have to specify iommu_platform.
> > > > > > > > We need to make sure any new hypervisor (especially one that might require
> > > > > > > > bounce buffering) always sets it,  
> > > > > > > 
> > > > > > > So, I have draft qemu patches which enable iommu_platform by default.
> > > > > > > But that's really because of other problems with !iommu_platform, not
> > > > > > > anything to do with bounce buffering or secure VMs.
> > > > > > > 
> > > > > > > The thing is that the hypervisor *doesn't* require bounce buffering.
> > > > > > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> > > > > > > *guest*'s choice to enter secure mode, so the hypervisor has no reason
> > > > > > > to know whether the guest needs bounce buffering.  As far as the
> > > > > > > hypervisor and qemu are concerned that's a guest internal detail, it
> > > > > > > just expects to get addresses it can access whether those are GPAs
> > > > > > > (iommu_platform=off) or IOVAs (iommu_platform=on).  
> > > > > > 
> > > > > > I very much agree!
> > > > > >   
> > > > > > >   
> > > > > > > > as was a rather bogus legacy hack  
> > > > > > > 
> > > > > > > It was certainly a bad idea, but it was a bad idea that went into a
> > > > > > > public spec and has been widely deployed for many years.  We can't
> > > > > > > just pretend it didn't happen and move on.
> > > > > > > 
> > > > > > > Turning iommu_platform=on by default breaks old guests, some of which
> > > > > > > we still care about.  We can't (automatically) do it only for guests
> > > > > > > that need bounce buffering, because the hypervisor doesn't know that
> > > > > > > ahead of time.  
> 
> We could default to iommu_platform=on on s390 when the host has active
> support for protected virtualization... but that's just another kind of
> horrible, so let's just pretend I didn't suggest it.

Yeah, that would break migration between hosts with the feature and
hosts without - for everything, not just protected guests.  In general,
any kind of guest-visible configuration change based on host
properties is incompatible with the qemu/KVM migration model.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-02-24 17:16         ` Christoph Hellwig
@ 2020-10-28 14:24           ` Alexander Graf via iommu
  2020-10-28 18:01             ` Michael S. Tsirkin
  0 siblings, 1 reply; 58+ messages in thread
From: Alexander Graf via iommu @ 2020-10-28 14:24 UTC (permalink / raw)
  To: Christoph Hellwig, Michael S. Tsirkin
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, Christian Borntraeger, iommu, David Gibson,
	Michael Mueller, Viktor Mihajlovski, Robin Murphy

On 24.02.20 18:16, Christoph Hellwig wrote:
> On Sat, Feb 22, 2020 at 02:07:58PM -0500, Michael S. Tsirkin wrote:
>> On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote:
>>> AFAIU you have a positive attitude towards the idea that
>>> !VIRTIO_F_IOMMU_PLATFORM implies 'no DMA API is used by virtio'
>>> should be scrapped.
>>>
>>> I would like to accomplish that without adverse effects on virtio-ccw
>>> (because caring for virtio-ccw is part of my job description).
>>>
>>> Regards,
>>> Halil
>>
>> It is possible, in theory. IIRC the main challenge is that the DMA API
>> has the overhead of indirect function calls even when all it
>> does is return the PA unchanged.
> 
> That overhead is gone now; the DMA direct calls are direct calls these
> days.

Michael, would that mitigate your concerns about just always using the 
DMA API? If not, wouldn't it make sense to benchmark and point Christoph 
to the paths that slow things down, so we can finally move virtio into a 
world where it doesn't bypass the kernel DMA infrastructure?
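
[Editor's note: the "direct calls" Christoph refers to are the
dma-direct bypass merged around v5.1. A simplified sketch of the fast
path in kernel/dma/mapping.c as of the v5.6 era; exact code differs by
version:

dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page,
		size_t offset, size_t size, enum dma_data_direction dir,
		unsigned long attrs)
{
	const struct dma_map_ops *ops = get_dma_ops(dev);

	if (dma_is_direct(ops))
		/* no ops structure: call straight into dma-direct */
		return dma_direct_map_page(dev, page, offset, size, dir, attrs);

	return ops->map_page(dev, page, offset, size, dir, attrs);
}

So for a device with no IOMMU (and no bounce buffering), the per-buffer
cost is a predictable branch plus a direct call rather than an indirect
function call.]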


Alex

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected
  2020-10-28 14:24           ` Alexander Graf via iommu
@ 2020-10-28 18:01             ` Michael S. Tsirkin
  0 siblings, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2020-10-28 18:01 UTC (permalink / raw)
  To: Alexander Graf
  Cc: linux-s390, Janosch Frank, Lendacky, Thomas, Jason Wang,
	Cornelia Huck, Ram Pai, linux-kernel, virtualization,
	Halil Pasic, Christian Borntraeger, iommu, David Gibson,
	Michael Mueller, Viktor Mihajlovski, Robin Murphy,
	Christoph Hellwig

On Wed, Oct 28, 2020 at 03:24:03PM +0100, Alexander Graf wrote:
> On 24.02.20 18:16, Christoph Hellwig wrote:
> > On Sat, Feb 22, 2020 at 02:07:58PM -0500, Michael S. Tsirkin wrote:
> > > On Fri, Feb 21, 2020 at 03:33:40PM +0100, Halil Pasic wrote:
> > > > AFAIU you have a positive attitude towards the idea that
> > > > !VIRTIO_F_IOMMU_PLATFORM implies 'no DMA API is used by virtio'
> > > > should be scrapped.
> > > > 
> > > > I would like to accomplish that without adverse effects on virtio-ccw
> > > > (because caring for virtio-ccw is part of my job description).
> > > > 
> > > > Regards,
> > > > Halil
> > > 
> > > It is possible, in theory. IIRC the main challenge is that the DMA API
> > > has the overhead of indirect function calls even when all it
> > > does is return the PA unchanged.
> > 
> > That overhead is gone now; the DMA direct calls are direct calls these
> > days.
> 
> Michael, would that mitigate your concerns about just always using the DMA
> API? If not, wouldn't it make sense to benchmark and point Christoph to the
> paths that slow things down, so we can finally move virtio into a world
> where it doesn't bypass the kernel DMA infrastructure?
> 
> 
> Alex


There's no specific concern, really. One can in theory move the code
handling !VIRTIO_F_ACCESS_PLATFORM out of virtio: instead of having a
branch in the virtio code, you override the platform DMA API for the
device and then use the DMA API unconditionally.
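
[Editor's note: a minimal sketch of what such an override could look
like, assuming an identity mapping for !VIRTIO_F_ACCESS_PLATFORM
devices. The names below are invented for illustration and do not
refer to an existing patch:

#include <linux/dma-mapping.h>

/* For a !VIRTIO_F_ACCESS_PLATFORM device, "DMA" addresses are simply
 * guest physical addresses, so the ops can be pure pass-throughs. */
static dma_addr_t virtio_direct_map_page(struct device *dev,
		struct page *page, unsigned long offset, size_t size,
		enum dma_data_direction dir, unsigned long attrs)
{
	return page_to_phys(page) + offset;
}

static void virtio_direct_unmap_page(struct device *dev, dma_addr_t addr,
		size_t size, enum dma_data_direction dir, unsigned long attrs)
{
	/* identity mapping: nothing to undo */
}

static const struct dma_map_ops virtio_direct_dma_ops = {
	.map_page	= virtio_direct_map_page,
	.unmap_page	= virtio_direct_unmap_page,
};

A transport would install this with set_dma_ops(dev,
&virtio_direct_dma_ops) at probe time when VIRTIO_F_ACCESS_PLATFORM was
not negotiated, after which the virtio core could go through the DMA
API unconditionally, branch-free.]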

The gain from that will probably be marginal, but maybe I'm wrong.
Let's see the patches.

We still need to do the override when !VIRTIO_F_ACCESS_PLATFORM,
according to the spec. Hypervisors running encrypted or protected
guests still need to set VIRTIO_F_ACCESS_PLATFORM.


Is VIRTIO_F_ACCESS_PLATFORM a good default? The need to support
legacy guests does not allow that at the qemu level; we
need to be conservative there. But yes, if you have a very
modern guest then it might be a good idea to just always
enable VIRTIO_F_ACCESS_PLATFORM. I say might because unfortunately
most people only use VIRTIO_F_ACCESS_PLATFORM together with a
vIOMMU; using it without a vIOMMU is supported
but is a path less well trodden. Benchmarking that and
fixing any issues that surface would be imho effort
well spent, and would be a prerequisite to making it
the default in a production system.
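
[Editor's note: for anyone wanting to benchmark that less-trodden path,
VIRTIO_F_ACCESS_PLATFORM can be enabled per device through qemu's
iommu_platform property; a hypothetical invocation (other options
elided, s390x flavour to match the thread):

  qemu-system-s390x ... -device virtio-net-ccw,iommu_platform=on

The guest then negotiates VIRTIO_F_ACCESS_PLATFORM for that device and
goes through the DMA API for it, even with no vIOMMU present.]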



^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2020-10-28 18:01 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Halil Pasic
2020-02-20 16:06 ` [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h Halil Pasic
2020-02-20 16:11   ` Christoph Hellwig
2020-02-20 16:23     ` Christian Borntraeger
2020-02-20 16:31       ` Christoph Hellwig
2020-02-20 17:00         ` Christian Borntraeger
2020-02-21  3:27         ` David Gibson
2020-02-21 13:06           ` Halil Pasic
2020-02-21 15:48             ` Michael S. Tsirkin
2020-02-21 18:07               ` Halil Pasic
2020-02-24  3:33                 ` David Gibson
2020-02-24 18:49                   ` Halil Pasic
2020-02-25 18:08                     ` Cornelia Huck
2020-02-28  0:23                       ` David Gibson
2020-02-20 16:06 ` [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected Halil Pasic
2020-02-20 16:13   ` Christoph Hellwig
2020-02-21  2:59     ` David Gibson
2020-02-21  3:41       ` Jason Wang
2020-02-21 13:31         ` Halil Pasic
2020-02-21 13:27       ` Halil Pasic
2020-02-21 16:36       ` Christoph Hellwig
2020-02-24  6:50         ` David Gibson
2020-02-24 18:59         ` Halil Pasic
2020-02-21 14:33     ` Halil Pasic
2020-02-21 16:39       ` Christoph Hellwig
2020-02-21 18:16         ` Halil Pasic
2020-02-22 19:07       ` Michael S. Tsirkin
2020-02-24 17:16         ` Christoph Hellwig
2020-10-28 14:24           ` Alexander Graf via iommu
2020-10-28 18:01             ` Michael S. Tsirkin
2020-02-20 20:55   ` Michael S. Tsirkin
2020-02-21  1:17     ` Ram Pai
2020-02-21  3:29       ` David Gibson
2020-02-21 13:12     ` Halil Pasic
2020-02-21 15:39       ` Tom Lendacky
2020-02-24  6:40         ` David Gibson
2020-02-21 15:56       ` Michael S. Tsirkin
2020-02-21 16:35         ` Christoph Hellwig
2020-02-21 18:03         ` Halil Pasic
2020-02-20 20:48 ` [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Michael S. Tsirkin
2020-02-20 21:29 ` Michael S. Tsirkin
2020-02-21 13:37   ` Halil Pasic
2020-02-20 21:33 ` Michael S. Tsirkin
2020-02-21 13:49   ` Halil Pasic
2020-02-21 16:41   ` Christoph Hellwig
2020-02-24  5:44     ` David Gibson
2020-02-21  6:22 ` Jason Wang
2020-02-21 14:56   ` Halil Pasic
2020-02-24  3:38     ` David Gibson
2020-02-24  4:01     ` Jason Wang
2020-02-24  6:06       ` Michael S. Tsirkin
2020-02-24  6:45         ` Jason Wang
2020-02-24  7:48           ` Michael S. Tsirkin
2020-02-24  9:26             ` Jason Wang
2020-02-24 13:40               ` Michael S. Tsirkin
2020-02-25  3:38                 ` Jason Wang
2020-02-24 13:56               ` Halil Pasic
2020-02-25  3:30                 ` Jason Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).