* [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
@ 2019-01-29 17:08 ` Thiago Jung Bauermann
  0 siblings, 0 replies; 198+ messages in thread
From: Thiago Jung Bauermann @ 2019-01-29 17:08 UTC (permalink / raw)
To: virtualization
Cc: linuxppc-devel, iommu, linux-kernel, Michael S. Tsirkin, Jason Wang,
    Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras,
    Benjamin Herrenschmidt, Ram Pai

Hello,

With Christoph's rework of the DMA API that recently landed, the patch
below is the only change needed in virtio to make it work in a POWER
secure guest under the ultravisor.

The other change we need (making sure the device's dma_map_ops is NULL
so that the dma-direct/swiotlb code is used) can be made in
powerpc-specific code.

Of course, I also have patches (soon to be posted as RFC) which hook up
<linux/mem_encrypt.h> to the powerpc secure guest support code.

What do you think?

From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001
From: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Date: Thu, 24 Jan 2019 22:08:02 -0200
Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted

The host can't access the guest memory when it's encrypted, so using
regular memory pages for the ring isn't an option. Go through the DMA API.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
 drivers/virtio/virtio_ring.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cd7e755484e3..321a27075380 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
 	 * not work without an even larger kludge.  Instead, enable
 	 * the DMA API if we're a Xen guest, which at least allows
 	 * all of the sensible Xen configurations to work correctly.
+	 *
+	 * Also, if guest memory is encrypted the host can't access
+	 * it directly. In this case, we'll need to use the DMA API.
 	 */
-	if (xen_domain())
+	if (xen_domain() || sev_active())
 		return true;

 	return false;

^ permalink raw reply related	[flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
  2019-01-29 17:08 ` Thiago Jung Bauermann
@ 2019-01-29 17:42 ` Thiago Jung Bauermann
  -1 siblings, 0 replies; 198+ messages in thread
From: Thiago Jung Bauermann @ 2019-01-29 17:42 UTC (permalink / raw)
To: virtualization
Cc: linuxppc-dev, iommu, linux-kernel, Michael S. Tsirkin, Jason Wang,
    Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras,
    Benjamin Herrenschmidt, Ram Pai

Fixing address of powerpc mailing list.

Thiago Jung Bauermann <bauerman@linux.ibm.com> writes:

> Hello,
>
> With Christoph's rework of the DMA API that recently landed, the patch
> below is the only change needed in virtio to make it work in a POWER
> secure guest under the ultravisor.
>
> The other change we need (making sure the device's dma_map_ops is NULL
> so that the dma-direct/swiotlb code is used) can be made in
> powerpc-specific code.
>
> Of course, I also have patches (soon to be posted as RFC) which hook up
> <linux/mem_encrypt.h> to the powerpc secure guest support code.
>
> What do you think?
>
> From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001
> From: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Date: Thu, 24 Jan 2019 22:08:02 -0200
> Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
>
> The host can't access the guest memory when it's encrypted, so using
> regular memory pages for the ring isn't an option. Go through the DMA API.
>
> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> ---
>  drivers/virtio/virtio_ring.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cd7e755484e3..321a27075380 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>  	 * not work without an even larger kludge.  Instead, enable
>  	 * the DMA API if we're a Xen guest, which at least allows
>  	 * all of the sensible Xen configurations to work correctly.
> +	 *
> +	 * Also, if guest memory is encrypted the host can't access
> +	 * it directly. In this case, we'll need to use the DMA API.
>  	 */
> -	if (xen_domain())
> +	if (xen_domain() || sev_active())
>  		return true;
>
>  	return false;

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
  2019-01-29 17:42 ` Thiago Jung Bauermann
@ 2019-01-29 19:02 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 198+ messages in thread
From: Michael S. Tsirkin @ 2019-01-29 19:02 UTC (permalink / raw)
To: Thiago Jung Bauermann
Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang,
    Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras,
    Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker

On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote:
>
> Fixing address of powerpc mailing list.
>
> Thiago Jung Bauermann <bauerman@linux.ibm.com> writes:
>
> > Hello,
> >
> > With Christoph's rework of the DMA API that recently landed, the patch
> > below is the only change needed in virtio to make it work in a POWER
> > secure guest under the ultravisor.
> >
> > The other change we need (making sure the device's dma_map_ops is NULL
> > so that the dma-direct/swiotlb code is used) can be made in
> > powerpc-specific code.
> >
> > Of course, I also have patches (soon to be posted as RFC) which hook up
> > <linux/mem_encrypt.h> to the powerpc secure guest support code.
> >
> > What do you think?
> >
> > From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001
> > From: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> > Date: Thu, 24 Jan 2019 22:08:02 -0200
> > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
> >
> > The host can't access the guest memory when it's encrypted, so using
> > regular memory pages for the ring isn't an option. Go through the DMA API.
> >
> > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>

Well I think this will come back to bite us (witness xen which is now
reworking precisely this path - but at least they aren't to blame, xen
came before ACCESS_PLATFORM).

I also still think the right thing would have been to set
ACCESS_PLATFORM for all systems where the device can't access all memory.

But I also think I don't have the energy to argue about power secure
guest anymore. So be it for power secure guest since the involved
engineers disagree with me. Hey, I've been wrong in the past ;).

But the name "sev_active" makes me scared because at least the AMD guys
who were doing the sensible thing and setting ACCESS_PLATFORM (unless
I'm wrong? I remember distinctly that's so) will likely be affected too.
We don't want that.

So let's find a way to make sure it's just power secure guest for now
pls.

I also think we should add a dma_api near features under virtio_device
such that these hacks can move off the data path.

By the way, could you please respond about virtio-iommu and
why there's no support for ACCESS_PLATFORM on power?

I have Cc'd you on these discussions.

Thanks!

> > ---
> >  drivers/virtio/virtio_ring.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index cd7e755484e3..321a27075380 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  	 * not work without an even larger kludge.  Instead, enable
> >  	 * the DMA API if we're a Xen guest, which at least allows
> >  	 * all of the sensible Xen configurations to work correctly.
> > +	 *
> > +	 * Also, if guest memory is encrypted the host can't access
> > +	 * it directly. In this case, we'll need to use the DMA API.
> >  	 */
> > -	if (xen_domain())
> > +	if (xen_domain() || sev_active())
> >  		return true;
> >
> >  	return false;
>
> --
> Thiago Jung Bauermann
> IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
  2019-01-29 19:02 ` Michael S. Tsirkin
@ 2019-01-30  2:24 ` Jason Wang
  -1 siblings, 0 replies; 198+ messages in thread
From: Jason Wang @ 2019-01-30 2:24 UTC (permalink / raw)
To: Michael S. Tsirkin, Thiago Jung Bauermann
Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Christoph Hellwig,
    David Gibson, Alexey Kardashevskiy, Paul Mackerras,
    Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker

On 2019/1/30 3:02 AM, Michael S. Tsirkin wrote:
> On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote:
>> Fixing address of powerpc mailing list.
>>
>> Thiago Jung Bauermann <bauerman@linux.ibm.com> writes:
>>
>>> Hello,
>>>
>>> With Christoph's rework of the DMA API that recently landed, the patch
>>> below is the only change needed in virtio to make it work in a POWER
>>> secure guest under the ultravisor.
>>>
>>> The other change we need (making sure the device's dma_map_ops is NULL
>>> so that the dma-direct/swiotlb code is used) can be made in
>>> powerpc-specific code.
>>>
>>> Of course, I also have patches (soon to be posted as RFC) which hook up
>>> <linux/mem_encrypt.h> to the powerpc secure guest support code.
>>>
>>> What do you think?
>>>
>>> From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001
>>> From: Thiago Jung Bauermann <bauerman@linux.ibm.com>
>>> Date: Thu, 24 Jan 2019 22:08:02 -0200
>>> Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
>>>
>>> The host can't access the guest memory when it's encrypted, so using
>>> regular memory pages for the ring isn't an option. Go through the DMA API.
>>>
>>> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Well I think this will come back to bite us (witness xen which is now
> reworking precisely this path - but at least they aren't to blame, xen
> came before ACCESS_PLATFORM).
>
> I also still think the right thing would have been to set
> ACCESS_PLATFORM for all systems where the device can't access all memory.
>
> But I also think I don't have the energy to argue about power secure
> guest anymore. So be it for power secure guest since the involved
> engineers disagree with me. Hey I've been wrong in the past ;).
>
> But the name "sev_active" makes me scared because at least the AMD guys
> who were doing the sensible thing and setting ACCESS_PLATFORM (unless
> I'm wrong? I remember distinctly that's so) will likely be affected too.
> We don't want that.
>
> So let's find a way to make sure it's just power secure guest for now
> pls.
>
> I also think we should add a dma_api near features under virtio_device
> such that these hacks can move off data path.

Anyway the current Xen code conflicts with the spec, which says:

"If this feature bit is set to 0, then the device has same access to
memory addresses supplied to it as the driver has. In particular, the
device will always use physical addresses matching addresses used by
the driver (typically meaning physical addresses used by the CPU) and
not translated further, and can access any address supplied to it by
the driver. When clear, this overrides any platform-specific description
of whether device access is limited or translated in any way, e.g.
whether an IOMMU may be present."

I wonder how much value the above description can give us. It's kind of
odd that the behavior of "when the feature is not negotiated" is
described in the spec. Personally I think we can remove the above and
then we can switch to using the DMA API unconditionally in the guest
driver. It may cause a single-digit regression; we can try to overcome
it.

Thanks

> By the way could you please respond about virtio-iommu and
> why there's no support for ACCESS_PLATFORM on power?
>
> I have Cc'd you on these discussions.
>
> Thanks!
>
>>> ---
>>>  drivers/virtio/virtio_ring.c | 5 ++++-
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>> index cd7e755484e3..321a27075380 100644
>>> --- a/drivers/virtio/virtio_ring.c
>>> +++ b/drivers/virtio/virtio_ring.c
>>> @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>>>  	 * not work without an even larger kludge.  Instead, enable
>>>  	 * the DMA API if we're a Xen guest, which at least allows
>>>  	 * all of the sensible Xen configurations to work correctly.
>>> +	 *
>>> +	 * Also, if guest memory is encrypted the host can't access
>>> +	 * it directly. In this case, we'll need to use the DMA API.
>>>  	 */
>>> -	if (xen_domain())
>>> +	if (xen_domain() || sev_active())
>>>  		return true;
>>>
>>>  	return false;
>>
>> --
>> Thiago Jung Bauermann
>> IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
  2019-01-30  2:24 ` Jason Wang
@ 2019-01-30  2:36 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 198+ messages in thread
From: Michael S. Tsirkin @ 2019-01-30 2:36 UTC (permalink / raw)
To: Jason Wang
Cc: Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy,
    Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu,
    linuxppc-dev, Christoph Hellwig, Thiago Jung Bauermann, David Gibson

On Wed, Jan 30, 2019 at 10:24:01AM +0800, Jason Wang wrote:
>
> On 2019/1/30 3:02 AM, Michael S. Tsirkin wrote:
> > On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote:
> > > Fixing address of powerpc mailing list.
> > >
> > > Thiago Jung Bauermann <bauerman@linux.ibm.com> writes:
> > >
> > > > Hello,
> > > >
> > > > With Christoph's rework of the DMA API that recently landed, the patch
> > > > below is the only change needed in virtio to make it work in a POWER
> > > > secure guest under the ultravisor.
> > > >
> > > > The other change we need (making sure the device's dma_map_ops is NULL
> > > > so that the dma-direct/swiotlb code is used) can be made in
> > > > powerpc-specific code.
> > > >
> > > > Of course, I also have patches (soon to be posted as RFC) which hook up
> > > > <linux/mem_encrypt.h> to the powerpc secure guest support code.
> > > >
> > > > What do you think?
> > > >
> > > > From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001
> > > > From: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> > > > Date: Thu, 24 Jan 2019 22:08:02 -0200
> > > > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
> > > >
> > > > The host can't access the guest memory when it's encrypted, so using
> > > > regular memory pages for the ring isn't an option. Go through the DMA API.
> > > >
> > > > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> > Well I think this will come back to bite us (witness xen which is now
> > reworking precisely this path - but at least they aren't to blame, xen
> > came before ACCESS_PLATFORM).
> >
> > I also still think the right thing would have been to set
> > ACCESS_PLATFORM for all systems where the device can't access all memory.
> >
> > But I also think I don't have the energy to argue about power secure
> > guest anymore. So be it for power secure guest since the involved
> > engineers disagree with me. Hey I've been wrong in the past ;).
> >
> > But the name "sev_active" makes me scared because at least the AMD guys
> > who were doing the sensible thing and setting ACCESS_PLATFORM (unless
> > I'm wrong? I remember distinctly that's so) will likely be affected too.
> > We don't want that.
> >
> > So let's find a way to make sure it's just power secure guest for now
> > pls.
> >
> > I also think we should add a dma_api near features under virtio_device
> > such that these hacks can move off data path.
>
> Anyway the current Xen code conflicts with the spec, which says:
>
> "If this feature bit is set to 0, then the device has same access to
> memory addresses supplied to it as the driver has. In particular, the
> device will always use physical addresses matching addresses used by
> the driver (typically meaning physical addresses used by the CPU) and
> not translated further, and can access any address supplied to it by
> the driver. When clear, this overrides any platform-specific description
> of whether device access is limited or translated in any way, e.g.
> whether an IOMMU may be present."
>
> I wonder how much value the above description can give us. It's kind of
> odd that the behavior of "when the feature is not negotiated" is
> described in the spec.

Hmm what's odd about it? We need to describe the behaviour in all cases.

> Personally I think we can remove the above and then we can switch to
> using the DMA API unconditionally in the guest driver. It may cause a
> single-digit regression; we can try to overcome it.
>
> Thanks

This has been discussed ad nauseam. virtio is all about compatibility.
Losing a couple of lines of code isn't worth breaking working setups.
People that want "just use DMA API no tricks" now have the option.
Setting a flag in a feature bit map is literally a single line of code
in the hypervisor. So stop pushing for breaking working legacy setups
and just fix it in the right place.

> > By the way could you please respond about virtio-iommu and
> > why there's no support for ACCESS_PLATFORM on power?
> >
> > I have Cc'd you on these discussions.
> >
> > Thanks!
> >
> > > > ---
> > > >  drivers/virtio/virtio_ring.c | 5 ++++-
> > > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index cd7e755484e3..321a27075380 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > > >  	 * not work without an even larger kludge.  Instead, enable
> > > >  	 * the DMA API if we're a Xen guest, which at least allows
> > > >  	 * all of the sensible Xen configurations to work correctly.
> > > > +	 *
> > > > +	 * Also, if guest memory is encrypted the host can't access
> > > > +	 * it directly. In this case, we'll need to use the DMA API.
> > > >  	 */
> > > > -	if (xen_domain())
> > > > +	if (xen_domain() || sev_active())
> > > >  		return true;
> > > >
> > > >  	return false;
> > >
> > > --
> > > Thiago Jung Bauermann
> > > IBM Linux Technology Center

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
  2019-01-30  2:24         ` Jason Wang
@ 2019-01-30  2:36           ` Michael S. Tsirkin
  0 siblings, 0 replies; 198+ messages in thread
From: Michael S. Tsirkin @ 2019-01-30  2:36 UTC (permalink / raw)
  To: Jason Wang
  Cc: Thiago Jung Bauermann, virtualization, linuxppc-dev, iommu,
	linux-kernel, Christoph Hellwig, David Gibson,
	Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt,
	Ram Pai, Jean-Philippe Brucker

On Wed, Jan 30, 2019 at 10:24:01AM +0800, Jason Wang wrote:
>
> On 2019/1/30 上午3:02, Michael S. Tsirkin wrote:
> > On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote:
> > > Fixing address of powerpc mailing list.
> > >
> > > Thiago Jung Bauermann <bauerman@linux.ibm.com> writes:
> > >
> > > > Hello,
> > > >
> > > > With Christoph's rework of the DMA API that recently landed, the patch
> > > > below is the only change needed in virtio to make it work in a POWER
> > > > secure guest under the ultravisor.
> > > >
> > > > The other change we need (making sure the device's dma_map_ops is NULL
> > > > so that the dma-direct/swiotlb code is used) can be made in
> > > > powerpc-specific code.
> > > >
> > > > Of course, I also have patches (soon to be posted as RFC) which hook up
> > > > <linux/mem_encrypt.h> to the powerpc secure guest support code.
> > > >
> > > > What do you think?
> > > >
> > > > From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001
> > > > From: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> > > > Date: Thu, 24 Jan 2019 22:08:02 -0200
> > > > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
> > > >
> > > > The host can't access the guest memory when it's encrypted, so using
> > > > regular memory pages for the ring isn't an option. Go through the DMA API.
> > > >
> > > > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> > Well I think this will come back to bite us (witness xen which is now
> > reworking precisely this path - but at least they aren't to blame, xen
> > came before ACCESS_PLATFORM).
> >
> > I also still think the right thing would have been to set
> > ACCESS_PLATFORM for all systems where the device can't access all memory.
> >
> > But I also think I don't have the energy to argue about power secure
> > guest anymore. So be it for power secure guest since the involved
> > engineers disagree with me. Hey, I've been wrong in the past ;).
> >
> > But the name "sev_active" makes me scared because at least the AMD guys
> > who were doing the sensible thing and setting ACCESS_PLATFORM (unless
> > I'm wrong? I remember distinctly that's so) will likely be affected
> > too. We don't want that.
> >
> > So let's find a way to make sure it's just power secure guest for now
> > pls.
> >
> > I also think we should add a dma_api near features under virtio_device
> > such that these hacks can move off the data path.
>
> Anyway the current Xen code conflicts with the spec, which says:
>
> "If this feature bit is set to 0, then the device has the same access to
> memory addresses supplied to it as the driver has. In particular, the
> device will always use physical addresses matching addresses used by the
> driver (typically meaning physical addresses used by the CPU) and not
> translated further, and can access any address supplied to it by the
> driver. When clear, this overrides any platform-specific description of
> whether device access is limited or translated in any way, e.g. whether
> an IOMMU may be present."
>
> I wonder how much value the above description can give us. It's kind of
> odd that the behavior of "when the feature is not negotiated" is
> described in the spec.

Hmm, what's odd about it? We need to describe the behaviour in all cases.

> Personally I think we can remove the above, and then we can switch to
> using the DMA API unconditionally in the guest driver. It may cause a
> single-digit regression, probably; we can try to overcome it.
>
> Thanks

This has been discussed ad nauseam. virtio is all about compatibility.
Losing a couple of lines of code isn't worth breaking working setups.
People that want "just use DMA API, no tricks" now have the option.
Setting a flag in a feature bit map is literally a single line of code
in the hypervisor. So stop pushing for breaking working legacy setups
and just fix it in the right place.

> > By the way, could you please respond about virtio-iommu and why
> > there's no support for ACCESS_PLATFORM on power?
> >
> > I have Cc'd you on these discussions.
> >
> >
> > Thanks!
> >
> >
> > > > ---
> > > >  drivers/virtio/virtio_ring.c | 5 ++++-
> > > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index cd7e755484e3..321a27075380 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > > >  	 * not work without an even larger kludge.  Instead, enable
> > > >  	 * the DMA API if we're a Xen guest, which at least allows
> > > >  	 * all of the sensible Xen configurations to work correctly.
> > > > +	 *
> > > > +	 * Also, if guest memory is encrypted the host can't access
> > > > +	 * it directly. In this case, we'll need to use the DMA API.
> > > >  	 */
> > > > -	if (xen_domain())
> > > > +	if (xen_domain() || sev_active())
> > > >  		return true;
> > > >
> > > >  	return false;
> > >
> > > --
> > > Thiago Jung Bauermann
> > > IBM Linux Technology Center
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
  2019-01-30  2:36           ` Michael S. Tsirkin
@ 2019-01-30  3:05             ` Jason Wang
  0 siblings, 0 replies; 198+ messages in thread
From: Jason Wang @ 2019-01-30  3:05 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jean-Philippe Brucker, Benjamin Herrenschmidt,
	Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization,
	Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig,
	Thiago Jung Bauermann, David Gibson

On 2019/1/30 上午10:36, Michael S. Tsirkin wrote:
> On Wed, Jan 30, 2019 at 10:24:01AM +0800, Jason Wang wrote:
>> On 2019/1/30 上午3:02, Michael S. Tsirkin wrote:
>>> On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote:
>>>> Fixing address of powerpc mailing list.
>>>>
>>>> Thiago Jung Bauermann <bauerman@linux.ibm.com> writes:
>>>>
>>>>> Hello,
>>>>>
>>>>> With Christoph's rework of the DMA API that recently landed, the patch
>>>>> below is the only change needed in virtio to make it work in a POWER
>>>>> secure guest under the ultravisor.
>>>>>
>>>>> The other change we need (making sure the device's dma_map_ops is NULL
>>>>> so that the dma-direct/swiotlb code is used) can be made in
>>>>> powerpc-specific code.
>>>>>
>>>>> Of course, I also have patches (soon to be posted as RFC) which hook up
>>>>> <linux/mem_encrypt.h> to the powerpc secure guest support code.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001
>>>>> From: Thiago Jung Bauermann <bauerman@linux.ibm.com>
>>>>> Date: Thu, 24 Jan 2019 22:08:02 -0200
>>>>> Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
>>>>>
>>>>> The host can't access the guest memory when it's encrypted, so using
>>>>> regular memory pages for the ring isn't an option. Go through the DMA API.
>>>>>
>>>>> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
>>> Well I think this will come back to bite us (witness xen which is now
>>> reworking precisely this path - but at least they aren't to blame, xen
>>> came before ACCESS_PLATFORM).
>>>
>>> I also still think the right thing would have been to set
>>> ACCESS_PLATFORM for all systems where the device can't access all memory.
>>>
>>> But I also think I don't have the energy to argue about power secure
>>> guest anymore. So be it for power secure guest since the involved
>>> engineers disagree with me. Hey, I've been wrong in the past ;).
>>>
>>> But the name "sev_active" makes me scared because at least the AMD guys
>>> who were doing the sensible thing and setting ACCESS_PLATFORM (unless
>>> I'm wrong? I remember distinctly that's so) will likely be affected
>>> too. We don't want that.
>>>
>>> So let's find a way to make sure it's just power secure guest for now
>>> pls.
>>>
>>> I also think we should add a dma_api near features under virtio_device
>>> such that these hacks can move off the data path.
>>
>> Anyway the current Xen code conflicts with the spec, which says:
>>
>> "If this feature bit is set to 0, then the device has the same access to
>> memory addresses supplied to it as the driver has. In particular, the
>> device will always use physical addresses matching addresses used by the
>> driver (typically meaning physical addresses used by the CPU) and not
>> translated further, and can access any address supplied to it by the
>> driver. When clear, this overrides any platform-specific description of
>> whether device access is limited or translated in any way, e.g. whether
>> an IOMMU may be present."
>>
>> I wonder how much value the above description can give us. It's kind of
>> odd that the behavior of "when the feature is not negotiated" is
>> described in the spec.
> Hmm, what's odd about it? We need to describe the behaviour in all cases.

Well, trying to limit the behavior of a 'legacy' driver is tricky or
even impossible. Xen is an exact example of this.

>
>> Personally I think we can remove the above, and then we can switch to
>> using the DMA API unconditionally in the guest driver. It may cause a
>> single-digit regression, probably; we can try to overcome it.
>>
>> Thanks
> This has been discussed ad nauseam. virtio is all about compatibility.
> Losing a couple of lines of code isn't worth breaking working setups.
> People that want "just use DMA API, no tricks" now have the option.
> Setting a flag in a feature bit map is literally a single line of code
> in the hypervisor. So stop pushing for breaking working legacy setups
> and just fix it in the right place.

I may have missed something: which kind of legacy setup is broken? Do
you mean using virtio without IOMMU_PLATFORM on a platform with an
IOMMU? We actually unbreak this setup.

Thanks

>
>>> By the way, could you please respond about virtio-iommu and why
>>> there's no support for ACCESS_PLATFORM on power?
>>>
>>> I have Cc'd you on these discussions.
>>>
>>>
>>> Thanks!
>>>
>>>
>>>>> ---
>>>>>  drivers/virtio/virtio_ring.c | 5 ++++-
>>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>>>> index cd7e755484e3..321a27075380 100644
>>>>> --- a/drivers/virtio/virtio_ring.c
>>>>> +++ b/drivers/virtio/virtio_ring.c
>>>>> @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>>>>>  	 * not work without an even larger kludge.  Instead, enable
>>>>>  	 * the DMA API if we're a Xen guest, which at least allows
>>>>>  	 * all of the sensible Xen configurations to work correctly.
>>>>> +	 *
>>>>> +	 * Also, if guest memory is encrypted the host can't access
>>>>> +	 * it directly. In this case, we'll need to use the DMA API.
>>>>>  	 */
>>>>> -	if (xen_domain())
>>>>> +	if (xen_domain() || sev_active())
>>>>>  		return true;
>>>>>
>>>>>  	return false;
>>>> --
>>>> Thiago Jung Bauermann
>>>> IBM Linux Technology Center
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-01-30 3:05 ` Jason Wang (?) @ 2019-01-30 3:26 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-01-30 3:26 UTC (permalink / raw) To: Jason Wang Cc: Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, Thiago Jung Bauermann, David Gibson On Wed, Jan 30, 2019 at 11:05:42AM +0800, Jason Wang wrote: > > On 2019/1/30 上午10:36, Michael S. Tsirkin wrote: > > On Wed, Jan 30, 2019 at 10:24:01AM +0800, Jason Wang wrote: > > > On 2019/1/30 上午3:02, Michael S. Tsirkin wrote: > > > > On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote: > > > > > Fixing address of powerpc mailing list. > > > > > > > > > > Thiago Jung Bauermann <bauerman@linux.ibm.com> writes: > > > > > > > > > > > Hello, > > > > > > > > > > > > With Christoph's rework of the DMA API that recently landed, the patch > > > > > > below is the only change needed in virtio to make it work in a POWER > > > > > > secure guest under the ultravisor. > > > > > > > > > > > > The other change we need (making sure the device's dma_map_ops is NULL > > > > > > so that the dma-direct/swiotlb code is used) can be made in > > > > > > powerpc-specific code. > > > > > > > > > > > > Of course, I also have patches (soon to be posted as RFC) which hook up > > > > > > <linux/mem_encrypt.h> to the powerpc secure guest support code. > > > > > > > > > > > > What do you think? 
> > > > > > > > > > > > From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 > > > > > > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > > > > > Date: Thu, 24 Jan 2019 22:08:02 -0200 > > > > > > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted > > > > > > > > > > > > The host can't access the guest memory when it's encrypted, so using > > > > > > regular memory pages for the ring isn't an option. Go through the DMA API. > > > > > > > > > > > > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > > > Well I think this will come back to bite us (witness xen which is now > > > > reworking precisely this path - but at least they aren't to blame, xen > > > > came before ACCESS_PLATFORM). > > > > > > > > I also still think the right thing would have been to set > > > > ACCESS_PLATFORM for all systems where device can't access all memory. > > > > > > > > But I also think I don't have the energy to argue about power secure > > > > guest anymore. So be it for power secure guest since the involved > > > > engineers disagree with me. Hey I've been wrong in the past ;). > > > > > > > > But the name "sev_active" makes me scared because at least AMD guys who > > > > were doing the sensible thing and setting ACCESS_PLATFORM (unless I'm > > > > wrong? I reemember distinctly that's so) will likely be affected too. > > > > We don't want that. > > > > > > > > So let's find a way to make sure it's just power secure guest for now > > > > pls. > > > > > > > > I also think we should add a dma_api near features under virtio_device > > > > such that these hacks can move off data path. > > > > > > Anyway the current Xen code is conflict with spec which said: > > > > > > "If this feature bit is set to 0, then the device has same access to memory > > > addresses supplied to it as the driver has. 
In particular, the device will > > > always use physical addresses matching addresses used by the driver > > > (typically meaning physical addresses used by the CPU) and not translated > > > further, and can access any address supplied to it by the driver. When > > > clear, this overrides any platform-specific description of whether device > > > access is limited or translated in any way, e.g. whether an IOMMU may be > > > present. " > > > > > > I wonder how much value that the above description can give us. It's kind of > > > odd that the behavior of "when the feature is not negotiated" is described > > > in the spec. > > Hmm what's odd about it? We need to describe the behaviour is all cases. > > > Well, try to limit the behavior of 'legacy' driver is tricky or even > impossible. Xen is an exact example for this. So don't. Xen got grand-fathered in because when that came along we thought it's a one off thing. Was easier to just add that as a special case. But really >99% of people have a hypervisor device with direct guest memory access. All else is esoterica. > > > > > > Personally I think we can remove the above and then we can > > > switch to use DMA API unconditionally in guest driver. It may have single > > > digit regression probably, we can try to overcome it. > > > > > > Thanks > > This has been discussed ad nauseum. virtio is all about compatibility. > > Losing a couple of lines of code isn't worth breaking working setups. > > People that want "just use DMA API no tricks" now have the option. > > Setting a flag in a feature bit map is literally a single line > > of code in the hypervisor. So stop pushing for breaking working > > legacy setups and just fix it in the right place. > > > I may miss soemthing, which kind of legacy setup is broken? Do you mean > using virtio without IOMMU_PLATFORM on platform with IOMMU? We actually > unbreak this setup. > > Thanks Legacy setups by definition are working setups. The rules are pretty simple. 
By default virtio == full guest memory access. If your access is limited or translated in any way, you use a device with ACCESS_PLATFORM. When in doubt use ACCESS_PLATFORM. Xen was there before, and it does not have a flag and it still wants ACCESS_PLATFORM semantics without setting ACCESS_PLATFORM sometimes. So we don't want to break existing setups, and we make an exception in that case. I don't really see any good reason to make more exceptions. Nor IMHO should we trust all platform people to know about virtio and have a special kind of DMA API just for virtio. > > > > > > > By the way could you please respond about virtio-iommu and > > > > why there's no support for ACCESS_PLATFORM on power? > > > > > > > > I have Cc'd you on these discussions. > > > > > > > > > > > > Thanks! > > > > > > > > > > > > > > --- > > > > > > drivers/virtio/virtio_ring.c | 5 ++++- > > > > > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > > > > > index cd7e755484e3..321a27075380 100644 > > > > > > --- a/drivers/virtio/virtio_ring.c > > > > > > +++ b/drivers/virtio/virtio_ring.c > > > > > > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > > > > > > * not work without an even larger kludge. Instead, enable > > > > > > * the DMA API if we're a Xen guest, which at least allows > > > > > > * all of the sensible Xen configurations to work correctly. > > > > > > + * > > > > > > + * Also, if guest memory is encrypted the host can't access > > > > > > + * it directly. In this case, we'll need to use the DMA API.
> > > > > > */ > > > > > > - if (xen_domain()) > > > > > > + if (xen_domain() || sev_active()) > > > > > > return true; > > > > > > > > > > > > return false; > > > > > -- > > > > > Thiago Jung Bauermann > > > > > IBM Linux Technology Center _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-01-30 2:36 ` Michael S. Tsirkin ` (2 preceding siblings ...) (?) @ 2019-01-30 7:44 ` Christoph Hellwig -1 siblings, 0 replies; 198+ messages in thread From: Christoph Hellwig @ 2019-01-30 7:44 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, Thiago Jung Bauermann, David Gibson On Tue, Jan 29, 2019 at 09:36:08PM -0500, Michael S. Tsirkin wrote: > This has been discussed ad nauseum. virtio is all about compatibility. > Losing a couple of lines of code isn't worth breaking working setups. > People that want "just use DMA API no tricks" now have the option. > Setting a flag in a feature bit map is literally a single line > of code in the hypervisor. So stop pushing for breaking working > legacy setups and just fix it in the right place. I agree with the legacy aspect. What I am missing is an extremely strong wording that says you SHOULD always set this flag for new hosts, including an explanation why. ^ permalink raw reply [flat|nested] 198+ messages in thread
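[Editor's note] As a concrete illustration of "setting a flag in a feature bit map is literally a single line of code in the hypervisor": assuming a QEMU recent enough to expose the per-device `iommu_platform` property (the knob that sets this feature bit, known as VIRTIO_F_IOMMU_PLATFORM at the time), the host side is a one-liner. Device and netdev names below are illustrative, not from the thread.

```shell
# Hypothetical invocation: expose ACCESS_PLATFORM on one virtio-net
# device so the guest driver goes through the DMA API for it.
qemu-system-x86_64 \
    -machine q35 \
    -netdev user,id=net0 \
    -device virtio-net-pci,netdev=net0,iommu_platform=on
```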
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-01-30 7:44 ` Christoph Hellwig (?) @ 2019-02-04 18:15 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-02-04 18:15 UTC (permalink / raw) To: Christoph Hellwig Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, David Gibson Christoph Hellwig <hch@lst.de> writes: > On Tue, Jan 29, 2019 at 09:36:08PM -0500, Michael S. Tsirkin wrote: >> This has been discussed ad nauseum. virtio is all about compatibility. >> Losing a couple of lines of code isn't worth breaking working setups. >> People that want "just use DMA API no tricks" now have the option. >> Setting a flag in a feature bit map is literally a single line >> of code in the hypervisor. So stop pushing for breaking working >> legacy setups and just fix it in the right place. > > I agree with the legacy aspect. What I am missing is an extremely > strong wording that says you SHOULD always set this flag for new > hosts, including an explanation why. My understanding of ACCESS_PLATFORM is that it means "this device will behave in all aspects like a regular device attached to this bus". Is that it? Therefore it should be set because it's the sane thing to do? -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-02-04 18:15 ` Thiago Jung Bauermann @ 2019-02-04 21:38 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-02-04 21:38 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Christoph Hellwig, Jason Wang, virtualization, linuxppc-dev, iommu, linux-kernel, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker On Mon, Feb 04, 2019 at 04:15:41PM -0200, Thiago Jung Bauermann wrote: > > Christoph Hellwig <hch@lst.de> writes: > > > On Tue, Jan 29, 2019 at 09:36:08PM -0500, Michael S. Tsirkin wrote: > >> This has been discussed ad nauseum. virtio is all about compatibility. > >> Losing a couple of lines of code isn't worth breaking working setups. > >> People that want "just use DMA API no tricks" now have the option. > >> Setting a flag in a feature bit map is literally a single line > >> of code in the hypervisor. So stop pushing for breaking working > >> legacy setups and just fix it in the right place. > > > > I agree with the legacy aspect. What I am missing is an extremely > > strong wording that says you SHOULD always set this flag for new > > hosts, including an explanation why. > > My understanding of ACCESS_PLATFORM is that it means "this device will > behave in all aspects like a regular device attached to this bus". Not really. Look it up in the spec: VIRTIO_F_ACCESS_PLATFORM(33) This feature indicates that the device can be used on a platform where device access to data in memory is limited and/or translated. E.g. this is the case if the device can be located behind an IOMMU that translates bus addresses from the device into physical addresses in memory, if the device can be limited to only access certain memory addresses or if special commands such as a cache flush can be needed to synchronise data in memory with the device. 
Whether accesses are actually limited or translated is described by platform-specific means. If this feature bit is set to 0, then the device has same access to memory addresses supplied to it as the driver has. In particular, the device will always use physical addresses matching addresses used by the driver (typically meaning physical addresses used by the CPU) and not translated further, and can access any address supplied to it by the driver. When clear, this overrides any platform-specific description of whether device access is limited or translated in any way, e.g. whether an IOMMU may be present. > Is > that it? Therefore it should be set because it's the sane thing to do? It's the sane thing to do unless you want the very specific thing that having it clear means, which is just to have it be another CPU. It was designed to make, when set, as many guests as we can work correctly, and it seems to be successful in doing exactly that. Unfortunately there could be legacy guests that do work correctly but become slow. Whether trying to somehow work around that can paint us into a corner where things again don't work for some people is a question worth discussing. > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-02-04 21:38 ` Michael S. Tsirkin @ 2019-02-05 7:24 ` Christoph Hellwig -1 siblings, 0 replies; 198+ messages in thread From: Christoph Hellwig @ 2019-02-05 7:24 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Thiago Jung Bauermann, Christoph Hellwig, Jason Wang, virtualization, linuxppc-dev, iommu, linux-kernel, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker On Mon, Feb 04, 2019 at 04:38:21PM -0500, Michael S. Tsirkin wrote: > It was designed to make, when set, as many guests as we can work > correctly, and it seems to be successful in doing exactly that. > > Unfortunately there could be legacy guests that do work correctly but > become slow. Whether trying to somehow work around that > can paint us into a corner where things again don't > work for some people is a question worth discussing. The other problem is that some qemu machines just throw passthrough devices and virtio devices on the same virtual PCI(e) bus, and have a common IOMMU setup for the whole bus / root port / domain. I think this is completely bogus, but unfortunately it is out in the field. Given that power is one of these examples I suspect that is what Thiago refers to. But in this case the answer can't be that we pile one hack on top of another, but instead introduce a new qemu machine that separates these clearly, and make that mandatory for the secure guest support. ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-02-05 7:24 ` Christoph Hellwig (?) @ 2019-02-05 16:13 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-02-05 16:13 UTC (permalink / raw) To: Christoph Hellwig Cc: Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Thiago Jung Bauermann, David Gibson On Tue, Feb 05, 2019 at 08:24:07AM +0100, Christoph Hellwig wrote: > On Mon, Feb 04, 2019 at 04:38:21PM -0500, Michael S. Tsirkin wrote: > > It was designed to make, when set, as many guests as we can work > > correctly, and it seems to be successful in doing exactly that. > > > > Unfortunately there could be legacy guests that do work correctly but > > become slow. Whether trying to somehow work around that > > can paint us into a corner where things again don't > > work for some people is a question worth discussing. > > The other problem is that some qemu machines just throw passthrough > devices and virtio devices on the same virtual PCI(e) bus, and have a > common IOMMU setup for the whole bus / root port / domain. I think > this is completely bogus, but unfortunately it is out in the field. > > Given that power is one of these examples I suspect that is what > Thiago referes to. But in this case the answer can't be that we > pile on hack ontop of another, but instead introduce a new qemu > machine that separates these clearly, and make that mandatory for > the secure guest support. That could be one approach, assuming one exists that guests already support. -- MST ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-01-30 7:44 ` Christoph Hellwig ` (2 preceding siblings ...) (?) @ 2019-03-26 16:53 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-03-26 16:53 UTC (permalink / raw) To: Christoph Hellwig Cc: Lorenzo.Pieralisi, tnowicki, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, Will.Deacon, virtualization, Paul Mackerras, eric.auger, iommu, Marc.Zyngier, Robin.Murphy, linuxppc-dev, joro, David Gibson On Wed, Jan 30, 2019 at 08:44:27AM +0100, Christoph Hellwig wrote: > On Tue, Jan 29, 2019 at 09:36:08PM -0500, Michael S. Tsirkin wrote: > > This has been discussed ad nauseum. virtio is all about compatibility. > > Losing a couple of lines of code isn't worth breaking working setups. > > People that want "just use DMA API no tricks" now have the option. > > Setting a flag in a feature bit map is literally a single line > > of code in the hypervisor. So stop pushing for breaking working > > legacy setups and just fix it in the right place. > > I agree with the legacy aspect. What I am missing is an extremely > strong wording that says you SHOULD always set this flag for new > hosts, including an explanation why. So as far as power is concerned, IIUC the issue they are struggling with is that some platforms do not support pass-through mode in the emulated IOMMU. Disabling ACCESS_PLATFORM is so far a way around that, unfortunately just for virtio devices. I would like virtio-iommu to be able to address that need as well. -- MST ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-01-29 19:02 ` Michael S. Tsirkin (?) (?) @ 2019-01-30 2:24 ` Jason Wang -1 siblings, 0 replies; 198+ messages in thread From: Jason Wang @ 2019-01-30 2:24 UTC (permalink / raw) To: Michael S. Tsirkin, Thiago Jung Bauermann Cc: Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On 2019/1/30 3:02 AM, Michael S. Tsirkin wrote: > On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote: >> Fixing address of powerpc mailing list. >> >> Thiago Jung Bauermann <bauerman@linux.ibm.com> writes: >> >>> Hello, >>> >>> With Christoph's rework of the DMA API that recently landed, the patch >>> below is the only change needed in virtio to make it work in a POWER >>> secure guest under the ultravisor. >>> >>> The other change we need (making sure the device's dma_map_ops is NULL >>> so that the dma-direct/swiotlb code is used) can be made in >>> powerpc-specific code. >>> >>> Of course, I also have patches (soon to be posted as RFC) which hook up >>> <linux/mem_encrypt.h> to the powerpc secure guest support code. >>> >>> What do you think? >>> >>> From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 >>> From: Thiago Jung Bauermann <bauerman@linux.ibm.com> >>> Date: Thu, 24 Jan 2019 22:08:02 -0200 >>> Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted >>> >>> The host can't access the guest memory when it's encrypted, so using >>> regular memory pages for the ring isn't an option. Go through the DMA API. >>> >>> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > Well I think this will come back to bite us (witness xen which is now > reworking precisely this path - but at least they aren't to blame, xen > came before ACCESS_PLATFORM).
> > I also still think the right thing would have been to set > ACCESS_PLATFORM for all systems where device can't access all memory. > > But I also think I don't have the energy to argue about power secure > guest anymore. So be it for power secure guest since the involved > engineers disagree with me. Hey I've been wrong in the past ;). > > But the name "sev_active" makes me scared because at least AMD guys who > were doing the sensible thing and setting ACCESS_PLATFORM (unless I'm > wrong? I reemember distinctly that's so) will likely be affected too. > We don't want that. > > So let's find a way to make sure it's just power secure guest for now > pls. > > I also think we should add a dma_api near features under virtio_device > such that these hacks can move off data path. Anyway, the current Xen code conflicts with the spec, which says: "If this feature bit is set to 0, then the device has same access to memory addresses supplied to it as the driver has. In particular, the device will always use physical addresses matching addresses used by the driver (typically meaning physical addresses used by the CPU) and not translated further, and can access any address supplied to it by the driver. When clear, this overrides any platform-specific description of whether device access is limited or translated in any way, e.g. whether an IOMMU may be present. " I wonder how much value the above description gives us. It's kind of odd that the behavior of "when the feature is not negotiated" is described in the spec. Personally I think we can remove the above and switch to using the DMA API unconditionally in the guest driver. It will probably cause a single-digit performance regression, but we can try to overcome it. Thanks > > By the way could you please respond about virtio-iommu and > why there's no support for ACCESS_PLATFORM on power? > > I have Cc'd you on these discussions. > > > Thanks! 
> > >>> --- >>> drivers/virtio/virtio_ring.c | 5 ++++- >>> 1 file changed, 4 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c >>> index cd7e755484e3..321a27075380 100644 >>> --- a/drivers/virtio/virtio_ring.c >>> +++ b/drivers/virtio/virtio_ring.c >>> @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev) >>> * not work without an even larger kludge. Instead, enable >>> * the DMA API if we're a Xen guest, which at least allows >>> * all of the sensible Xen configurations to work correctly. >>> + * >>> + * Also, if guest memory is encrypted the host can't access >>> + * it directly. In this case, we'll need to use the DMA API. >>> */ >>> - if (xen_domain()) >>> + if (xen_domain() || sev_active()) >>> return true; >>> >>> return false; >> >> -- >> Thiago Jung Bauermann >> IBM Linux Technology Center _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-01-29 19:02 ` Michael S. Tsirkin ` (2 preceding siblings ...) (?) @ 2019-02-04 18:14 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-02-04 18:14 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Hello Michael, Michael S. Tsirkin <mst@redhat.com> writes: > On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote: >> >> Fixing address of powerpc mailing list. >> >> Thiago Jung Bauermann <bauerman@linux.ibm.com> writes: >> >> > Hello, >> > >> > With Christoph's rework of the DMA API that recently landed, the patch >> > below is the only change needed in virtio to make it work in a POWER >> > secure guest under the ultravisor. >> > >> > The other change we need (making sure the device's dma_map_ops is NULL >> > so that the dma-direct/swiotlb code is used) can be made in >> > powerpc-specific code. >> > >> > Of course, I also have patches (soon to be posted as RFC) which hook up >> > <linux/mem_encrypt.h> to the powerpc secure guest support code. >> > >> > What do you think? >> > >> > From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 >> > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> >> > Date: Thu, 24 Jan 2019 22:08:02 -0200 >> > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted >> > >> > The host can't access the guest memory when it's encrypted, so using >> > regular memory pages for the ring isn't an option. Go through the DMA API. >> > >> > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > Well I think this will come back to bite us (witness xen which is now > reworking precisely this path - but at least they aren't to blame, xen > came before ACCESS_PLATFORM). 
> > I also still think the right thing would have been to set > ACCESS_PLATFORM for all systems where device can't access all memory. I understand. The problem with that approach for us is that because we don't know which guests will become secure guests and which will remain regular guests, QEMU would need to offer ACCESS_PLATFORM to all guests. And the problem with that is that for QEMU on POWER, having ACCESS_PLATFORM turned off means that it can bypass the IOMMU for the device (which makes sense considering that the name of the flag was IOMMU_PLATFORM). And we need that for regular guests to avoid performance degradation. So while ACCESS_PLATFORM solves our problems for secure guests, we can't turn it on by default because we can't afford to affect legacy systems. Doing so would penalize existing systems that can access all memory. They would all have to unnecessarily go through address translations, and take a performance hit. The semantics of ACCESS_PLATFORM assume that the hypervisor/QEMU knows in advance - right when the VM is instantiated - that it will not have access to all guest memory. Unfortunately that assumption is subtly broken on our secure platform. The hypervisor/QEMU realizes that the platform is going secure only *after the VM is instantiated*. It's the kernel running in the VM that determines that it wants to switch the platform to secure mode. Another way of looking at this issue, which also explains our reluctance, is that the only difference between a secure guest and a regular guest (at least regarding virtio) is that the former uses swiotlb while the latter doesn't. And from the device's point of view they're indistinguishable. It can't tell one guest that is using swiotlb from one that isn't. And that implies that secure guest vs regular guest isn't a virtio interface issue, it's "guest internal affairs". So there's no reason to reflect that in the feature flags. 
That said, we still would like to arrive at a proper design for this rather than add yet another hack if we can avoid it. So here's another proposal: considering that the dma-direct code (in kernel/dma/direct.c) automatically uses swiotlb when necessary (thanks to Christoph's recent DMA work), would it be ok to replace virtio's own direct-memory code that is used in the !ACCESS_PLATFORM case with the dma-direct code? That way we'll get swiotlb even with !ACCESS_PLATFORM, and virtio will get a code cleanup (replace open-coded stuff with calls to existing infrastructure). > But I also think I don't have the energy to argue about power secure > guest anymore. So be it for power secure guest since the involved > engineers disagree with me. Hey I've been wrong in the past ;). Yeah, it's been a difficult discussion. Thanks for still engaging! I honestly thought that this patch was a good solution (if the guest has encrypted memory it means that the DMA API needs to be used), but I can see where you are coming from. As I said, we'd like to arrive at a good solution if possible. > But the name "sev_active" makes me scared because at least AMD guys who > were doing the sensible thing and setting ACCESS_PLATFORM My understanding is that the AMD guest platform knows in advance that its guests will run in secure mode, and hence sets the flag at the time of VM instantiation. Unfortunately we don't have that luxury on our platforms. > (unless I'm > wrong? I reemember distinctly that's so) will likely be affected too. > We don't want that. > > So let's find a way to make sure it's just power secure guest for now > pls. Yes, my understanding is that they turn ACCESS_PLATFORM on. And because of that, IIUC this patch wouldn't affect them because in their platform vring_use_dma_api() returns true earlier in the "if !virtio_has_iommu_quirk(vdev)" condition. > I also think we should add a dma_api near features under virtio_device > such that these hacks can move off data path. 
Sorry, I don't understand this. > By the way could you please respond about virtio-iommu and > why there's no support for ACCESS_PLATFORM on power? There is support for ACCESS_PLATFORM on POWER. We don't enable it because it causes a performance hit. > I have Cc'd you on these discussions. I'm having a look at the spec and the patches, but to be honest I'm not the best powerpc guy for this. I'll see if I can get others to have a look. > Thanks! Thanks as well! -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-01-29 19:02 ` Michael S. Tsirkin @ 2019-02-04 18:14 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-02-04 18:14 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker Hello Michael, Michael S. Tsirkin <mst@redhat.com> writes: > On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote: >> >> Fixing address of powerpc mailing list. >> >> Thiago Jung Bauermann <bauerman@linux.ibm.com> writes: >> >> > Hello, >> > >> > With Christoph's rework of the DMA API that recently landed, the patch >> > below is the only change needed in virtio to make it work in a POWER >> > secure guest under the ultravisor. >> > >> > The other change we need (making sure the device's dma_map_ops is NULL >> > so that the dma-direct/swiotlb code is used) can be made in >> > powerpc-specific code. >> > >> > Of course, I also have patches (soon to be posted as RFC) which hook up >> > <linux/mem_encrypt.h> to the powerpc secure guest support code. >> > >> > What do you think? >> > >> > From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 >> > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> >> > Date: Thu, 24 Jan 2019 22:08:02 -0200 >> > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted >> > >> > The host can't access the guest memory when it's encrypted, so using >> > regular memory pages for the ring isn't an option. Go through the DMA API. >> > >> > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > Well I think this will come back to bite us (witness xen which is now > reworking precisely this path - but at least they aren't to blame, xen > came before ACCESS_PLATFORM). 
> > I also still think the right thing would have been to set > ACCESS_PLATFORM for all systems where device can't access all memory. I understand. The problem with that approach for us is that because we don't know which guests will become secure guests and which will remain regular guests, QEMU would need to offer ACCESS_PLATFORM to all guests. And the problem with that is that for QEMU on POWER, having ACCESS_PLATFORM turned off means that it can bypass the IOMMU for the device (which makes sense considering that the name of the flag was IOMMU_PLATFORM). And we need that for regular guests to avoid performance degradation. So while ACCESS_PLATFORM solves our problems for secure guests, we can't turn it on by default because we can't affect legacy systems. Doing so would penalize existing systems that can access all memory. They would all have to unnecessarily go through address translations, and take a performance hit. The semantics of ACCESS_PLATFORM assume that the hypervisor/QEMU knows in advance - right when the VM is instantiated - that it will not have access to all guest memory. Unfortunately that assumption is subtly broken on our secure-platform. The hypervisor/QEMU realizes that the platform is going secure only *after the VM is instantiated*. It's the kernel running in the VM that determines that it wants to switch the platform to secure-mode. Another way of looking at this issue which also explains our reluctance is that the only difference between a secure guest and a regular guest (at least regarding virtio) is that the former uses swiotlb while the latter doens't. And from the device's point of view they're indistinguishable. It can't tell one guest that is using swiotlb from one that isn't. And that implies that secure guest vs regular guest isn't a virtio interface issue, it's "guest internal affairs". So there's no reason to reflect that in the feature flags. 
That said, we still would like to arrive at a proper design for this
rather than add yet another hack if we can avoid it. So here's another
proposal: considering that the dma-direct code (in kernel/dma/direct.c)
automatically uses swiotlb when necessary (thanks to Christoph's recent
DMA work), would it be ok to replace virtio's own direct-memory code
that is used in the !ACCESS_PLATFORM case with the dma-direct code? That
way we'll get swiotlb even with !ACCESS_PLATFORM, and virtio will get a
code cleanup (replace open-coded stuff with calls to existing
infrastructure).

> But I also think I don't have the energy to argue about power secure
> guest anymore. So be it for power secure guest since the involved
> engineers disagree with me. Hey I've been wrong in the past ;).

Yeah, it's been a difficult discussion. Thanks for still engaging!
I honestly thought that this patch was a good solution (if the guest has
encrypted memory it means that the DMA API needs to be used), but I can
see where you are coming from. As I said, we'd like to arrive at a good
solution if possible.

> But the name "sev_active" makes me scared because at least AMD guys who
> were doing the sensible thing and setting ACCESS_PLATFORM

My understanding is that the AMD guest platform knows in advance that
its guest will run in secure mode and hence sets the flag at the time of
VM instantiation. Unfortunately we don't have that luxury on our
platforms.

> (unless I'm wrong? I remember distinctly that's so) will likely be
> affected too. We don't want that.
>
> So let's find a way to make sure it's just power secure guest for now
> pls.

Yes, my understanding is that they turn ACCESS_PLATFORM on. And because
of that, IIUC this patch wouldn't affect them, because in their platform
vring_use_dma_api() returns true earlier, in the
"if !virtio_has_iommu_quirk(vdev)" condition.

> I also think we should add a dma_api near features under virtio_device
> such that these hacks can move off data path.
Sorry, I don't understand this. > By the way could you please respond about virtio-iommu and > why there's no support for ACCESS_PLATFORM on power? There is support for ACCESS_PLATFORM on POWER. We don't enable it because it causes a performance hit. > I have Cc'd you on these discussions. I'm having a look at the spec and the patches, but to be honest I'm not the best powerpc guy for this. I'll see if I can get others to have a look. > Thanks! Thanks as well! -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-02-04 18:14 ` Thiago Jung Bauermann @ 2019-02-04 20:23 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-02-04 20:23 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker On Mon, Feb 04, 2019 at 04:14:20PM -0200, Thiago Jung Bauermann wrote: > > Hello Michael, > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote: > >> > >> Fixing address of powerpc mailing list. > >> > >> Thiago Jung Bauermann <bauerman@linux.ibm.com> writes: > >> > >> > Hello, > >> > > >> > With Christoph's rework of the DMA API that recently landed, the patch > >> > below is the only change needed in virtio to make it work in a POWER > >> > secure guest under the ultravisor. > >> > > >> > The other change we need (making sure the device's dma_map_ops is NULL > >> > so that the dma-direct/swiotlb code is used) can be made in > >> > powerpc-specific code. > >> > > >> > Of course, I also have patches (soon to be posted as RFC) which hook up > >> > <linux/mem_encrypt.h> to the powerpc secure guest support code. > >> > > >> > What do you think? > >> > > >> > From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 > >> > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> > >> > Date: Thu, 24 Jan 2019 22:08:02 -0200 > >> > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted > >> > > >> > The host can't access the guest memory when it's encrypted, so using > >> > regular memory pages for the ring isn't an option. Go through the DMA API. 
> >> > > >> > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > > > Well I think this will come back to bite us (witness xen which is now > > reworking precisely this path - but at least they aren't to blame, xen > > came before ACCESS_PLATFORM). > > > > I also still think the right thing would have been to set > > ACCESS_PLATFORM for all systems where device can't access all memory. > > I understand. The problem with that approach for us is that because we > don't know which guests will become secure guests and which will remain > regular guests, QEMU would need to offer ACCESS_PLATFORM to all guests. > > And the problem with that is that for QEMU on POWER, having > ACCESS_PLATFORM turned off means that it can bypass the IOMMU for the > device (which makes sense considering that the name of the flag was > IOMMU_PLATFORM). And we need that for regular guests to avoid > performance degradation. You don't really, ACCESS_PLATFORM means just that, platform decides. > So while ACCESS_PLATFORM solves our problems for secure guests, we can't > turn it on by default because we can't affect legacy systems. Doing so > would penalize existing systems that can access all memory. They would > all have to unnecessarily go through address translations, and take a > performance hit. So as step one, you just give hypervisor admin an option to run legacy systems faster by blocking secure mode. I don't see why that is so terrible. But as step two, assuming you use above step one to make legacy guests go fast - maybe there is a point in detecting such a hypervisor and doing something smarter with it. By all means let's have a discussion around this but that is no longer "to make it work" as the commit log says it's more a performance optimization. > The semantics of ACCESS_PLATFORM assume that the hypervisor/QEMU knows > in advance - right when the VM is instantiated - that it will not have > access to all guest memory. Not quite. 
It just means that the hypervisor can live with not having access to all
memory. If the platform wants to give it access to all memory, that is
quite all right.

> Unfortunately that assumption is subtly
> broken on our secure-platform. The hypervisor/QEMU realizes that the
> platform is going secure only *after the VM is instantiated*. It's the
> kernel running in the VM that determines that it wants to switch the
> platform to secure-mode.

ACCESS_PLATFORM is there so guests can detect legacy hypervisors which
always assumed the device is just another CPU.

> Another way of looking at this issue which also explains our reluctance
> is that the only difference between a secure guest and a regular guest
> (at least regarding virtio) is that the former uses swiotlb while the
> latter doesn't.

But swiotlb is just one implementation. It's a guest internal thing. The
issue is that memory isn't host accessible. Yes, Linux does not use that
info much right now, but it already begins to seep out of the
abstraction. For example, as you are doing data copies you should maybe
calculate the packet checksum just as well. Not something the DMA API
will let you know right now, but that's because any bounce buffer users
so far weren't terribly fast anyway - it was all for 16-bit hardware and
such.

> And from the device's point of view they're
> indistinguishable. It can't tell one guest that is using swiotlb from
> one that isn't. And that implies that secure guest vs regular guest
> isn't a virtio interface issue, it's "guest internal affairs". So
> there's no reason to reflect that in the feature flags.

So don't. The way not to reflect that in the feature flags is to set
ACCESS_PLATFORM. Then you say *I don't care, let the platform decide*.

Without ACCESS_PLATFORM virtio has a very specific opinion about the
security of the device, and that opinion is that the device is part of
the guest supervisor security domain.
> That said, we still would like to arrive at a proper design for this
> rather than add yet another hack if we can avoid it. So here's another
> proposal: considering that the dma-direct code (in kernel/dma/direct.c)
> automatically uses swiotlb when necessary (thanks to Christoph's recent
> DMA work), would it be ok to replace virtio's own direct-memory code
> that is used in the !ACCESS_PLATFORM case with the dma-direct code? That
> way we'll get swiotlb even with !ACCESS_PLATFORM, and virtio will get a
> code cleanup (replace open-coded stuff with calls to existing
> infrastructure).

Let's say I have some doubts that there's an API that exactly matches
what virtio, with its bag of legacy compatibility, needs.

But taking a step back, you seem to keep looking at it at the code
level. And I think that's not necessarily right. If ACCESS_PLATFORM
isn't what you are looking for, then maybe you need another feature bit.
But you/we need to figure out what it means first.

> > But I also think I don't have the energy to argue about power secure
> > guest anymore. So be it for power secure guest since the involved
> > engineers disagree with me. Hey I've been wrong in the past ;).
>
> Yeah, it's been a difficult discussion. Thanks for still engaging!
> I honestly thought that this patch was a good solution (if the guest has
> encrypted memory it means that the DMA API needs to be used), but I can
> see where you are coming from. As I said, we'd like to arrive at a good
> solution if possible.
>
> > But the name "sev_active" makes me scared because at least AMD guys who
> > were doing the sensible thing and setting ACCESS_PLATFORM
>
> My understanding is, AMD guest-platform knows in advance that their
> guest will run in secure mode and hence sets the flag at the time of VM
> instantiation. Unfortunately we don't have that luxury on our platforms.

Well you do have that luxury.
It looks like there are existing guests that already acknowledge
ACCESS_PLATFORM and you are not happy with how slow that path is. So you
are trying to optimize for them by clearing ACCESS_PLATFORM, and then
you have lost the ability to invoke the DMA API.

For example, if there was another flag just like ACCESS_PLATFORM, just
not yet used by anyone, you would be all fine using that, right? Is
there any justification for doing that beyond someone putting out slow
code in the past?

> > (unless I'm wrong? I remember distinctly that's so) will likely be
> > affected too. We don't want that.
> >
> > So let's find a way to make sure it's just power secure guest for now
> > pls.
>
> Yes, my understanding is that they turn ACCESS_PLATFORM on. And because
> of that, IIUC this patch wouldn't affect them because in their platform
> vring_use_dma_api() returns true earlier in the
> "if !virtio_has_iommu_quirk(vdev)" condition.

Let's just say I don't think we should assume how the specific
hypervisor behaves. It seems to follow the spec, and so should Linux.

> > I also think we should add a dma_api near features under virtio_device
> > such that these hacks can move off data path.
>
> Sorry, I don't understand this.

I mean we can set a flag within struct virtio_device instead of poking
at features, checking xen, etc.

> > By the way could you please respond about virtio-iommu and
> > why there's no support for ACCESS_PLATFORM on power?
>
> There is support for ACCESS_PLATFORM on POWER. We don't enable it
> because it causes a performance hit.

For legacy guests.

> > I have Cc'd you on these discussions.
>
> I'm having a look at the spec and the patches, but to be honest I'm not
> the best powerpc guy for this. I'll see if I can get others to have a
> look.
>
> > Thanks!
>
> Thanks as well!
>
> --
> Thiago Jung Bauermann
> IBM Linux Technology Center

^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-02-04 20:23 ` Michael S. Tsirkin @ 2019-03-20 16:13 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread
From: Thiago Jung Bauermann @ 2019-03-20 16:13 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang,
    Christoph Hellwig, David Gibson, Alexey Kardashevskiy,
    Paul Mackerras, Benjamin Herrenschmidt, Ram Pai,
    Jean-Philippe Brucker, Michael Roth, Mike Anderson

Hello Michael,

Sorry for the delay in responding. We had some internal discussions on
this.

Michael S. Tsirkin <mst@redhat.com> writes:

> On Mon, Feb 04, 2019 at 04:14:20PM -0200, Thiago Jung Bauermann wrote:
>>
>> Hello Michael,
>>
>> Michael S. Tsirkin <mst@redhat.com> writes:
>>
>> > On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote:
>> So while ACCESS_PLATFORM solves our problems for secure guests, we can't
>> turn it on by default because we can't affect legacy systems. Doing so
>> would penalize existing systems that can access all memory. They would
>> all have to unnecessarily go through address translations, and take a
>> performance hit.
>
> So as step one, you just give hypervisor admin an option to run legacy
> systems faster by blocking secure mode. I don't see why that is
> so terrible.

There are a few reasons why:

1. It's bad user experience to require people to fiddle with knobs for
   obscure reasons if it's possible to design things such that they
   Just Work.

2. "User" in this case can be a human directly calling QEMU, but could
   also be libvirt or one of its users, or some other framework. This
   means having to adjust and/or educate an open-ended number of people
   and software. It's best avoided if possible.

3.
   The hypervisor admin and the admin of the guest system don't
   necessarily belong to the same organization (e.g., cloud provider
   and cloud customer), so there may be some friction when they need to
   coordinate to get this right.

4. A feature of our design is that the guest may or may not decide to
   "go secure" at boot time, so it's best not to depend on flags that
   may or may not have been set at the time QEMU was started.

>> The semantics of ACCESS_PLATFORM assume that the hypervisor/QEMU knows
>> in advance - right when the VM is instantiated - that it will not have
>> access to all guest memory.
>
> Not quite. It just means that the hypervisor can live with not having
> access to all memory. If the platform wants to give it access
> to all memory, that is quite all right.

Except that on powerpc it also means "there's an IOMMU present" and
there's no way to say "bypass IOMMU translation". :-/

>> Another way of looking at this issue which also explains our reluctance
>> is that the only difference between a secure guest and a regular guest
>> (at least regarding virtio) is that the former uses swiotlb while the
>> latter doesn't.
>
> But swiotlb is just one implementation. It's a guest internal thing. The
> issue is that memory isn't host accessible.

From what I understand of the ACCESS_PLATFORM definition, the host will
only ever try to access memory addresses that are supplied to it by the
guest, so all of the secure guest memory that the host cares about is
accessible:

    If this feature bit is set to 0, then the device has same access to
    memory addresses supplied to it as the driver has. In particular,
    the device will always use physical addresses matching addresses
    used by the driver (typically meaning physical addresses used by
    the CPU) and not translated further, and can access any address
    supplied to it by the driver. When clear, this overrides any
    platform-specific description of whether device access is limited
    or translated in any way, e.g.
whether an IOMMU may be present.

All of the above is true for POWER guests, whether they are secure
guests or not. Or are you saying that a virtio device may want to access
memory addresses that weren't supplied to it by the driver?

>> And from the device's point of view they're
>> indistinguishable. It can't tell one guest that is using swiotlb from
>> one that isn't. And that implies that secure guest vs regular guest
>> isn't a virtio interface issue, it's "guest internal affairs". So
>> there's no reason to reflect that in the feature flags.
>
> So don't. The way not to reflect that in the feature flags is
> to set ACCESS_PLATFORM. Then you say *I don't care, let the platform
> decide*.
>
> Without ACCESS_PLATFORM virtio has a very specific opinion about the
> security of the device, and that opinion is that the device is part of
> the guest supervisor security domain.

Sorry for being a bit dense, but I'm not sure what "the device is part
of the guest supervisor security domain" means. In powerpc-speak,
"supervisor" is the operating system, so perhaps that explains my
confusion.

Are you saying that without ACCESS_PLATFORM, the guest considers the
host to be part of the guest operating system's security domain? If so,
does that have any other implication besides "the host can access any
address supplied to it by the driver"? If that is the case, perhaps the
definition of ACCESS_PLATFORM needs to be amended to include that
information, because it's not part of the current definition.

>> That said, we still would like to arrive at a proper design for this
>> rather than add yet another hack if we can avoid it. So here's another
>> proposal: considering that the dma-direct code (in kernel/dma/direct.c)
>> automatically uses swiotlb when necessary (thanks to Christoph's recent
>> DMA work), would it be ok to replace virtio's own direct-memory code
>> that is used in the !ACCESS_PLATFORM case with the dma-direct code?
That >> way we'll get swiotlb even with !ACCESS_PLATFORM, and virtio will get a >> code cleanup (replace open-coded stuff with calls to existing >> infrastructure). > > Let's say I have some doubts that there's an API that > matches what virtio with its bag of legacy compatibility exactly. Ok. >> > But the name "sev_active" makes me scared because at least AMD guys who >> > were doing the sensible thing and setting ACCESS_PLATFORM >> >> My understanding is, AMD guest-platform knows in advance that their >> guest will run in secure mode and hence sets the flag at the time of VM >> instantiation. Unfortunately we don't have that luxury on our platforms. > > Well you do have that luxury. It looks like that there are existing > guests that already acknowledge ACCESS_PLATFORM and you are not happy > with how that path is slow. So you are trying to optimize for > them by clearing ACCESS_PLATFORM and then you have lost ability > to invoke DMA API. > > For example if there was another flag just like ACCESS_PLATFORM > just not yet used by anyone, you would be all fine using that right? Yes, a new flag sounds like a great idea. What about the definition below? VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the exception that the IOMMU is explicitly defined to be off or bypassed when accessing memory addresses supplied to the device by the driver. This flag should be set by the guest if offered, but, to allow for backward compatibility, device implementations allow for it to be left unset by the guest. It is an error to set both this flag and VIRTIO_F_ACCESS_PLATFORM. > Is there any justification to doing that beyond someone putting > out slow code in the past? The definition of the ACCESS_PLATFORM flag is generic and captures the notion of memory access restrictions for the device. 
Unfortunately, on powerpc pSeries guests it also implies that the IOMMU is turned on even though pSeries guests have never used IOMMU for virtio devices. Combined with the lack of a way to turn off or bypass the IOMMU for virtio devices, this means that existing guests in the field are compelled to use the IOMMU even though that never was the case before, and said guests have no mechanism to turn it off. Therefore, we need a new flag to signal the memory access restriction present in secure guests which doesn't also imply turning on the IOMMU. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
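[Editorial aside: the two mechanisms discussed in this message can be sketched in a few lines of userspace C. `vring_use_dma_api()` mirrors the decision added by the RFC patch at the top of the thread, and `toy_dma_map()` mimics the bounce buffering that dma-direct/swiotlb performs when the host can't touch guest memory. All names, parameters, and the bounce pool are invented stand-ins for the kernel's predicates and swiotlb machinery, not the real kernel API.]

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the kernel predicates: a negotiated
 * VIRTIO_F_ACCESS_PLATFORM feature, xen_domain(), and sev_active(). */
bool vring_use_dma_api(bool access_platform, bool xen, bool mem_encrypted)
{
        /* Platform says device access is limited or translated. */
        if (access_platform)
                return true;

        /* Xen guests need the DMA API for grant mappings; encrypted
         * guests need it because the host can't read their memory
         * directly (the sev_active() check added by the RFC). */
        if (xen || mem_encrypted)
                return true;

        return false;
}

#define BOUNCE_POOL_SIZE 4096

static uint8_t bounce_pool[BOUNCE_POOL_SIZE]; /* host-accessible pool */
static size_t bounce_used;

/* Returns the address the device should use: the buffer itself for a
 * direct mapping, or a copy in the shared bounce pool when the device
 * (host) can't access the original. NULL if the pool is exhausted. */
void *toy_dma_map(void *buf, size_t len, bool mem_encrypted)
{
        if (!mem_encrypted)
                return buf;

        if (bounce_used + len > BOUNCE_POOL_SIZE)
                return NULL;

        void *slot = &bounce_pool[bounce_used];
        bounce_used += len;
        memcpy(slot, buf, len); /* copy out of the encrypted region */
        return slot;
}
```

The model shows the point being argued: with ACCESS_PLATFORM unset but memory encrypted, the guest still needs the bounce path, which is what the `sev_active()` check provides.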
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-03-20 16:13 ` Thiago Jung Bauermann @ 2019-03-20 21:17 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-03-20 21:17 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: > >> Another way of looking at this issue which also explains our reluctance > >> is that the only difference between a secure guest and a regular guest > >> (at least regarding virtio) is that the former uses swiotlb while the > >> latter doesn't. > > > > But swiotlb is just one implementation. It's a guest internal thing. The > > issue is that memory isn't host accessible. > > From what I understand of the ACCESS_PLATFORM definition, the host will > only ever try to access memory addresses that are supplied to it by the > guest, so all of the secure guest memory that the host cares about is > accessible: > > If this feature bit is set to 0, then the device has same access to > memory addresses supplied to it as the driver has. In particular, > the device will always use physical addresses matching addresses > used by the driver (typically meaning physical addresses used by the > CPU) and not translated further, and can access any address supplied > to it by the driver. When clear, this overrides any > platform-specific description of whether device access is limited or > translated in any way, e.g. whether an IOMMU may be present. > > All of the above is true for POWER guests, whether they are secure > guests or not. > > Or are you saying that a virtio device may want to access memory > addresses that weren't supplied to it by the driver? Your logic would apply to IOMMUs as well. 
For your mode, there are specific encrypted memory regions that the driver has access to but the device does not. That seems to violate the constraint. > >> And from the device's point of view they're > >> indistinguishable. It can't tell one guest that is using swiotlb from > >> one that isn't. And that implies that secure guest vs regular guest > >> isn't a virtio interface issue, it's "guest internal affairs". So > >> there's no reason to reflect that in the feature flags. > > > > So don't. The way not to reflect that in the feature flags is > > to set ACCESS_PLATFORM. Then you say *I don't care let platform device*. > > > > > > Without ACCESS_PLATFORM > > virtio has a very specific opinion about the security of the > > device, and that opinion is that device is part of the guest > > supervisor security domain. > > Sorry for being a bit dense, but not sure what "the device is part of > the guest supervisor security domain" means. In powerpc-speak, > "supervisor" is the operating system so perhaps that explains my > confusion. Are you saying that without ACCESS_PLATFORM, the guest > considers the host to be part of the guest operating system's security > domain? I think so. The spec says "device has same access as driver". > If so, does that have any other implication besides "the host > can access any address supplied to it by the driver"? If that is the > case, perhaps the definition of ACCESS_PLATFORM needs to be amended to > include that information because it's not part of the current > definition. > > >> That said, we still would like to arrive at a proper design for this > >> rather than add yet another hack if we can avoid it. So here's another > >> proposal: considering that the dma-direct code (in kernel/dma/direct.c) > >> automatically uses swiotlb when necessary (thanks to Christoph's recent > >> DMA work), would it be ok to replace virtio's own direct-memory code > >> that is used in the !ACCESS_PLATFORM case with the dma-direct code? 
That > >> way we'll get swiotlb even with !ACCESS_PLATFORM, and virtio will get a > >> code cleanup (replace open-coded stuff with calls to existing > >> infrastructure). > > > > Let's say I have some doubts that there's an API that > > matches what virtio with its bag of legacy compatibility exactly. > > Ok. > > >> > But the name "sev_active" makes me scared because at least AMD guys who > >> > were doing the sensible thing and setting ACCESS_PLATFORM > >> > >> My understanding is, AMD guest-platform knows in advance that their > >> guest will run in secure mode and hence sets the flag at the time of VM > >> instantiation. Unfortunately we dont have that luxury on our platforms. > > > > Well you do have that luxury. It looks like that there are existing > > guests that already acknowledge ACCESS_PLATFORM and you are not happy > > with how that path is slow. So you are trying to optimize for > > them by clearing ACCESS_PLATFORM and then you have lost ability > > to invoke DMA API. > > > > For example if there was another flag just like ACCESS_PLATFORM > > just not yet used by anyone, you would be all fine using that right? > > Yes, a new flag sounds like a great idea. What about the definition > below? > > VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as > VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the > exception that the IOMMU is explicitly defined to be off or bypassed > when accessing memory addresses supplied to the device by the > driver. This flag should be set by the guest if offered, but to > allow for backward-compatibility device implementations allow for it > to be left unset by the guest. It is an error to set both this flag > and VIRTIO_F_ACCESS_PLATFORM. It looks kind of narrow but it's an option. I wonder how we'll define what's an iommu though. Another idea is maybe something like virtio-iommu? > > Is there any justification to doing that beyond someone putting > > out slow code in the past? 
> > The definition of the ACCESS_PLATFORM flag is generic and captures the > notion of memory access restrictions for the device. Unfortunately, on > powerpc pSeries guests it also implies that the IOMMU is turned on IIUC that's really because on pSeries IOMMU is *always* turned on. Platform has no way to say what you want it to say, which is: bypass the IOMMU for the specific device. > even > though pSeries guests have never used IOMMU for virtio devices. Combined > with the lack of a way to turn off or bypass the IOMMU for virtio > devices, this means that existing guests in the field are compelled to > use the IOMMU even though that never was the case before, and said > guests having no mechanism to turn it off. > > Therefore, we need a new flag to signal the memory access restriction > present in secure guests which doesn't also imply turning on the IOMMU. > > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
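[Editorial aside: the negotiation rule in the flag definition under discussion ("it is an error to set both...") can be sketched as a feature-bit check. VIRTIO_F_ACCESS_PLATFORM is bit 33 in the virtio spec; the NO_IOMMU bit exists only as a proposal in this thread, so the bit position and function names below are invented for illustration.]

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 33 is the spec's VIRTIO_F_ACCESS_PLATFORM; bit 34 is a made-up
 * position for the flag proposed in this thread. */
#define F_ACCESS_PLATFORM          (UINT64_C(1) << 33)
#define F_ACCESS_PLATFORM_NO_IOMMU (UINT64_C(1) << 34) /* hypothetical */

/* "It is an error to set both this flag and VIRTIO_F_ACCESS_PLATFORM." */
bool features_valid(uint64_t acked)
{
        return !((acked & F_ACCESS_PLATFORM) &&
                 (acked & F_ACCESS_PLATFORM_NO_IOMMU));
}

/* Either flag signals restricted device access, so the driver goes
 * through the DMA API (e.g. swiotlb for an encrypted guest)... */
bool use_dma_api(uint64_t acked)
{
        return (acked & (F_ACCESS_PLATFORM | F_ACCESS_PLATFORM_NO_IOMMU)) != 0;
}

/* ...but only plain ACCESS_PLATFORM implies IOMMU translation; the
 * NO_IOMMU variant explicitly defines the IOMMU as off or bypassed. */
bool iommu_translated(uint64_t acked)
{
        return (acked & F_ACCESS_PLATFORM) != 0;
}
```

This captures the thread's goal in one line: `use_dma_api()` is true for the new flag while `iommu_translated()` stays false, i.e. memory access restrictions without implying an IOMMU.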
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-03-20 21:17 ` Michael S. Tsirkin 0 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-03-20 21:17 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Mike Anderson, Michael Roth, Jean-Philippe Brucker, Jason Wang, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: > >> Another way of looking at this issue which also explains our reluctance > >> is that the only difference between a secure guest and a regular guest > >> (at least regarding virtio) is that the former uses swiotlb while the > >> latter doesn't. > > > > But swiotlb is just one implementation. It's a guest internal thing. The > > issue is that memory isn't host accessible. > > From what I understand of the ACCESS_PLATFORM definition, the host will > only ever try to access memory addresses that are supplied to it by the > guest, so all of the secure guest memory that the host cares about is > accessible: > > If this feature bit is set to 0, then the device has same access to > memory addresses supplied to it as the driver has. In particular, > the device will always use physical addresses matching addresses > used by the driver (typically meaning physical addresses used by the > CPU) and not translated further, and can access any address supplied > to it by the driver. When clear, this overrides any > platform-specific description of whether device access is limited or > translated in any way, e.g. whether an IOMMU may be present. > > All of the above is true for POWER guests, whether they are secure > guests or not. > > Or are you saying that a virtio device may want to access memory > addresses that weren't supplied to it by the driver? Your logic would apply to IOMMUs as well.
For your mode, there are specific encrypted memory regions that the driver has access to but the device does not. That seems to violate the constraint. > >> And from the device's point of view they're > >> indistinguishable. It can't tell one guest that is using swiotlb from > >> one that isn't. And that implies that secure guest vs regular guest > >> isn't a virtio interface issue, it's "guest internal affairs". So > >> there's no reason to reflect that in the feature flags. > > > > So don't. The way not to reflect that in the feature flags is > > to set ACCESS_PLATFORM. Then you say *I don't care let platform device*. > > > > > > Without ACCESS_PLATFORM > > virtio has a very specific opinion about the security of the > > device, and that opinion is that device is part of the guest > > supervisor security domain. > > Sorry for being a bit dense, but not sure what "the device is part of > the guest supervisor security domain" means. In powerpc-speak, > "supervisor" is the operating system so perhaps that explains my > confusion. Are you saying that without ACCESS_PLATFORM, the guest > considers the host to be part of the guest operating system's security > domain? I think so. The spec says "device has same access as driver". > If so, does that have any other implication besides "the host > can access any address supplied to it by the driver"? If that is the > case, perhaps the definition of ACCESS_PLATFORM needs to be amended to > include that information because it's not part of the current > definition. > > >> That said, we still would like to arrive at a proper design for this > >> rather than add yet another hack if we can avoid it. So here's another > >> proposal: considering that the dma-direct code (in kernel/dma/direct.c) > >> automatically uses swiotlb when necessary (thanks to Christoph's recent > >> DMA work), would it be ok to replace virtio's own direct-memory code > >> that is used in the !ACCESS_PLATFORM case with the dma-direct code?
That > >> way we'll get swiotlb even with !ACCESS_PLATFORM, and virtio will get a > >> code cleanup (replace open-coded stuff with calls to existing > >> infrastructure). > > > > Let's say I have some doubts that there's an API that > > matches what virtio, with its bag of legacy compatibility, needs exactly. > > Ok. > > >> > But the name "sev_active" makes me scared because at least AMD guys who > >> > were doing the sensible thing and setting ACCESS_PLATFORM > >> > >> My understanding is, AMD guest-platform knows in advance that their > >> guest will run in secure mode and hence sets the flag at the time of VM > >> instantiation. Unfortunately we don't have that luxury on our platforms. > > > > Well you do have that luxury. It looks like there are existing > > guests that already acknowledge ACCESS_PLATFORM and you are not happy > > with how that path is slow. So you are trying to optimize for > > them by clearing ACCESS_PLATFORM and then you have lost the ability > > to invoke the DMA API. > > > > For example if there was another flag just like ACCESS_PLATFORM > > just not yet used by anyone, you would be all fine using that right? > > Yes, a new flag sounds like a great idea. What about the definition > below? > > VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as > VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the > exception that the IOMMU is explicitly defined to be off or bypassed > when accessing memory addresses supplied to the device by the > driver. This flag should be set by the guest if offered, but to > allow for backward-compatibility device implementations allow for it > to be left unset by the guest. It is an error to set both this flag > and VIRTIO_F_ACCESS_PLATFORM. It looks kind of narrow but it's an option. I wonder how we'll define what's an iommu though. Another idea is maybe something like virtio-iommu? > > Is there any justification for doing that beyond someone putting > > out slow code in the past?
> > The definition of the ACCESS_PLATFORM flag is generic and captures the > notion of memory access restrictions for the device. Unfortunately, on > powerpc pSeries guests it also implies that the IOMMU is turned on IIUC that's really because on pSeries IOMMU is *always* turned on. The platform has no way to say what you want it to say, which is to bypass the IOMMU for the specific device. > even > though pSeries guests have never used IOMMU for virtio devices. Combined > with the lack of a way to turn off or bypass the IOMMU for virtio > devices, this means that existing guests in the field are compelled to > use the IOMMU even though that never was the case before, and said > guests have no mechanism to turn it off. > > Therefore, we need a new flag to signal the memory access restriction > present in secure guests which doesn't also imply turning on the IOMMU. > > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-03-20 21:17 ` Michael S. Tsirkin (?) @ 2019-03-22 0:05 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-03-22 0:05 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Michael S. Tsirkin <mst@redhat.com> writes: > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: >> >> Another way of looking at this issue which also explains our reluctance >> >> is that the only difference between a secure guest and a regular guest >> >> (at least regarding virtio) is that the former uses swiotlb while the >> >> latter doesn't. >> > >> > But swiotlb is just one implementation. It's a guest internal thing. The >> > issue is that memory isn't host accessible. >> >> From what I understand of the ACCESS_PLATFORM definition, the host will >> only ever try to access memory addresses that are supplied to it by the >> guest, so all of the secure guest memory that the host cares about is >> accessible: >> >> If this feature bit is set to 0, then the device has same access to >> memory addresses supplied to it as the driver has. In particular, >> the device will always use physical addresses matching addresses >> used by the driver (typically meaning physical addresses used by the >> CPU) and not translated further, and can access any address supplied >> to it by the driver. When clear, this overrides any >> platform-specific description of whether device access is limited or >> translated in any way, e.g. whether an IOMMU may be present. >> >> All of the above is true for POWER guests, whether they are secure >> guests or not. >> >> Or are you saying that a virtio device may want to access memory >> addresses that weren't supplied to it by the driver?
> > Your logic would apply to IOMMUs as well. For your mode, there are > specific encrypted memory regions that driver has access to but device > does not. that seems to violate the constraint. Right, if there's a pre-configured 1:1 mapping in the IOMMU such that the device can ignore the IOMMU for all practical purposes I would indeed say that the logic would apply to IOMMUs as well. :-) I guess I'm still struggling with the purpose of signalling to the driver that the host may not have access to memory addresses that it will never try to access. >> >> And from the device's point of view they're >> >> indistinguishable. It can't tell one guest that is using swiotlb from >> >> one that isn't. And that implies that secure guest vs regular guest >> >> isn't a virtio interface issue, it's "guest internal affairs". So >> >> there's no reason to reflect that in the feature flags. >> > >> > So don't. The way not to reflect that in the feature flags is >> > to set ACCESS_PLATFORM. Then you say *I don't care let platform device*. >> > >> > >> > Without ACCESS_PLATFORM >> > virtio has a very specific opinion about the security of the >> > device, and that opinion is that device is part of the guest >> > supervisor security domain. >> >> Sorry for being a bit dense, but not sure what "the device is part of >> the guest supervisor security domain" means. In powerpc-speak, >> "supervisor" is the operating system so perhaps that explains my >> confusion. Are you saying that without ACCESS_PLATFORM, the guest >> considers the host to be part of the guest operating system's security >> domain? > > I think so. The spec says "device has same access as driver". Ok, makes sense. >> If so, does that have any other implication besides "the host >> can access any address supplied to it by the driver"? If that is the >> case, perhaps the definition of ACCESS_PLATFORM needs to be amended to >> include that information because it's not part of the current >> definition. 
>> >> >> > But the name "sev_active" makes me scared because at least AMD guys who >> >> > were doing the sensible thing and setting ACCESS_PLATFORM >> >> >> >> My understanding is, AMD guest-platform knows in advance that their >> >> guest will run in secure mode and hence sets the flag at the time of VM >> >> instantiation. Unfortunately we don't have that luxury on our platforms. >> > >> > Well you do have that luxury. It looks like there are existing >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy >> > with how that path is slow. So you are trying to optimize for >> > them by clearing ACCESS_PLATFORM and then you have lost the ability >> > to invoke the DMA API. >> > >> > For example if there was another flag just like ACCESS_PLATFORM >> > just not yet used by anyone, you would be all fine using that right? >> >> Yes, a new flag sounds like a great idea. What about the definition >> below? >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the >> exception that the IOMMU is explicitly defined to be off or bypassed >> when accessing memory addresses supplied to the device by the >> driver. This flag should be set by the guest if offered, but to >> allow for backward-compatibility device implementations allow for it >> to be left unset by the guest. It is an error to set both this flag >> and VIRTIO_F_ACCESS_PLATFORM. > > It looks kind of narrow but it's an option. Great! > I wonder how we'll define what's an iommu though. Hm, it didn't occur to me it could be an issue. I'll try. > Another idea is maybe something like virtio-iommu? You mean, have legacy guests use virtio-iommu to request an IOMMU bypass? If so, it's an interesting idea for new guests but it doesn't help with guests that are out today in the field, which don't have a virtio-iommu driver.
>> > Is there any justification for doing that beyond someone putting >> > out slow code in the past? >> >> The definition of the ACCESS_PLATFORM flag is generic and captures the >> notion of memory access restrictions for the device. Unfortunately, on >> powerpc pSeries guests it also implies that the IOMMU is turned on > > IIUC that's really because on pSeries IOMMU is *always* turned on. > The platform has no way to say what you want it to say, > which is to bypass the IOMMU for the specific device. Yes, that's correct. pSeries guests running on KVM are in a gray area where theoretically they use an IOMMU but in practice KVM ignores it. It's unfortunate but it's the reality on the ground today. :-/ -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-03-22 0:05 ` Thiago Jung Bauermann (?) @ 2019-03-23 21:01 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-03-23 21:01 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: > >> >> Another way of looking at this issue which also explains our reluctance > >> >> is that the only difference between a secure guest and a regular guest > >> >> (at least regarding virtio) is that the former uses swiotlb while the > >> >> latter doesn't. > >> > > >> > But swiotlb is just one implementation. It's a guest internal thing. The > >> > issue is that memory isn't host accessible. > >> > >> From what I understand of the ACCESS_PLATFORM definition, the host will > >> only ever try to access memory addresses that are supplied to it by the > >> guest, so all of the secure guest memory that the host cares about is > >> accessible: > >> > >> If this feature bit is set to 0, then the device has same access to > >> memory addresses supplied to it as the driver has. In particular, > >> the device will always use physical addresses matching addresses > >> used by the driver (typically meaning physical addresses used by the > >> CPU) and not translated further, and can access any address supplied > >> to it by the driver. When clear, this overrides any > >> platform-specific description of whether device access is limited or > >> translated in any way, e.g. whether an IOMMU may be present.
> >> > >> All of the above is true for POWER guests, whether they are secure > >> guests or not. > >> > >> Or are you saying that a virtio device may want to access memory > >> addresses that weren't supplied to it by the driver? > > > > Your logic would apply to IOMMUs as well. For your mode, there are > > specific encrypted memory regions that the driver has access to but the device > > does not. That seems to violate the constraint. > > Right, if there's a pre-configured 1:1 mapping in the IOMMU such that > the device can ignore the IOMMU for all practical purposes I would > indeed say that the logic would apply to IOMMUs as well. :-) > > I guess I'm still struggling with the purpose of signalling to the > driver that the host may not have access to memory addresses that it > will never try to access. For example, one of the benefits is to signal to the host that the driver does not expect the ability to access all memory. If it does, the host can fail initialization gracefully. > >> >> And from the device's point of view they're > >> >> indistinguishable. It can't tell one guest that is using swiotlb from > >> >> one that isn't. And that implies that secure guest vs regular guest > >> >> isn't a virtio interface issue, it's "guest internal affairs". So > >> >> there's no reason to reflect that in the feature flags. > >> > > >> > So don't. The way not to reflect that in the feature flags is > >> > to set ACCESS_PLATFORM. Then you say *I don't care let platform device*. > >> > > >> > > >> > Without ACCESS_PLATFORM > >> > virtio has a very specific opinion about the security of the > >> > device, and that opinion is that device is part of the guest > >> > supervisor security domain. > >> > >> Sorry for being a bit dense, but not sure what "the device is part of > >> the guest supervisor security domain" means. In powerpc-speak, > >> "supervisor" is the operating system so perhaps that explains my > >> confusion.
Are you saying that without ACCESS_PLATFORM, the guest > >> considers the host to be part of the guest operating system's security > >> domain? > > > > I think so. The spec says "device has same access as driver". > > Ok, makes sense. > > >> If so, does that have any other implication besides "the host > >> can access any address supplied to it by the driver"? If that is the > >> case, perhaps the definition of ACCESS_PLATFORM needs to be amended to > >> include that information because it's not part of the current > >> definition. > >> > >> >> > But the name "sev_active" makes me scared because at least AMD guys who > >> >> > were doing the sensible thing and setting ACCESS_PLATFORM > >> >> > >> >> My understanding is, AMD guest-platform knows in advance that their > >> >> guest will run in secure mode and hence sets the flag at the time of VM > >> >> instantiation. Unfortunately we don't have that luxury on our platforms. > >> > > >> > Well you do have that luxury. It looks like there are existing > >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy > >> > with how that path is slow. So you are trying to optimize for > >> > them by clearing ACCESS_PLATFORM and then you have lost the ability > >> > to invoke the DMA API. > >> > > >> > For example if there was another flag just like ACCESS_PLATFORM > >> > just not yet used by anyone, you would be all fine using that right? > >> > >> Yes, a new flag sounds like a great idea. What about the definition > >> below? > >> > >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as > >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the > >> exception that the IOMMU is explicitly defined to be off or bypassed > >> when accessing memory addresses supplied to the device by the > >> driver. This flag should be set by the guest if offered, but to > >> allow for backward-compatibility device implementations allow for it > >> to be left unset by the guest.
It is an error to set both this flag > >> and VIRTIO_F_ACCESS_PLATFORM. > > > > It looks kind of narrow but it's an option. > > Great! > > > I wonder how we'll define what's an iommu though. > > Hm, it didn't occur to me it could be an issue. I'll try. > > > Another idea is maybe something like virtio-iommu? > > You mean, have legacy guests use virtio-iommu to request an IOMMU > bypass? If so, it's an interesting idea for new guests but it doesn't > help with guests that are out today in the field, which don't have a > virtio-iommu driver. I presume legacy guests don't use encrypted memory so why do we worry about them at all? > >> > Is there any justification to doing that beyond someone putting > >> > out slow code in the past? > >> > >> The definition of the ACCESS_PLATFORM flag is generic and captures the > >> notion of memory access restrictions for the device. Unfortunately, on > >> powerpc pSeries guests it also implies that the IOMMU is turned on > > > > IIUC that's really because on pSeries IOMMU is *always* turned on. > > Platform has no way to say what you want it to say > > which is bypass the iommu for the specific device. > > Yes, that's correct. pSeries guests running on KVM are in a gray area > where theoretically they use an IOMMU but in practice KVM ignores it. > It's unfortunate but it's the reality on the ground today. :-/ Well it's not just the reality, virt setups need something that emulated IOMMUs don't provide. That is not uncommon, e.g. intel's VTD has a "cache mode" field which AFAIK is only used for virt. > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-03-23 21:01 ` Michael S. Tsirkin (?) @ 2019-03-25 0:57 ` David Gibson -1 siblings, 0 replies; 198+ messages in thread From: David Gibson @ 2019-03-25 0:57 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Thiago Jung Bauermann, virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson [-- Attachment #1: Type: text/plain, Size: 1684 bytes --] On Sat, Mar 23, 2019 at 05:01:35PM -0400, Michael S. Tsirkin wrote: > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: > > Michael S. Tsirkin <mst@redhat.com> writes: [snip] > > >> > Is there any justification to doing that beyond someone putting > > >> > out slow code in the past? > > >> > > >> The definition of the ACCESS_PLATFORM flag is generic and captures the > > >> notion of memory access restrictions for the device. Unfortunately, on > > >> powerpc pSeries guests it also implies that the IOMMU is turned on > > > > > > IIUC that's really because on pSeries IOMMU is *always* turned on. > > > Platform has no way to say what you want it to say > > > which is bypass the iommu for the specific device. > > > > Yes, that's correct. pSeries guests running on KVM are in a gray area > > where theoretically they use an IOMMU but in practice KVM ignores it. > > It's unfortunate but it's the reality on the ground today. :-/ Um.. I'm not sure what you mean by this. As far as I'm concerned there is always a guest-visible (paravirtualized) IOMMU, and that will be backed onto the host IOMMU when necessary. [Actually there is an IOMMU bypass hack that's used by the guest firmware, but I don't think we want to expose that] > Well it's not just the reality, virt setups need something that > emulated IOMMUs don't provide. That is not uncommon, e.g. 
> intel's VTD has a "cache mode" field which AFAIK is only used for virt.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-03-25 0:57 ` David Gibson (?) (?) @ 2019-04-17 21:42 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-04-17 21:42 UTC (permalink / raw) To: David Gibson Cc: Michael S. Tsirkin, virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson David Gibson <david@gibson.dropbear.id.au> writes: > On Sat, Mar 23, 2019 at 05:01:35PM -0400, Michael S. Tsirkin wrote: >> On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: >> > Michael S. Tsirkin <mst@redhat.com> writes: > [snip] >> > >> > Is there any justification to doing that beyond someone putting >> > >> > out slow code in the past? >> > >> >> > >> The definition of the ACCESS_PLATFORM flag is generic and captures the >> > >> notion of memory access restrictions for the device. Unfortunately, on >> > >> powerpc pSeries guests it also implies that the IOMMU is turned on >> > > >> > > IIUC that's really because on pSeries IOMMU is *always* turned on. >> > > Platform has no way to say what you want it to say >> > > which is bypass the iommu for the specific device. >> > >> > Yes, that's correct. pSeries guests running on KVM are in a gray area >> > where theoretically they use an IOMMU but in practice KVM ignores it. >> > It's unfortunate but it's the reality on the ground today. :-/ > > Um.. I'm not sure what you mean by this. As far as I'm concerned > there is always a guest-visible (paravirtualized) IOMMU, and that will > be backed onto the host IOMMU when necessary. There is, but vhost will ignore it and directly map the guest memory when ACCESS_PLATFORM (the flag previously known as IOMMU_PLATFORM) isn't set. 
From QEMU's hw/virtio/vhost.c:

static int vhost_dev_has_iommu(struct vhost_dev *dev)
{
    VirtIODevice *vdev = dev->vdev;

    return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
}

static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
                              hwaddr *plen, int is_write)
{
    if (!vhost_dev_has_iommu(dev)) {
        return cpu_physical_memory_map(addr, plen, is_write);
    } else {
        return (void *)(uintptr_t)addr;
    }
}

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
  2019-03-23 21:01 ` Michael S. Tsirkin
@ 2019-04-17 21:42 ` Thiago Jung Bauermann
  -1 siblings, 0 replies; 198+ messages in thread
From: Thiago Jung Bauermann @ 2019-04-17 21:42 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt,
    Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization,
    Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson

Michael S. Tsirkin <mst@redhat.com> writes:

> On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote:
>> Michael S. Tsirkin <mst@redhat.com> writes:
>>
>> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote:
>> >> From what I understand of the ACCESS_PLATFORM definition, the host will
>> >> only ever try to access memory addresses that are supplied to it by the
>> >> guest, so all of the secure guest memory that the host cares about is
>> >> accessible:
>> >>
>> >>     If this feature bit is set to 0, then the device has same access to
>> >>     memory addresses supplied to it as the driver has. In particular,
>> >>     the device will always use physical addresses matching addresses
>> >>     used by the driver (typically meaning physical addresses used by the
>> >>     CPU) and not translated further, and can access any address supplied
>> >>     to it by the driver. When clear, this overrides any
>> >>     platform-specific description of whether device access is limited or
>> >>     translated in any way, e.g. whether an IOMMU may be present.
>> >>
>> >> All of the above is true for POWER guests, whether they are secure
>> >> guests or not.
>> >>
>> >> Or are you saying that a virtio device may want to access memory
>> >> addresses that weren't supplied to it by the driver?
>> >
>> > Your logic would apply to IOMMUs as well. For your mode, there are
>> > specific encrypted memory regions that driver has access to but device
>> > does not. That seems to violate the constraint.
>>
>> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that
>> the device can ignore the IOMMU for all practical purposes I would
>> indeed say that the logic would apply to IOMMUs as well. :-)
>>
>> I guess I'm still struggling with the purpose of signalling to the
>> driver that the host may not have access to memory addresses that it
>> will never try to access.
>
> For example, one of the benefits is to signal to host that driver does
> not expect ability to access all memory. If it does, host can
> fail initialization gracefully.

But why would the ability to access all memory be necessary or even
useful? When would the host access memory that the driver didn't tell
it to access?

>> >> >> > But the name "sev_active" makes me scared because at least AMD guys who
>> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM
>> >> >>
>> >> >> My understanding is, AMD guest-platform knows in advance that their
>> >> >> guest will run in secure mode and hence sets the flag at the time of VM
>> >> >> instantiation. Unfortunately we dont have that luxury on our platforms.
>> >> >
>> >> > Well you do have that luxury. It looks like that there are existing
>> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy
>> >> > with how that path is slow. So you are trying to optimize for
>> >> > them by clearing ACCESS_PLATFORM and then you have lost ability
>> >> > to invoke DMA API.
>> >> >
>> >> > For example if there was another flag just like ACCESS_PLATFORM
>> >> > just not yet used by anyone, you would be all fine using that right?
>> >>
>> >> Yes, a new flag sounds like a great idea. What about the definition
>> >> below?
>> >>
>> >>     VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same
>> >>     meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not
>> >>     set, with the exception that the IOMMU is explicitly defined to
>> >>     be off or bypassed when accessing memory addresses supplied to
>> >>     the device by the driver. This flag should be set by the guest
>> >>     if offered, but to allow for backward-compatibility device
>> >>     implementations allow for it to be left unset by the guest. It
>> >>     is an error to set both this flag and VIRTIO_F_ACCESS_PLATFORM.
>> >
>> > It looks kind of narrow but it's an option.
>>
>> Great!
>>
>> > I wonder how we'll define what's an iommu though.
>>
>> Hm, it didn't occur to me it could be an issue. I'll try.

I rephrased it in terms of address translation. What do you think of
this version? The flag name is slightly different too:

    VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same
    meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set,
    with the exception that address translation is guaranteed to be
    unnecessary when accessing memory addresses supplied to the device
    by the driver. Which is to say, the device will always use physical
    addresses matching addresses used by the driver (typically meaning
    physical addresses used by the CPU) and not translated further.
    This flag should be set by the guest if offered, but to allow for
    backward-compatibility device implementations allow for it to be
    left unset by the guest. It is an error to set both this flag and
    VIRTIO_F_ACCESS_PLATFORM.

>> > Another idea is maybe something like virtio-iommu?
>>
>> You mean, have legacy guests use virtio-iommu to request an IOMMU
>> bypass? If so, it's an interesting idea for new guests but it doesn't
>> help with guests that are out today in the field, which don't have a
>> virtio-iommu driver.
>
> I presume legacy guests don't use encrypted memory so why do we
> worry about them at all?

They don't use encrypted memory, but a host machine will run a mix of
secure and legacy guests. And since the hypervisor doesn't know whether
a guest will be secure or not at the time it is launched, legacy guests
will have to be launched with the same configuration as secure guests.

>> >> > Is there any justification to doing that beyond someone putting
>> >> > out slow code in the past?
>> >>
>> >> The definition of the ACCESS_PLATFORM flag is generic and captures the
>> >> notion of memory access restrictions for the device. Unfortunately, on
>> >> powerpc pSeries guests it also implies that the IOMMU is turned on
>> >
>> > IIUC that's really because on pSeries IOMMU is *always* turned on.
>> > Platform has no way to say what you want it to say
>> > which is bypass the iommu for the specific device.
>>
>> Yes, that's correct. pSeries guests running on KVM are in a gray area
>> where theoretically they use an IOMMU but in practice KVM ignores it.
>> It's unfortunate but it's the reality on the ground today. :-/
>
> Well it's not just the reality, virt setups need something that
> emulated IOMMUs don't provide. That is not uncommon, e.g.
> Intel's VT-d has a "cache mode" field which AFAIK is only used for virt.

That's good to know. Thanks for this example.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-03-23 21:01 ` Michael S. Tsirkin (?) (?) @ 2019-04-17 21:42 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-04-17 21:42 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson Michael S. Tsirkin <mst@redhat.com> writes: > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will >> >> only ever try to access memory addresses that are supplied to it by the >> >> guest, so all of the secure guest memory that the host cares about is >> >> accessible: >> >> >> >> If this feature bit is set to 0, then the device has same access to >> >> memory addresses supplied to it as the driver has. In particular, >> >> the device will always use physical addresses matching addresses >> >> used by the driver (typically meaning physical addresses used by the >> >> CPU) and not translated further, and can access any address supplied >> >> to it by the driver. When clear, this overrides any >> >> platform-specific description of whether device access is limited or >> >> translated in any way, e.g. whether an IOMMU may be present. >> >> >> >> All of the above is true for POWER guests, whether they are secure >> >> guests or not. >> >> >> >> Or are you saying that a virtio device may want to access memory >> >> addresses that weren't supplied to it by the driver? >> > >> > Your logic would apply to IOMMUs as well. For your mode, there are >> > specific encrypted memory regions that driver has access to but device >> > does not. 
that seems to violate the constraint. >> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that >> the device can ignore the IOMMU for all practical purposes I would >> indeed say that the logic would apply to IOMMUs as well. :-) >> >> I guess I'm still struggling with the purpose of signalling to the >> driver that the host may not have access to memory addresses that it >> will never try to access. > > For example, one of the benefits is to signal to host that driver does > not expect ability to access all memory. If it does, host can > fail initialization gracefully. But why would the ability to access all memory be necessary or even useful? When would the host access memory that the driver didn't tell it to access? >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM >> >> >> >> >> >> My understanding is, AMD guest-platform knows in advance that their >> >> >> guest will run in secure mode and hence sets the flag at the time of VM >> >> >> instantiation. Unfortunately we dont have that luxury on our platforms. >> >> > >> >> > Well you do have that luxury. It looks like that there are existing >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy >> >> > with how that path is slow. So you are trying to optimize for >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability >> >> > to invoke DMA API. >> >> > >> >> > For example if there was another flag just like ACCESS_PLATFORM >> >> > just not yet used by anyone, you would be all fine using that right? >> >> >> >> Yes, a new flag sounds like a great idea. What about the definition >> >> below? 
>> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the >> >> exception that the IOMMU is explicitly defined to be off or bypassed >> >> when accessing memory addresses supplied to the device by the >> >> driver. This flag should be set by the guest if offered, but to >> >> allow for backward-compatibility device implementations allow for it >> >> to be left unset by the guest. It is an error to set both this flag >> >> and VIRTIO_F_ACCESS_PLATFORM. >> > >> > It looks kind of narrow but it's an option. >> >> Great! >> >> > I wonder how we'll define what's an iommu though. >> >> Hm, it didn't occur to me it could be an issue. I'll try. I rephrased it in terms of address translation. What do you think of this version? The flag name is slightly different too: VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the exception that address translation is guaranteed to be unnecessary when accessing memory addresses supplied to the device by the driver. Which is to say, the device will always use physical addresses matching addresses used by the driver (typically meaning physical addresses used by the CPU) and not translated further. This flag should be set by the guest if offered, but to allow for backward-compatibility device implementations allow for it to be left unset by the guest. It is an error to set both this flag and VIRTIO_F_ACCESS_PLATFORM. >> > Another idea is maybe something like virtio-iommu? >> >> You mean, have legacy guests use virtio-iommu to request an IOMMU >> bypass? If so, it's an interesting idea for new guests but it doesn't >> help with guests that are out today in the field, which don't have A >> virtio-iommu driver. > > I presume legacy guests don't use encrypted memory so why do we > worry about them at all? 
They don't use encrypted memory, but a host machine will run a mix of secure and legacy guests. And since the hypervisor doesn't know whether a guest will be secure or not at the time it is launched, legacy guests will have to be launched with the same configuration as secure guests. >> >> > Is there any justification to doing that beyond someone putting >> >> > out slow code in the past? >> >> >> >> The definition of the ACCESS_PLATFORM flag is generic and captures the >> >> notion of memory access restrictions for the device. Unfortunately, on >> >> powerpc pSeries guests it also implies that the IOMMU is turned on >> > >> > IIUC that's really because on pSeries IOMMU is *always* turned on. >> > Platform has no way to say what you want it to say >> > which is bypass the iommu for the specific device. >> >> Yes, that's correct. pSeries guests running on KVM are in a gray area >> where theoretically they use an IOMMU but in practice KVM ignores it. >> It's unfortunate but it's the reality on the ground today. :-/ > > Well it's not just the reality, virt setups need something that > emulated IOMMUs don't provide. That is not uncommon, e.g. > intel's VTD has a "cache mode" field which AFAIK is only used for virt. That's good to know. Thanks for this example. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-04-17 21:42 ` Thiago Jung Bauermann 0 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-04-17 21:42 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Michael Roth, Jean-Philippe Brucker, Benjamin Herrenschmidt, Jason Wang, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Michael S. Tsirkin <mst@redhat.com> writes: > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will >> >> only ever try to access memory addresses that are supplied to it by the >> >> guest, so all of the secure guest memory that the host cares about is >> >> accessible: >> >> >> >> If this feature bit is set to 0, then the device has same access to >> >> memory addresses supplied to it as the driver has. In particular, >> >> the device will always use physical addresses matching addresses >> >> used by the driver (typically meaning physical addresses used by the >> >> CPU) and not translated further, and can access any address supplied >> >> to it by the driver. When clear, this overrides any >> >> platform-specific description of whether device access is limited or >> >> translated in any way, e.g. whether an IOMMU may be present. >> >> >> >> All of the above is true for POWER guests, whether they are secure >> >> guests or not. >> >> >> >> Or are you saying that a virtio device may want to access memory >> >> addresses that weren't supplied to it by the driver? >> > >> > Your logic would apply to IOMMUs as well. For your mode, there are >> > specific encrypted memory regions that driver has access to but device >> > does not. that seems to violate the constraint. 
>> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that >> the device can ignore the IOMMU for all practical purposes I would >> indeed say that the logic would apply to IOMMUs as well. :-) >> >> I guess I'm still struggling with the purpose of signalling to the >> driver that the host may not have access to memory addresses that it >> will never try to access. > > For example, one of the benefits is to signal to host that driver does > not expect ability to access all memory. If it does, host can > fail initialization gracefully. But why would the ability to access all memory be necessary or even useful? When would the host access memory that the driver didn't tell it to access? >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM >> >> >> >> >> >> My understanding is, AMD guest-platform knows in advance that their >> >> >> guest will run in secure mode and hence sets the flag at the time of VM >> >> >> instantiation. Unfortunately we dont have that luxury on our platforms. >> >> > >> >> > Well you do have that luxury. It looks like that there are existing >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy >> >> > with how that path is slow. So you are trying to optimize for >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability >> >> > to invoke DMA API. >> >> > >> >> > For example if there was another flag just like ACCESS_PLATFORM >> >> > just not yet used by anyone, you would be all fine using that right? >> >> >> >> Yes, a new flag sounds like a great idea. What about the definition >> >> below? 
>> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the >> >> exception that the IOMMU is explicitly defined to be off or bypassed >> >> when accessing memory addresses supplied to the device by the >> >> driver. This flag should be set by the guest if offered, but to >> >> allow for backward-compatibility device implementations allow for it >> >> to be left unset by the guest. It is an error to set both this flag >> >> and VIRTIO_F_ACCESS_PLATFORM. >> > >> > It looks kind of narrow but it's an option. >> >> Great! >> >> > I wonder how we'll define what's an iommu though. >> >> Hm, it didn't occur to me it could be an issue. I'll try. I rephrased it in terms of address translation. What do you think of this version? The flag name is slightly different too: VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the exception that address translation is guaranteed to be unnecessary when accessing memory addresses supplied to the device by the driver. Which is to say, the device will always use physical addresses matching addresses used by the driver (typically meaning physical addresses used by the CPU) and not translated further. This flag should be set by the guest if offered, but to allow for backward-compatibility device implementations allow for it to be left unset by the guest. It is an error to set both this flag and VIRTIO_F_ACCESS_PLATFORM. >> > Another idea is maybe something like virtio-iommu? >> >> You mean, have legacy guests use virtio-iommu to request an IOMMU >> bypass? If so, it's an interesting idea for new guests but it doesn't >> help with guests that are out today in the field, which don't have A >> virtio-iommu driver. > > I presume legacy guests don't use encrypted memory so why do we > worry about them at all? 
They don't use encrypted memory, but a host machine will run a mix of secure and legacy guests. And since the hypervisor doesn't know whether a guest will be secure or not at the time it is launched, legacy guests will have to be launched with the same configuration as secure guests. >> >> > Is there any justification to doing that beyond someone putting >> >> > out slow code in the past? >> >> >> >> The definition of the ACCESS_PLATFORM flag is generic and captures the >> >> notion of memory access restrictions for the device. Unfortunately, on >> >> powerpc pSeries guests it also implies that the IOMMU is turned on >> > >> > IIUC that's really because on pSeries IOMMU is *always* turned on. >> > Platform has no way to say what you want it to say >> > which is bypass the iommu for the specific device. >> >> Yes, that's correct. pSeries guests running on KVM are in a gray area >> where theoretically they use an IOMMU but in practice KVM ignores it. >> It's unfortunate but it's the reality on the ground today. :-/ > > Well it's not just the reality, virt setups need something that > emulated IOMMUs don't provide. That is not uncommon, e.g. > intel's VTD has a "cache mode" field which AFAIK is only used for virt. That's good to know. Thanks for this example. -- Thiago Jung Bauermann IBM Linux Technology Center _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-04-17 21:42 ` Thiago Jung Bauermann 0 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-04-17 21:42 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Michael Roth, Jean-Philippe Brucker, Benjamin Herrenschmidt, Jason Wang, Alexey Kardashevskiy, Ram Pai, linux-kernel-u79uwXL29TY76Z2rM5mHXA, virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Paul Mackerras, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ, Christoph Hellwig, David Gibson Michael S. Tsirkin <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: >> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will >> >> only ever try to access memory addresses that are supplied to it by the >> >> guest, so all of the secure guest memory that the host cares about is >> >> accessible: >> >> >> >> If this feature bit is set to 0, then the device has same access to >> >> memory addresses supplied to it as the driver has. In particular, >> >> the device will always use physical addresses matching addresses >> >> used by the driver (typically meaning physical addresses used by the >> >> CPU) and not translated further, and can access any address supplied >> >> to it by the driver. When clear, this overrides any >> >> platform-specific description of whether device access is limited or >> >> translated in any way, e.g. whether an IOMMU may be present. >> >> >> >> All of the above is true for POWER guests, whether they are secure >> >> guests or not. >> >> >> >> Or are you saying that a virtio device may want to access memory >> >> addresses that weren't supplied to it by the driver? 
>> > >> > Your logic would apply to IOMMUs as well. For your mode, there are >> > specific encrypted memory regions that driver has access to but device >> > does not. that seems to violate the constraint. >> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that >> the device can ignore the IOMMU for all practical purposes I would >> indeed say that the logic would apply to IOMMUs as well. :-) >> >> I guess I'm still struggling with the purpose of signalling to the >> driver that the host may not have access to memory addresses that it >> will never try to access. > > For example, one of the benefits is to signal to host that driver does > not expect ability to access all memory. If it does, host can > fail initialization gracefully. But why would the ability to access all memory be necessary or even useful? When would the host access memory that the driver didn't tell it to access? >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM >> >> >> >> >> >> My understanding is, AMD guest-platform knows in advance that their >> >> >> guest will run in secure mode and hence sets the flag at the time of VM >> >> >> instantiation. Unfortunately we dont have that luxury on our platforms. >> >> > >> >> > Well you do have that luxury. It looks like that there are existing >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy >> >> > with how that path is slow. So you are trying to optimize for >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability >> >> > to invoke DMA API. >> >> > >> >> > For example if there was another flag just like ACCESS_PLATFORM >> >> > just not yet used by anyone, you would be all fine using that right? >> >> >> >> Yes, a new flag sounds like a great idea. What about the definition >> >> below? 
>> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the >> >> exception that the IOMMU is explicitly defined to be off or bypassed >> >> when accessing memory addresses supplied to the device by the >> >> driver. This flag should be set by the guest if offered, but to >> >> allow for backward-compatibility device implementations allow for it >> >> to be left unset by the guest. It is an error to set both this flag >> >> and VIRTIO_F_ACCESS_PLATFORM. >> > >> > It looks kind of narrow but it's an option. >> >> Great! >> >> > I wonder how we'll define what's an iommu though. >> >> Hm, it didn't occur to me it could be an issue. I'll try. I rephrased it in terms of address translation. What do you think of this version? The flag name is slightly different too: VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the exception that address translation is guaranteed to be unnecessary when accessing memory addresses supplied to the device by the driver. Which is to say, the device will always use physical addresses matching addresses used by the driver (typically meaning physical addresses used by the CPU) and not translated further. This flag should be set by the guest if offered, but to allow for backward-compatibility device implementations allow for it to be left unset by the guest. It is an error to set both this flag and VIRTIO_F_ACCESS_PLATFORM. >> > Another idea is maybe something like virtio-iommu? >> >> You mean, have legacy guests use virtio-iommu to request an IOMMU >> bypass? If so, it's an interesting idea for new guests but it doesn't >> help with guests that are out today in the field, which don't have A >> virtio-iommu driver. > > I presume legacy guests don't use encrypted memory so why do we > worry about them at all? 
They don't use encrypted memory, but a host machine will run a mix of secure and legacy guests. And since the hypervisor doesn't know whether a guest will be secure or not at the time it is launched, legacy guests will have to be launched with the same configuration as secure guests. >> >> > Is there any justification to doing that beyond someone putting >> >> > out slow code in the past? >> >> >> >> The definition of the ACCESS_PLATFORM flag is generic and captures the >> >> notion of memory access restrictions for the device. Unfortunately, on >> >> powerpc pSeries guests it also implies that the IOMMU is turned on >> > >> > IIUC that's really because on pSeries IOMMU is *always* turned on. >> > Platform has no way to say what you want it to say >> > which is bypass the iommu for the specific device. >> >> Yes, that's correct. pSeries guests running on KVM are in a gray area >> where theoretically they use an IOMMU but in practice KVM ignores it. >> It's unfortunate but it's the reality on the ground today. :-/ > > Well it's not just the reality, virt setups need something that > emulated IOMMUs don't provide. That is not uncommon, e.g. > intel's VTD has a "cache mode" field which AFAIK is only used for virt. That's good to know. Thanks for this example. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-04-17 21:42 ` Thiago Jung Bauermann 0 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-04-17 21:42 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Michael Roth, Jean-Philippe Brucker, Jason Wang, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Michael S. Tsirkin <mst@redhat.com> writes: > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will >> >> only ever try to access memory addresses that are supplied to it by the >> >> guest, so all of the secure guest memory that the host cares about is >> >> accessible: >> >> >> >> If this feature bit is set to 0, then the device has same access to >> >> memory addresses supplied to it as the driver has. In particular, >> >> the device will always use physical addresses matching addresses >> >> used by the driver (typically meaning physical addresses used by the >> >> CPU) and not translated further, and can access any address supplied >> >> to it by the driver. When clear, this overrides any >> >> platform-specific description of whether device access is limited or >> >> translated in any way, e.g. whether an IOMMU may be present. >> >> >> >> All of the above is true for POWER guests, whether they are secure >> >> guests or not. >> >> >> >> Or are you saying that a virtio device may want to access memory >> >> addresses that weren't supplied to it by the driver? >> > >> > Your logic would apply to IOMMUs as well. For your mode, there are >> > specific encrypted memory regions that driver has access to but device >> > does not. that seems to violate the constraint. 
>> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that >> the device can ignore the IOMMU for all practical purposes I would >> indeed say that the logic would apply to IOMMUs as well. :-) >> >> I guess I'm still struggling with the purpose of signalling to the >> driver that the host may not have access to memory addresses that it >> will never try to access. > > For example, one of the benefits is to signal to host that driver does > not expect ability to access all memory. If it does, host can > fail initialization gracefully. But why would the ability to access all memory be necessary or even useful? When would the host access memory that the driver didn't tell it to access? >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM >> >> >> >> >> >> My understanding is, AMD guest-platform knows in advance that their >> >> >> guest will run in secure mode and hence sets the flag at the time of VM >> >> >> instantiation. Unfortunately we dont have that luxury on our platforms. >> >> > >> >> > Well you do have that luxury. It looks like that there are existing >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy >> >> > with how that path is slow. So you are trying to optimize for >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability >> >> > to invoke DMA API. >> >> > >> >> > For example if there was another flag just like ACCESS_PLATFORM >> >> > just not yet used by anyone, you would be all fine using that right? >> >> >> >> Yes, a new flag sounds like a great idea. What about the definition >> >> below? 
>> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the >> >> exception that the IOMMU is explicitly defined to be off or bypassed >> >> when accessing memory addresses supplied to the device by the >> >> driver. This flag should be set by the guest if offered, but to >> >> allow for backward-compatibility device implementations allow for it >> >> to be left unset by the guest. It is an error to set both this flag >> >> and VIRTIO_F_ACCESS_PLATFORM. >> > >> > It looks kind of narrow but it's an option. >> >> Great! >> >> > I wonder how we'll define what's an iommu though. >> >> Hm, it didn't occur to me it could be an issue. I'll try. I rephrased it in terms of address translation. What do you think of this version? The flag name is slightly different too: VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the exception that address translation is guaranteed to be unnecessary when accessing memory addresses supplied to the device by the driver. Which is to say, the device will always use physical addresses matching addresses used by the driver (typically meaning physical addresses used by the CPU) and not translated further. This flag should be set by the guest if offered, but to allow for backward-compatibility device implementations allow for it to be left unset by the guest. It is an error to set both this flag and VIRTIO_F_ACCESS_PLATFORM. >> > Another idea is maybe something like virtio-iommu? >> >> You mean, have legacy guests use virtio-iommu to request an IOMMU >> bypass? If so, it's an interesting idea for new guests but it doesn't >> help with guests that are out today in the field, which don't have a >> virtio-iommu driver. > I presume legacy guests don't use encrypted memory so why do we > worry about them at all?
They don't use encrypted memory, but a host machine will run a mix of secure and legacy guests. And since the hypervisor doesn't know whether a guest will be secure or not at the time it is launched, legacy guests will have to be launched with the same configuration as secure guests. >> >> > Is there any justification to doing that beyond someone putting >> >> > out slow code in the past? >> >> >> >> The definition of the ACCESS_PLATFORM flag is generic and captures the >> >> notion of memory access restrictions for the device. Unfortunately, on >> >> powerpc pSeries guests it also implies that the IOMMU is turned on >> > >> > IIUC that's really because on pSeries IOMMU is *always* turned on. >> > Platform has no way to say what you want it to say >> > which is bypass the iommu for the specific device. >> >> Yes, that's correct. pSeries guests running on KVM are in a gray area >> where theoretically they use an IOMMU but in practice KVM ignores it. >> It's unfortunate but it's the reality on the ground today. :-/ > > Well it's not just the reality, virt setups need something that > emulated IOMMUs don't provide. That is not uncommon, e.g. > intel's VTD has a "cache mode" field which AFAIK is only used for virt. That's good to know. Thanks for this example. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
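[Editorial note: the negotiation rule in the VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION proposal above ("It is an error to set both this flag and VIRTIO_F_ACCESS_PLATFORM") can be sketched as a toy model in plain C. This is only an illustration, not kernel or QEMU code; VIRTIO_F_ACCESS_PLATFORM is feature bit 33 in the virtio specification, but NO_TRANSLATION was never standardized, so bit 34 below is a made-up placeholder.]

```c
#include <stdint.h>

/* ACCESS_PLATFORM is feature bit 33 in the virtio spec; NO_TRANSLATION is
 * only a proposal in this thread, so bit 34 is a hypothetical placeholder. */
#define F_ACCESS_PLATFORM                (1ULL << 33)
#define F_ACCESS_PLATFORM_NO_TRANSLATION (1ULL << 34)

/* Returns 1 if the negotiated feature set is valid under the proposed rule:
 * setting both flags at once is an error. */
static int features_valid(uint64_t features)
{
	if ((features & F_ACCESS_PLATFORM) &&
	    (features & F_ACCESS_PLATFORM_NO_TRANSLATION))
		return 0;
	return 1;
}

/* Returns 1 if, under the proposed semantics, the device is guaranteed to
 * use untranslated (CPU-physical) addresses supplied by the driver: either
 * neither flag was negotiated (legacy behavior) or NO_TRANSLATION was. */
static int device_uses_physical_addrs(uint64_t features)
{
	return !(features & F_ACCESS_PLATFORM) ||
	       (features & F_ACCESS_PLATFORM_NO_TRANSLATION);
}
```

The point of the model is that NO_TRANSLATION keeps every other ACCESS_PLATFORM property (including memory-access restrictions) while pinning down the one thing pSeries guests need: no address translation on the DMA path.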
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-04-17 21:42 ` Thiago Jung Bauermann ` (2 preceding siblings ...) (?) @ 2019-04-19 23:09 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-04-19 23:09 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: > >> > >> Michael S. Tsirkin <mst@redhat.com> writes: > >> > >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: > >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will > >> >> only ever try to access memory addresses that are supplied to it by the > >> >> guest, so all of the secure guest memory that the host cares about is > >> >> accessible: > >> >> > >> >> If this feature bit is set to 0, then the device has same access to > >> >> memory addresses supplied to it as the driver has. In particular, > >> >> the device will always use physical addresses matching addresses > >> >> used by the driver (typically meaning physical addresses used by the > >> >> CPU) and not translated further, and can access any address supplied > >> >> to it by the driver. When clear, this overrides any > >> >> platform-specific description of whether device access is limited or > >> >> translated in any way, e.g. whether an IOMMU may be present. > >> >> > >> >> All of the above is true for POWER guests, whether they are secure > >> >> guests or not. > >> >> > >> >> Or are you saying that a virtio device may want to access memory > >> >> addresses that weren't supplied to it by the driver? 
> >> > > >> > Your logic would apply to IOMMUs as well. For your mode, there are > >> > specific encrypted memory regions that driver has access to but device > >> > does not. that seems to violate the constraint. > >> > >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that > >> the device can ignore the IOMMU for all practical purposes I would > >> indeed say that the logic would apply to IOMMUs as well. :-) > >> > >> I guess I'm still struggling with the purpose of signalling to the > >> driver that the host may not have access to memory addresses that it > >> will never try to access. > > > > For example, one of the benefits is to signal to host that driver does > > not expect ability to access all memory. If it does, host can > > fail initialization gracefully. > > But why would the ability to access all memory be necessary or even > useful? When would the host access memory that the driver didn't tell it > to access? When I say all memory I mean even memory not allowed by the IOMMU. > >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who > >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM > >> >> >> > >> >> >> My understanding is, AMD guest-platform knows in advance that their > >> >> >> guest will run in secure mode and hence sets the flag at the time of VM > >> >> >> instantiation. Unfortunately we dont have that luxury on our platforms. > >> >> > > >> >> > Well you do have that luxury. It looks like that there are existing > >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy > >> >> > with how that path is slow. So you are trying to optimize for > >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability > >> >> > to invoke DMA API. > >> >> > > >> >> > For example if there was another flag just like ACCESS_PLATFORM > >> >> > just not yet used by anyone, you would be all fine using that right? > >> >> > >> >> Yes, a new flag sounds like a great idea. 
What about the definition > >> >> below? > >> >> > >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as > >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the > >> >> exception that the IOMMU is explicitly defined to be off or bypassed > >> >> when accessing memory addresses supplied to the device by the > >> >> driver. This flag should be set by the guest if offered, but to > >> >> allow for backward-compatibility device implementations allow for it > >> >> to be left unset by the guest. It is an error to set both this flag > >> >> and VIRTIO_F_ACCESS_PLATFORM. > >> > > >> > It looks kind of narrow but it's an option. > >> > >> Great! > >> > >> > I wonder how we'll define what's an iommu though. > >> > >> Hm, it didn't occur to me it could be an issue. I'll try. > > I rephrased it in terms of address translation. What do you think of > this version? The flag name is slightly different too: > > > VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same > meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, > with the exception that address translation is guaranteed to be > unnecessary when accessing memory addresses supplied to the device > by the driver. Which is to say, the device will always use physical > addresses matching addresses used by the driver (typically meaning > physical addresses used by the CPU) and not translated further. This > flag should be set by the guest if offered, but to allow for > backward-compatibility device implementations allow for it to be > left unset by the guest. It is an error to set both this flag and > VIRTIO_F_ACCESS_PLATFORM. Thanks, I'll think about this approach. Will respond next week. > >> > Another idea is maybe something like virtio-iommu? > >> > >> You mean, have legacy guests use virtio-iommu to request an IOMMU > >> bypass? 
If so, it's an interesting idea for new guests but it doesn't > >> help with guests that are out today in the field, which don't have A > >> virtio-iommu driver. > > > > I presume legacy guests don't use encrypted memory so why do we > > worry about them at all? > > They don't use encrypted memory, but a host machine will run a mix of > secure and legacy guests. And since the hypervisor doesn't know whether > a guest will be secure or not at the time it is launched, legacy guests > will have to be launched with the same configuration as secure guests. OK and so I think the issue is that hosts generally fail if they set ACCESS_PLATFORM and guests do not negotiate it. So you can not just set ACCESS_PLATFORM for everyone. Is that the issue here? > >> >> > Is there any justification to doing that beyond someone putting > >> >> > out slow code in the past? > >> >> > >> >> The definition of the ACCESS_PLATFORM flag is generic and captures the > >> >> notion of memory access restrictions for the device. Unfortunately, on > >> >> powerpc pSeries guests it also implies that the IOMMU is turned on > >> > > >> > IIUC that's really because on pSeries IOMMU is *always* turned on. > >> > Platform has no way to say what you want it to say > >> > which is bypass the iommu for the specific device. > >> > >> Yes, that's correct. pSeries guests running on KVM are in a gray area > >> where theoretically they use an IOMMU but in practice KVM ignores it. > >> It's unfortunate but it's the reality on the ground today. :-/ > > > > Well it's not just the reality, virt setups need something that > > emulated IOMMUs don't provide. That is not uncommon, e.g. > > intel's VTD has a "cache mode" field which AFAIK is only used for virt. > > That's good to know. Thanks for this example. > > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
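[Editorial note: the deployment problem Michael raises above — hosts generally fail if they offer ACCESS_PLATFORM and the guest does not negotiate it — can be captured in a small toy negotiation check. This is an illustrative sketch of the described behavior, not actual host code.]

```c
#include <stdint.h>

#define F_ACCESS_PLATFORM (1ULL << 33)  /* feature bit 33 in the virtio spec */

/* Toy model of the situation described above: if the device (host) offered
 * ACCESS_PLATFORM but the driver (guest) did not acknowledge it, the host
 * refuses to complete feature negotiation. A legacy guest never acknowledges
 * the bit, which is why a host cannot simply offer ACCESS_PLATFORM to every
 * guest it launches. */
static int features_ok(uint64_t offered, uint64_t acked)
{
	if ((offered & F_ACCESS_PLATFORM) && !(acked & F_ACCESS_PLATFORM))
		return 0;  /* negotiation fails; device init does not proceed */
	return 1;
}
```

This is exactly the bind for POWER secure guests: the hypervisor doesn't know at launch time whether a guest will be secure, yet offering ACCESS_PLATFORM unconditionally would break every legacy guest that never acks it.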
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-04-17 21:42 ` Thiago Jung Bauermann (?) @ 2019-04-19 23:09 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-04-19 23:09 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: > >> > >> Michael S. Tsirkin <mst@redhat.com> writes: > >> > >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: > >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will > >> >> only ever try to access memory addresses that are supplied to it by the > >> >> guest, so all of the secure guest memory that the host cares about is > >> >> accessible: > >> >> > >> >> If this feature bit is set to 0, then the device has same access to > >> >> memory addresses supplied to it as the driver has. In particular, > >> >> the device will always use physical addresses matching addresses > >> >> used by the driver (typically meaning physical addresses used by the > >> >> CPU) and not translated further, and can access any address supplied > >> >> to it by the driver. When clear, this overrides any > >> >> platform-specific description of whether device access is limited or > >> >> translated in any way, e.g. whether an IOMMU may be present. > >> >> > >> >> All of the above is true for POWER guests, whether they are secure > >> >> guests or not. > >> >> > >> >> Or are you saying that a virtio device may want to access memory > >> >> addresses that weren't supplied to it by the driver? 
> >> > > >> > Your logic would apply to IOMMUs as well. For your mode, there are > >> > specific encrypted memory regions that driver has access to but device > >> > does not. that seems to violate the constraint. > >> > >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that > >> the device can ignore the IOMMU for all practical purposes I would > >> indeed say that the logic would apply to IOMMUs as well. :-) > >> > >> I guess I'm still struggling with the purpose of signalling to the > >> driver that the host may not have access to memory addresses that it > >> will never try to access. > > > > For example, one of the benefits is to signal to host that driver does > > not expect ability to access all memory. If it does, host can > > fail initialization gracefully. > > But why would the ability to access all memory be necessary or even > useful? When would the host access memory that the driver didn't tell it > to access? When I say all memory I mean even memory not allowed by the IOMMU. > >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who > >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM > >> >> >> > >> >> >> My understanding is, AMD guest-platform knows in advance that their > >> >> >> guest will run in secure mode and hence sets the flag at the time of VM > >> >> >> instantiation. Unfortunately we dont have that luxury on our platforms. > >> >> > > >> >> > Well you do have that luxury. It looks like that there are existing > >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy > >> >> > with how that path is slow. So you are trying to optimize for > >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability > >> >> > to invoke DMA API. > >> >> > > >> >> > For example if there was another flag just like ACCESS_PLATFORM > >> >> > just not yet used by anyone, you would be all fine using that right? > >> >> > >> >> Yes, a new flag sounds like a great idea. 
What about the definition > >> >> below? > >> >> > >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as > >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the > >> >> exception that the IOMMU is explicitly defined to be off or bypassed > >> >> when accessing memory addresses supplied to the device by the > >> >> driver. This flag should be set by the guest if offered, but to > >> >> allow for backward-compatibility device implementations allow for it > >> >> to be left unset by the guest. It is an error to set both this flag > >> >> and VIRTIO_F_ACCESS_PLATFORM. > >> > > >> > It looks kind of narrow but it's an option. > >> > >> Great! > >> > >> > I wonder how we'll define what's an iommu though. > >> > >> Hm, it didn't occur to me it could be an issue. I'll try. > > I rephrased it in terms of address translation. What do you think of > this version? The flag name is slightly different too: > > > VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same > meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, > with the exception that address translation is guaranteed to be > unnecessary when accessing memory addresses supplied to the device > by the driver. Which is to say, the device will always use physical > addresses matching addresses used by the driver (typically meaning > physical addresses used by the CPU) and not translated further. This > flag should be set by the guest if offered, but to allow for > backward-compatibility device implementations allow for it to be > left unset by the guest. It is an error to set both this flag and > VIRTIO_F_ACCESS_PLATFORM. Thanks, I'll think about this approach. Will respond next week. > >> > Another idea is maybe something like virtio-iommu? > >> > >> You mean, have legacy guests use virtio-iommu to request an IOMMU > >> bypass? 
If so, it's an interesting idea for new guests but it doesn't > >> help with guests that are out today in the field, which don't have A > >> virtio-iommu driver. > > > > I presume legacy guests don't use encrypted memory so why do we > > worry about them at all? > > They don't use encrypted memory, but a host machine will run a mix of > secure and legacy guests. And since the hypervisor doesn't know whether > a guest will be secure or not at the time it is launched, legacy guests > will have to be launched with the same configuration as secure guests. OK and so I think the issue is that hosts generally fail if they set ACCESS_PLATFORM and guests do not negotiate it. So you can not just set ACCESS_PLATFORM for everyone. Is that the issue here? > >> >> > Is there any justification to doing that beyond someone putting > >> >> > out slow code in the past? > >> >> > >> >> The definition of the ACCESS_PLATFORM flag is generic and captures the > >> >> notion of memory access restrictions for the device. Unfortunately, on > >> >> powerpc pSeries guests it also implies that the IOMMU is turned on > >> > > >> > IIUC that's really because on pSeries IOMMU is *always* turned on. > >> > Platform has no way to say what you want it to say > >> > which is bypass the iommu for the specific device. > >> > >> Yes, that's correct. pSeries guests running on KVM are in a gray area > >> where theoretically they use an IOMMU but in practice KVM ignores it. > >> It's unfortunate but it's the reality on the ground today. :-/ > > > > Well it's not just the reality, virt setups need something that > > emulated IOMMUs don't provide. That is not uncommon, e.g. > > intel's VTD has a "cache mode" field which AFAIK is only used for virt. > > That's good to know. Thanks for this example. > > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-04-19 23:09 ` Michael S. Tsirkin 0 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-04-19 23:09 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Mike Anderson, Michael Roth, Jean-Philippe Brucker, Benjamin Herrenschmidt, Jason Wang, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: > >> > >> Michael S. Tsirkin <mst@redhat.com> writes: > >> > >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: > >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will > >> >> only ever try to access memory addresses that are supplied to it by the > >> >> guest, so all of the secure guest memory that the host cares about is > >> >> accessible: > >> >> > >> >> If this feature bit is set to 0, then the device has same access to > >> >> memory addresses supplied to it as the driver has. In particular, > >> >> the device will always use physical addresses matching addresses > >> >> used by the driver (typically meaning physical addresses used by the > >> >> CPU) and not translated further, and can access any address supplied > >> >> to it by the driver. When clear, this overrides any > >> >> platform-specific description of whether device access is limited or > >> >> translated in any way, e.g. whether an IOMMU may be present. > >> >> > >> >> All of the above is true for POWER guests, whether they are secure > >> >> guests or not. > >> >> > >> >> Or are you saying that a virtio device may want to access memory > >> >> addresses that weren't supplied to it by the driver? > >> > > >> > Your logic would apply to IOMMUs as well. 
For your mode, there are > >> > specific encrypted memory regions that driver has access to but device > >> > does not. that seems to violate the constraint. > >> > >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that > >> the device can ignore the IOMMU for all practical purposes I would > >> indeed say that the logic would apply to IOMMUs as well. :-) > >> > >> I guess I'm still struggling with the purpose of signalling to the > >> driver that the host may not have access to memory addresses that it > >> will never try to access. > > > > For example, one of the benefits is to signal to host that driver does > > not expect ability to access all memory. If it does, host can > > fail initialization gracefully. > > But why would the ability to access all memory be necessary or even > useful? When would the host access memory that the driver didn't tell it > to access? When I say all memory I mean even memory not allowed by the IOMMU. > >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who > >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM > >> >> >> > >> >> >> My understanding is, AMD guest-platform knows in advance that their > >> >> >> guest will run in secure mode and hence sets the flag at the time of VM > >> >> >> instantiation. Unfortunately we dont have that luxury on our platforms. > >> >> > > >> >> > Well you do have that luxury. It looks like that there are existing > >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy > >> >> > with how that path is slow. So you are trying to optimize for > >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability > >> >> > to invoke DMA API. > >> >> > > >> >> > For example if there was another flag just like ACCESS_PLATFORM > >> >> > just not yet used by anyone, you would be all fine using that right? > >> >> > >> >> Yes, a new flag sounds like a great idea. What about the definition > >> >> below? 
> >> >> > >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as > >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the > >> >> exception that the IOMMU is explicitly defined to be off or bypassed > >> >> when accessing memory addresses supplied to the device by the > >> >> driver. This flag should be set by the guest if offered, but to > >> >> allow for backward-compatibility device implementations allow for it > >> >> to be left unset by the guest. It is an error to set both this flag > >> >> and VIRTIO_F_ACCESS_PLATFORM. > >> > > >> > It looks kind of narrow but it's an option. > >> > >> Great! > >> > >> > I wonder how we'll define what's an iommu though. > >> > >> Hm, it didn't occur to me it could be an issue. I'll try. > > I rephrased it in terms of address translation. What do you think of > this version? The flag name is slightly different too: > > > VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same > meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, > with the exception that address translation is guaranteed to be > unnecessary when accessing memory addresses supplied to the device > by the driver. Which is to say, the device will always use physical > addresses matching addresses used by the driver (typically meaning > physical addresses used by the CPU) and not translated further. This > flag should be set by the guest if offered, but to allow for > backward-compatibility device implementations allow for it to be > left unset by the guest. It is an error to set both this flag and > VIRTIO_F_ACCESS_PLATFORM. Thanks, I'll think about this approach. Will respond next week. > >> > Another idea is maybe something like virtio-iommu? > >> > >> You mean, have legacy guests use virtio-iommu to request an IOMMU > >> bypass? If so, it's an interesting idea for new guests but it doesn't > >> help with guests that are out today in the field, which don't have A > >> virtio-iommu driver. 
> > > > I presume legacy guests don't use encrypted memory so why do we > > worry about them at all? > > They don't use encrypted memory, but a host machine will run a mix of > secure and legacy guests. And since the hypervisor doesn't know whether > a guest will be secure or not at the time it is launched, legacy guests > will have to be launched with the same configuration as secure guests. OK and so I think the issue is that hosts generally fail if they set ACCESS_PLATFORM and guests do not negotiate it. So you can not just set ACCESS_PLATFORM for everyone. Is that the issue here? > >> >> > Is there any justification to doing that beyond someone putting > >> >> > out slow code in the past? > >> >> > >> >> The definition of the ACCESS_PLATFORM flag is generic and captures the > >> >> notion of memory access restrictions for the device. Unfortunately, on > >> >> powerpc pSeries guests it also implies that the IOMMU is turned on > >> > > >> > IIUC that's really because on pSeries IOMMU is *always* turned on. > >> > Platform has no way to say what you want it to say > >> > which is bypass the iommu for the specific device. > >> > >> Yes, that's correct. pSeries guests running on KVM are in a gray area > >> where theoretically they use an IOMMU but in practice KVM ignores it. > >> It's unfortunate but it's the reality on the ground today. :-/ > > > > Well it's not just the reality, virt setups need something that > > emulated IOMMUs don't provide. That is not uncommon, e.g. > > intel's VTD has a "cache mode" field which AFAIK is only used for virt. > > That's good to know. Thanks for this example. > > -- > Thiago Jung Bauermann > IBM Linux Technology Center _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-04-19 23:09 ` Michael S. Tsirkin 0 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-04-19 23:09 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Mike Anderson, Michael Roth, Jean-Philippe Brucker, Jason Wang, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: > >> > >> Michael S. Tsirkin <mst@redhat.com> writes: > >> > >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: > >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will > >> >> only ever try to access memory addresses that are supplied to it by the > >> >> guest, so all of the secure guest memory that the host cares about is > >> >> accessible: > >> >> > >> >> If this feature bit is set to 0, then the device has same access to > >> >> memory addresses supplied to it as the driver has. In particular, > >> >> the device will always use physical addresses matching addresses > >> >> used by the driver (typically meaning physical addresses used by the > >> >> CPU) and not translated further, and can access any address supplied > >> >> to it by the driver. When clear, this overrides any > >> >> platform-specific description of whether device access is limited or > >> >> translated in any way, e.g. whether an IOMMU may be present. > >> >> > >> >> All of the above is true for POWER guests, whether they are secure > >> >> guests or not. > >> >> > >> >> Or are you saying that a virtio device may want to access memory > >> >> addresses that weren't supplied to it by the driver? > >> > > >> > Your logic would apply to IOMMUs as well. 
For your mode, there are > >> > specific encrypted memory regions that driver has access to but device > >> > does not. that seems to violate the constraint. > >> > >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that > >> the device can ignore the IOMMU for all practical purposes I would > >> indeed say that the logic would apply to IOMMUs as well. :-) > >> > >> I guess I'm still struggling with the purpose of signalling to the > >> driver that the host may not have access to memory addresses that it > >> will never try to access. > > > > For example, one of the benefits is to signal to host that driver does > > not expect ability to access all memory. If it does, host can > > fail initialization gracefully. > > But why would the ability to access all memory be necessary or even > useful? When would the host access memory that the driver didn't tell it > to access? When I say all memory I mean even memory not allowed by the IOMMU. > >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who > >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM > >> >> >> > >> >> >> My understanding is, AMD guest-platform knows in advance that their > >> >> >> guest will run in secure mode and hence sets the flag at the time of VM > >> >> >> instantiation. Unfortunately we dont have that luxury on our platforms. > >> >> > > >> >> > Well you do have that luxury. It looks like that there are existing > >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy > >> >> > with how that path is slow. So you are trying to optimize for > >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability > >> >> > to invoke DMA API. > >> >> > > >> >> > For example if there was another flag just like ACCESS_PLATFORM > >> >> > just not yet used by anyone, you would be all fine using that right? > >> >> > >> >> Yes, a new flag sounds like a great idea. What about the definition > >> >> below? 
> >> >> > >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as > >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the > >> >> exception that the IOMMU is explicitly defined to be off or bypassed > >> >> when accessing memory addresses supplied to the device by the > >> >> driver. This flag should be set by the guest if offered, but to > >> >> allow for backward-compatibility device implementations allow for it > >> >> to be left unset by the guest. It is an error to set both this flag > >> >> and VIRTIO_F_ACCESS_PLATFORM. > >> > > >> > It looks kind of narrow but it's an option. > >> > >> Great! > >> > >> > I wonder how we'll define what's an iommu though. > >> > >> Hm, it didn't occur to me it could be an issue. I'll try. > > I rephrased it in terms of address translation. What do you think of > this version? The flag name is slightly different too: > > > VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same > meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, > with the exception that address translation is guaranteed to be > unnecessary when accessing memory addresses supplied to the device > by the driver. Which is to say, the device will always use physical > addresses matching addresses used by the driver (typically meaning > physical addresses used by the CPU) and not translated further. This > flag should be set by the guest if offered, but to allow for > backward-compatibility device implementations allow for it to be > left unset by the guest. It is an error to set both this flag and > VIRTIO_F_ACCESS_PLATFORM. Thanks, I'll think about this approach. Will respond next week. > >> > Another idea is maybe something like virtio-iommu? > >> > >> You mean, have legacy guests use virtio-iommu to request an IOMMU > >> bypass? If so, it's an interesting idea for new guests but it doesn't > >> help with guests that are out today in the field, which don't have A > >> virtio-iommu driver. 
> > > > I presume legacy guests don't use encrypted memory so why do we > > worry about them at all? > > They don't use encrypted memory, but a host machine will run a mix of > secure and legacy guests. And since the hypervisor doesn't know whether > a guest will be secure or not at the time it is launched, legacy guests > will have to be launched with the same configuration as secure guests. OK and so I think the issue is that hosts generally fail if they set ACCESS_PLATFORM and guests do not negotiate it. So you can not just set ACCESS_PLATFORM for everyone. Is that the issue here? > >> >> > Is there any justification to doing that beyond someone putting > >> >> > out slow code in the past? > >> >> > >> >> The definition of the ACCESS_PLATFORM flag is generic and captures the > >> >> notion of memory access restrictions for the device. Unfortunately, on > >> >> powerpc pSeries guests it also implies that the IOMMU is turned on > >> > > >> > IIUC that's really because on pSeries IOMMU is *always* turned on. > >> > Platform has no way to say what you want it to say > >> > which is bypass the iommu for the specific device. > >> > >> Yes, that's correct. pSeries guests running on KVM are in a gray area > >> where theoretically they use an IOMMU but in practice KVM ignores it. > >> It's unfortunate but it's the reality on the ground today. :-/ > > > > Well it's not just the reality, virt setups need something that > > emulated IOMMUs don't provide. That is not uncommon, e.g. > > intel's VTD has a "cache mode" field which AFAIK is only used for virt. > > That's good to know. Thanks for this example. > > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
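The decision the RFC patch at the top of the thread changes can be modeled in isolation. The sketch below is a simplified userspace model of the `vring_use_dma_api()` logic, with `xen_domain()` and `sev_active()` replaced by plain boolean parameters; it is illustrative only, not the kernel code, and the `access_platform` parameter stands in for the ACCESS_PLATFORM handling that lives elsewhere in the real function.

```c
#include <stdbool.h>

/*
 * Simplified model of the vring_use_dma_api() decision from the RFC
 * patch: use the DMA API when ACCESS_PLATFORM was negotiated, when
 * running as a Xen guest, or when guest memory is encrypted (the
 * sev_active() case), since the host cannot touch encrypted pages
 * directly.
 */
static bool vring_use_dma_api_model(bool access_platform,
                                    bool xen_guest,
                                    bool mem_encrypted)
{
    if (access_platform)
        return true;

    /*
     * Legacy kludge: Xen guests need the DMA API even without
     * ACCESS_PLATFORM; encrypted guests need it for the same
     * practical reason -- the host can't access the ring pages.
     */
    if (xen_guest || mem_encrypted)
        return true;

    /* Otherwise, use the ring's guest-physical addresses directly. */
    return false;
}
```

The point of contention in the thread is exactly the `mem_encrypted` branch: whether the guest may make this decision unilaterally, or whether the host must signal it via a feature bit.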
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-04-19 23:09 ` Michael S. Tsirkin (?) @ 2019-04-25 1:01 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-04-25 1:01 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson Michael S. Tsirkin <mst@redhat.com> writes: > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: >> >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> >> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: >> >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will >> >> >> only ever try to access memory addresses that are supplied to it by the >> >> >> guest, so all of the secure guest memory that the host cares about is >> >> >> accessible: >> >> >> >> >> >> If this feature bit is set to 0, then the device has same access to >> >> >> memory addresses supplied to it as the driver has. In particular, >> >> >> the device will always use physical addresses matching addresses >> >> >> used by the driver (typically meaning physical addresses used by the >> >> >> CPU) and not translated further, and can access any address supplied >> >> >> to it by the driver. When clear, this overrides any >> >> >> platform-specific description of whether device access is limited or >> >> >> translated in any way, e.g. whether an IOMMU may be present. >> >> >> >> >> >> All of the above is true for POWER guests, whether they are secure >> >> >> guests or not. 
>> >> >> >> >> >> Or are you saying that a virtio device may want to access memory >> >> >> addresses that weren't supplied to it by the driver? >> >> > >> >> > Your logic would apply to IOMMUs as well. For your mode, there are >> >> > specific encrypted memory regions that driver has access to but device >> >> > does not. that seems to violate the constraint. >> >> >> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that >> >> the device can ignore the IOMMU for all practical purposes I would >> >> indeed say that the logic would apply to IOMMUs as well. :-) >> >> >> >> I guess I'm still struggling with the purpose of signalling to the >> >> driver that the host may not have access to memory addresses that it >> >> will never try to access. >> > >> > For example, one of the benefits is to signal to host that driver does >> > not expect ability to access all memory. If it does, host can >> > fail initialization gracefully. >> >> But why would the ability to access all memory be necessary or even >> useful? When would the host access memory that the driver didn't tell it >> to access? > > When I say all memory I mean even memory not allowed by the IOMMU. Yes, but why? How is that memory relevant? >> >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who >> >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM >> >> >> >> >> >> >> >> My understanding is, AMD guest-platform knows in advance that their >> >> >> >> guest will run in secure mode and hence sets the flag at the time of VM >> >> >> >> instantiation. Unfortunately we dont have that luxury on our platforms. >> >> >> > >> >> >> > Well you do have that luxury. It looks like that there are existing >> >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy >> >> >> > with how that path is slow. 
So you are trying to optimize for >> >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability >> >> >> > to invoke DMA API. >> >> >> > >> >> >> > For example if there was another flag just like ACCESS_PLATFORM >> >> >> > just not yet used by anyone, you would be all fine using that right? >> >> >> >> >> >> Yes, a new flag sounds like a great idea. What about the definition >> >> >> below? >> >> >> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as >> >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the >> >> >> exception that the IOMMU is explicitly defined to be off or bypassed >> >> >> when accessing memory addresses supplied to the device by the >> >> >> driver. This flag should be set by the guest if offered, but to >> >> >> allow for backward-compatibility device implementations allow for it >> >> >> to be left unset by the guest. It is an error to set both this flag >> >> >> and VIRTIO_F_ACCESS_PLATFORM. >> >> > >> >> > It looks kind of narrow but it's an option. >> >> >> >> Great! >> >> >> >> > I wonder how we'll define what's an iommu though. >> >> >> >> Hm, it didn't occur to me it could be an issue. I'll try. >> >> I rephrased it in terms of address translation. What do you think of >> this version? The flag name is slightly different too: >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, >> with the exception that address translation is guaranteed to be >> unnecessary when accessing memory addresses supplied to the device >> by the driver. Which is to say, the device will always use physical >> addresses matching addresses used by the driver (typically meaning >> physical addresses used by the CPU) and not translated further. This >> flag should be set by the guest if offered, but to allow for >> backward-compatibility device implementations allow for it to be >> left unset by the guest. 
It is an error to set both this flag and >> VIRTIO_F_ACCESS_PLATFORM. > > Thanks, I'll think about this approach. Will respond next week. Thanks! >> >> > Another idea is maybe something like virtio-iommu? >> >> >> >> You mean, have legacy guests use virtio-iommu to request an IOMMU >> >> bypass? If so, it's an interesting idea for new guests but it doesn't >> >> help with guests that are out today in the field, which don't have A >> >> virtio-iommu driver. >> > >> > I presume legacy guests don't use encrypted memory so why do we >> > worry about them at all? >> >> They don't use encrypted memory, but a host machine will run a mix of >> secure and legacy guests. And since the hypervisor doesn't know whether >> a guest will be secure or not at the time it is launched, legacy guests >> will have to be launched with the same configuration as secure guests. > > OK and so I think the issue is that hosts generally fail if they set > ACCESS_PLATFORM and guests do not negotiate it. > So you can not just set ACCESS_PLATFORM for everyone. > Is that the issue here? Yes, that is one half of the issue. The other is that even if hosts didn't fail, existing legacy guests wouldn't "take the initiative" of not negotiating ACCESS_PLATFORM to get the improved performance. They'd have to be modified to do that. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-04-25 1:01 ` Thiago Jung Bauermann (?) (?) @ 2019-04-25 1:18 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-04-25 1:18 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Wed, Apr 24, 2019 at 10:01:56PM -0300, Thiago Jung Bauermann wrote: > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: > >> > >> Michael S. Tsirkin <mst@redhat.com> writes: > >> > >> > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: > >> >> > >> >> Michael S. Tsirkin <mst@redhat.com> writes: > >> >> > >> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: > >> >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will > >> >> >> only ever try to access memory addresses that are supplied to it by the > >> >> >> guest, so all of the secure guest memory that the host cares about is > >> >> >> accessible: > >> >> >> > >> >> >> If this feature bit is set to 0, then the device has same access to > >> >> >> memory addresses supplied to it as the driver has. In particular, > >> >> >> the device will always use physical addresses matching addresses > >> >> >> used by the driver (typically meaning physical addresses used by the > >> >> >> CPU) and not translated further, and can access any address supplied > >> >> >> to it by the driver. When clear, this overrides any > >> >> >> platform-specific description of whether device access is limited or > >> >> >> translated in any way, e.g. whether an IOMMU may be present. > >> >> >> > >> >> >> All of the above is true for POWER guests, whether they are secure > >> >> >> guests or not. 
> >> >> >> > >> >> >> Or are you saying that a virtio device may want to access memory > >> >> >> addresses that weren't supplied to it by the driver? > >> >> > > >> >> > Your logic would apply to IOMMUs as well. For your mode, there are > >> >> > specific encrypted memory regions that driver has access to but device > >> >> > does not. that seems to violate the constraint. > >> >> > >> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that > >> >> the device can ignore the IOMMU for all practical purposes I would > >> >> indeed say that the logic would apply to IOMMUs as well. :-) > >> >> > >> >> I guess I'm still struggling with the purpose of signalling to the > >> >> driver that the host may not have access to memory addresses that it > >> >> will never try to access. > >> > > >> > For example, one of the benefits is to signal to host that driver does > >> > not expect ability to access all memory. If it does, host can > >> > fail initialization gracefully. > >> > >> But why would the ability to access all memory be necessary or even > >> useful? When would the host access memory that the driver didn't tell it > >> to access? > > > > When I say all memory I mean even memory not allowed by the IOMMU. > > Yes, but why? How is that memory relevant? It's relevant when driver is not trusted to only supply correct addresses. The feature was originally designed to support userspace drivers within guests. > >> >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who > >> >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM > >> >> >> >> > >> >> >> >> My understanding is, AMD guest-platform knows in advance that their > >> >> >> >> guest will run in secure mode and hence sets the flag at the time of VM > >> >> >> >> instantiation. Unfortunately we dont have that luxury on our platforms. > >> >> >> > > >> >> >> > Well you do have that luxury. 
It looks like that there are existing > >> >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy > >> >> >> > with how that path is slow. So you are trying to optimize for > >> >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability > >> >> >> > to invoke DMA API. > >> >> >> > > >> >> >> > For example if there was another flag just like ACCESS_PLATFORM > >> >> >> > just not yet used by anyone, you would be all fine using that right? > >> >> >> > >> >> >> Yes, a new flag sounds like a great idea. What about the definition > >> >> >> below? > >> >> >> > >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as > >> >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the > >> >> >> exception that the IOMMU is explicitly defined to be off or bypassed > >> >> >> when accessing memory addresses supplied to the device by the > >> >> >> driver. This flag should be set by the guest if offered, but to > >> >> >> allow for backward-compatibility device implementations allow for it > >> >> >> to be left unset by the guest. It is an error to set both this flag > >> >> >> and VIRTIO_F_ACCESS_PLATFORM. > >> >> > > >> >> > It looks kind of narrow but it's an option. > >> >> > >> >> Great! > >> >> > >> >> > I wonder how we'll define what's an iommu though. > >> >> > >> >> Hm, it didn't occur to me it could be an issue. I'll try. > >> > >> I rephrased it in terms of address translation. What do you think of > >> this version? The flag name is slightly different too: > >> > >> > >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same > >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, > >> with the exception that address translation is guaranteed to be > >> unnecessary when accessing memory addresses supplied to the device > >> by the driver. 
Which is to say, the device will always use physical > >> addresses matching addresses used by the driver (typically meaning > >> physical addresses used by the CPU) and not translated further. This > >> flag should be set by the guest if offered, but to allow for > >> backward-compatibility device implementations allow for it to be > >> left unset by the guest. It is an error to set both this flag and > >> VIRTIO_F_ACCESS_PLATFORM. > > > > Thanks, I'll think about this approach. Will respond next week. > > Thanks! > > >> >> > Another idea is maybe something like virtio-iommu? > >> >> > >> >> You mean, have legacy guests use virtio-iommu to request an IOMMU > >> >> bypass? If so, it's an interesting idea for new guests but it doesn't > >> >> help with guests that are out today in the field, which don't have A > >> >> virtio-iommu driver. > >> > > >> > I presume legacy guests don't use encrypted memory so why do we > >> > worry about them at all? > >> > >> They don't use encrypted memory, but a host machine will run a mix of > >> secure and legacy guests. And since the hypervisor doesn't know whether > >> a guest will be secure or not at the time it is launched, legacy guests > >> will have to be launched with the same configuration as secure guests. > > > > OK and so I think the issue is that hosts generally fail if they set > > ACCESS_PLATFORM and guests do not negotiate it. > > So you can not just set ACCESS_PLATFORM for everyone. > > Is that the issue here? > > Yes, that is one half of the issue. The other is that even if hosts > didn't fail, existing legacy guests wouldn't "take the initiative" of > not negotiating ACCESS_PLATFORM to get the improved performance. They'd > have to be modified to do that. So there's a non-encrypted guest, hypervisor wants to set ACCESS_PLATFORM to allow encrypted guests but that will slow down legacy guests since their vIOMMU emulation is very slow. So enabling support for encryption slows down non-encrypted guests. 
Not great but not the end of the world, considering even older guests that don't support ACCESS_PLATFORM are completely broken and you do not seem to be too worried by that. For future non-encrypted guests, bypassing the emulated IOMMU for when that emulated IOMMU is very slow might be solvable in some other way, e.g. with virtio-iommu. Which reminds me, could you look at virtio-iommu as a solution for some of the issues? Review of that patchset from that POV would be appreciated. > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
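The failure mode discussed here -- a host that offers ACCESS_PLATFORM refusing a guest that does not negotiate it -- amounts to a check at feature-negotiation time. The sketch below is a simplified device-side model of that validation, not code from any particular hypervisor; `device_requires_restricted_access` stands in for whatever reason (vIOMMU, memory encryption) makes full guest-memory access impossible.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_ACCESS_PLATFORM 33

/*
 * Model of device-side validation when the driver signals FEATURES_OK:
 * if the device offered ACCESS_PLATFORM because it genuinely cannot
 * access all guest memory, a driver that did not accept the bit cannot
 * work, so negotiation fails gracefully up front instead of the device
 * faulting on a bad address later.
 */
static bool features_ok(uint64_t offered, uint64_t accepted,
                        bool device_requires_restricted_access)
{
    /* The driver may only accept bits that were actually offered. */
    if (accepted & ~offered)
        return false;

    if (device_requires_restricted_access &&
        !(accepted & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
        return false;

    return true;
}
```

This is why ACCESS_PLATFORM cannot simply be set for everyone: legacy drivers that never acknowledge the bit would fail the second check and refuse to initialize.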
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
@ 2019-04-26 23:56 ` Thiago Jung Bauermann
From: Thiago Jung Bauermann @ 2019-04-26 23:56 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson

Michael S. Tsirkin <mst@redhat.com> writes:

> On Wed, Apr 24, 2019 at 10:01:56PM -0300, Thiago Jung Bauermann wrote:
>> Michael S. Tsirkin <mst@redhat.com> writes:
>>
>> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote:
>> >> Michael S. Tsirkin <mst@redhat.com> writes:
>> >>
>> >> > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote:
>> >> >> Michael S. Tsirkin <mst@redhat.com> writes:
>> >> >>
>> >> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote:
>> >> >> >> From what I understand of the ACCESS_PLATFORM definition, the host will only ever try to access memory addresses that are supplied to it by the guest, so all of the secure guest memory that the host cares about is accessible:
>> >> >> >>
>> >> >> >> If this feature bit is set to 0, then the device has same access to memory addresses supplied to it as the driver has. In particular, the device will always use physical addresses matching addresses used by the driver (typically meaning physical addresses used by the CPU) and not translated further, and can access any address supplied to it by the driver. When clear, this overrides any platform-specific description of whether device access is limited or translated in any way, e.g. whether an IOMMU may be present.
>> >> >> >>
>> >> >> >> All of the above is true for POWER guests, whether they are secure guests or not.
>> >> >> >>
>> >> >> >> Or are you saying that a virtio device may want to access memory addresses that weren't supplied to it by the driver?
>> >> >> >
>> >> >> > Your logic would apply to IOMMUs as well. For your mode, there are specific encrypted memory regions that the driver has access to but the device does not. That seems to violate the constraint.
>> >> >>
>> >> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that the device can ignore the IOMMU for all practical purposes, I would indeed say that the logic would apply to IOMMUs as well. :-)
>> >> >>
>> >> >> I guess I'm still struggling with the purpose of signalling to the driver that the host may not have access to memory addresses that it will never try to access.
>> >> >
>> >> > For example, one of the benefits is to signal to the host that the driver does not expect the ability to access all memory. If it does, the host can fail initialization gracefully.
>> >>
>> >> But why would the ability to access all memory be necessary or even useful? When would the host access memory that the driver didn't tell it to access?
>> >
>> > When I say all memory I mean even memory not allowed by the IOMMU.
>>
>> Yes, but why? How is that memory relevant?
>
> It's relevant when the driver is not trusted to only supply correct addresses. The feature was originally designed to support userspace drivers within guests.

Ah, thanks for clarifying. I don't think that's a problem in our case. If the guest provides an incorrect address, the hardware simply won't allow the host to access it.

>> >> >> > Another idea is maybe something like virtio-iommu?
>> >> >>
>> >> >> You mean, have legacy guests use virtio-iommu to request an IOMMU bypass? If so, it's an interesting idea for new guests but it doesn't help with guests that are out today in the field, which don't have a virtio-iommu driver.
>> >> >
>> >> > I presume legacy guests don't use encrypted memory, so why do we worry about them at all?
>> >>
>> >> They don't use encrypted memory, but a host machine will run a mix of secure and legacy guests. And since the hypervisor doesn't know whether a guest will be secure or not at the time it is launched, legacy guests will have to be launched with the same configuration as secure guests.
>> >
>> > OK, and so I think the issue is that hosts generally fail if they set ACCESS_PLATFORM and guests do not negotiate it. So you can not just set ACCESS_PLATFORM for everyone. Is that the issue here?
>>
>> Yes, that is one half of the issue. The other is that even if hosts didn't fail, existing legacy guests wouldn't "take the initiative" of not negotiating ACCESS_PLATFORM to get the improved performance. They'd have to be modified to do that.
>
> So there's a non-encrypted guest, the hypervisor wants to set ACCESS_PLATFORM to allow encrypted guests, but that will slow down legacy guests since their vIOMMU emulation is very slow.

Yes.

> So enabling support for encryption slows down non-encrypted guests. Not great, but not the end of the world, considering even older guests that don't support ACCESS_PLATFORM are completely broken and you do not seem to be too worried by that.

Well, I guess that would be the third half of the issue. :-)

> For future non-encrypted guests, bypassing the emulated IOMMU when that emulated IOMMU is very slow might be solvable in some other way, e.g. with virtio-iommu. Which reminds me, could you look at virtio-iommu as a solution for some of the issues? Review of that patchset from that POV would be appreciated.

Yes, I will have a look.
As you mentioned already, virtio-iommu doesn't define a way to request iommu bypass for a device so that would have to be added. Though to be honest in practice I don't think such a feature in virtio-iommu would make things easier for us, at least in the short term. It would take the same effort to define a powerpc-specific hypercall to accomplish the same thing (easier, in fact, since we wouldn't have to implement the rest of virtio-iommu). In fact, there already is such a hypercall, but it is only defined for VIO devices (RTAS_IBM_SET_TCE_BYPASS in QEMU). We would have to make it work on virtio devices as well. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-04-26 23:56 ` Thiago Jung Bauermann (?) (?) @ 2019-05-20 13:08 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-05-20 13:08 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson On Fri, Apr 26, 2019 at 08:56:43PM -0300, Thiago Jung Bauermann wrote: > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Wed, Apr 24, 2019 at 10:01:56PM -0300, Thiago Jung Bauermann wrote: > >> > >> Michael S. Tsirkin <mst@redhat.com> writes: > >> > >> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: > >> >> > >> >> Michael S. Tsirkin <mst@redhat.com> writes: > >> >> > >> >> > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: > >> >> >> > >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: > >> >> >> > >> >> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: > >> >> >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will > >> >> >> >> only ever try to access memory addresses that are supplied to it by the > >> >> >> >> guest, so all of the secure guest memory that the host cares about is > >> >> >> >> accessible: > >> >> >> >> > >> >> >> >> If this feature bit is set to 0, then the device has same access to > >> >> >> >> memory addresses supplied to it as the driver has. In particular, > >> >> >> >> the device will always use physical addresses matching addresses > >> >> >> >> used by the driver (typically meaning physical addresses used by the > >> >> >> >> CPU) and not translated further, and can access any address supplied > >> >> >> >> to it by the driver. 
When clear, this overrides any > >> >> >> >> platform-specific description of whether device access is limited or > >> >> >> >> translated in any way, e.g. whether an IOMMU may be present. > >> >> >> >> > >> >> >> >> All of the above is true for POWER guests, whether they are secure > >> >> >> >> guests or not. > >> >> >> >> > >> >> >> >> Or are you saying that a virtio device may want to access memory > >> >> >> >> addresses that weren't supplied to it by the driver? > >> >> >> > > >> >> >> > Your logic would apply to IOMMUs as well. For your mode, there are > >> >> >> > specific encrypted memory regions that driver has access to but device > >> >> >> > does not. that seems to violate the constraint. > >> >> >> > >> >> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that > >> >> >> the device can ignore the IOMMU for all practical purposes I would > >> >> >> indeed say that the logic would apply to IOMMUs as well. :-) > >> >> >> > >> >> >> I guess I'm still struggling with the purpose of signalling to the > >> >> >> driver that the host may not have access to memory addresses that it > >> >> >> will never try to access. > >> >> > > >> >> > For example, one of the benefits is to signal to host that driver does > >> >> > not expect ability to access all memory. If it does, host can > >> >> > fail initialization gracefully. > >> >> > >> >> But why would the ability to access all memory be necessary or even > >> >> useful? When would the host access memory that the driver didn't tell it > >> >> to access? > >> > > >> > When I say all memory I mean even memory not allowed by the IOMMU. > >> > >> Yes, but why? How is that memory relevant? > > > > It's relevant when driver is not trusted to only supply correct > > addresses. The feature was originally designed to support userspace > > drivers within guests. > > Ah, thanks for clarifying. I don't think that's a problem in our case. 
> If the guest provides an incorrect address, the hardware simply won't > allow the host to access it. > > >> >> >> > Another idea is maybe something like virtio-iommu? > >> >> >> > >> >> >> You mean, have legacy guests use virtio-iommu to request an IOMMU > >> >> >> bypass? If so, it's an interesting idea for new guests but it doesn't > >> >> >> help with guests that are out today in the field, which don't have A > >> >> >> virtio-iommu driver. > >> >> > > >> >> > I presume legacy guests don't use encrypted memory so why do we > >> >> > worry about them at all? > >> >> > >> >> They don't use encrypted memory, but a host machine will run a mix of > >> >> secure and legacy guests. And since the hypervisor doesn't know whether > >> >> a guest will be secure or not at the time it is launched, legacy guests > >> >> will have to be launched with the same configuration as secure guests. > >> > > >> > OK and so I think the issue is that hosts generally fail if they set > >> > ACCESS_PLATFORM and guests do not negotiate it. > >> > So you can not just set ACCESS_PLATFORM for everyone. > >> > Is that the issue here? > >> > >> Yes, that is one half of the issue. The other is that even if hosts > >> didn't fail, existing legacy guests wouldn't "take the initiative" of > >> not negotiating ACCESS_PLATFORM to get the improved performance. They'd > >> have to be modified to do that. > > > > So there's a non-encrypted guest, hypervisor wants to set > > ACCESS_PLATFORM to allow encrypted guests but that will slow down legacy > > guests since their vIOMMU emulation is very slow. > > Yes. > > > So enabling support for encryption slows down non-encrypted guests. Not > > great but not the end of the world, considering even older guests that > > don't support ACCESS_PLATFORM are completely broken and you do not seem > > to be too worried by that. > > Well, I guess that would be the third half of the issue. 
:-) > > > For future non-encrypted guests, bypassing the emulated IOMMU for when > > that emulated IOMMU is very slow might be solvable in some other way, > > e.g. with virtio-iommu. Which reminds me, could you look at > > virtio-iommu as a solution for some of the issues? > > Review of that patchset from that POV would be appreciated. > > Yes, I will have a look. As you mentioned already, virtio-iommu doesn't > define a way to request iommu bypass for a device so that would have to > be added. I think it does have a way for guest to request bypass: there's a feature bit which - if set - specifies that a device that is in no domain bypasses the iommu. > Though to be honest in practice I don't think such a feature in > virtio-iommu would make things easier for us, at least in the short > term. It would take the same effort to define a powerpc-specific > hypercall to accomplish the same thing (easier, in fact since we > wouldn't have to implement the rest of virtio-iommu). In fact, there > already is such hypercall, but it is only defined for VIO devices > (RTAS_IBM_SET_TCE_BYPASS in QEMU). We would have to make it work on > virtio devices as well. Now I'm a bit lost. Could you pls describe quickly what does it do? > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-04-19 23:09 ` Michael S. Tsirkin ` (2 preceding siblings ...) (?) @ 2019-04-25 1:01 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-04-25 1:01 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Michael S. Tsirkin <mst@redhat.com> writes: > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote: >> >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> >> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote: >> >> >> >From what I understand of the ACCESS_PLATFORM definition, the host will >> >> >> only ever try to access memory addresses that are supplied to it by the >> >> >> guest, so all of the secure guest memory that the host cares about is >> >> >> accessible: >> >> >> >> >> >> If this feature bit is set to 0, then the device has same access to >> >> >> memory addresses supplied to it as the driver has. In particular, >> >> >> the device will always use physical addresses matching addresses >> >> >> used by the driver (typically meaning physical addresses used by the >> >> >> CPU) and not translated further, and can access any address supplied >> >> >> to it by the driver. When clear, this overrides any >> >> >> platform-specific description of whether device access is limited or >> >> >> translated in any way, e.g. whether an IOMMU may be present. >> >> >> >> >> >> All of the above is true for POWER guests, whether they are secure >> >> >> guests or not. 
>> >> >> >> >> >> Or are you saying that a virtio device may want to access memory >> >> >> addresses that weren't supplied to it by the driver? >> >> > >> >> > Your logic would apply to IOMMUs as well. For your mode, there are >> >> > specific encrypted memory regions that driver has access to but device >> >> > does not. that seems to violate the constraint. >> >> >> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that >> >> the device can ignore the IOMMU for all practical purposes I would >> >> indeed say that the logic would apply to IOMMUs as well. :-) >> >> >> >> I guess I'm still struggling with the purpose of signalling to the >> >> driver that the host may not have access to memory addresses that it >> >> will never try to access. >> > >> > For example, one of the benefits is to signal to host that driver does >> > not expect ability to access all memory. If it does, host can >> > fail initialization gracefully. >> >> But why would the ability to access all memory be necessary or even >> useful? When would the host access memory that the driver didn't tell it >> to access? > > When I say all memory I mean even memory not allowed by the IOMMU. Yes, but why? How is that memory relevant? >> >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who >> >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM >> >> >> >> >> >> >> >> My understanding is, AMD guest-platform knows in advance that their >> >> >> >> guest will run in secure mode and hence sets the flag at the time of VM >> >> >> >> instantiation. Unfortunately we dont have that luxury on our platforms. >> >> >> > >> >> >> > Well you do have that luxury. It looks like that there are existing >> >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy >> >> >> > with how that path is slow. 
So you are trying to optimize for >> >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability >> >> >> > to invoke DMA API. >> >> >> > >> >> >> > For example if there was another flag just like ACCESS_PLATFORM >> >> >> > just not yet used by anyone, you would be all fine using that right? >> >> >> >> >> >> Yes, a new flag sounds like a great idea. What about the definition >> >> >> below? >> >> >> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as >> >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the >> >> >> exception that the IOMMU is explicitly defined to be off or bypassed >> >> >> when accessing memory addresses supplied to the device by the >> >> >> driver. This flag should be set by the guest if offered, but to >> >> >> allow for backward-compatibility device implementations allow for it >> >> >> to be left unset by the guest. It is an error to set both this flag >> >> >> and VIRTIO_F_ACCESS_PLATFORM. >> >> > >> >> > It looks kind of narrow but it's an option. >> >> >> >> Great! >> >> >> >> > I wonder how we'll define what's an iommu though. >> >> >> >> Hm, it didn't occur to me it could be an issue. I'll try. >> >> I rephrased it in terms of address translation. What do you think of >> this version? The flag name is slightly different too: >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, >> with the exception that address translation is guaranteed to be >> unnecessary when accessing memory addresses supplied to the device >> by the driver. Which is to say, the device will always use physical >> addresses matching addresses used by the driver (typically meaning >> physical addresses used by the CPU) and not translated further. This >> flag should be set by the guest if offered, but to allow for >> backward-compatibility device implementations allow for it to be >> left unset by the guest. 
It is an error to set both this flag and >> VIRTIO_F_ACCESS_PLATFORM. > > Thanks, I'll think about this approach. Will respond next week. Thanks! >> >> > Another idea is maybe something like virtio-iommu? >> >> >> >> You mean, have legacy guests use virtio-iommu to request an IOMMU >> >> bypass? If so, it's an interesting idea for new guests but it doesn't >> >> help with guests that are out today in the field, which don't have A >> >> virtio-iommu driver. >> > >> > I presume legacy guests don't use encrypted memory so why do we >> > worry about them at all? >> >> They don't use encrypted memory, but a host machine will run a mix of >> secure and legacy guests. And since the hypervisor doesn't know whether >> a guest will be secure or not at the time it is launched, legacy guests >> will have to be launched with the same configuration as secure guests. > > OK and so I think the issue is that hosts generally fail if they set > ACCESS_PLATFORM and guests do not negotiate it. > So you can not just set ACCESS_PLATFORM for everyone. > Is that the issue here? Yes, that is one half of the issue. The other is that even if hosts didn't fail, existing legacy guests wouldn't "take the initiative" of not negotiating ACCESS_PLATFORM to get the improved performance. They'd have to be modified to do that. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-04-17 21:42 ` Thiago Jung Bauermann ` (4 preceding siblings ...) (?) @ 2019-05-20 13:16 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-05-20 13:16 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: > I rephrased it in terms of address translation. What do you think of > this version? The flag name is slightly different too: > > > VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same > meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, > with the exception that address translation is guaranteed to be > unnecessary when accessing memory addresses supplied to the device > by the driver. Which is to say, the device will always use physical > addresses matching addresses used by the driver (typically meaning > physical addresses used by the CPU) and not translated further. This > flag should be set by the guest if offered, but to allow for > backward-compatibility device implementations allow for it to be > left unset by the guest. It is an error to set both this flag and > VIRTIO_F_ACCESS_PLATFORM. OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unprivileged drivers. This is why devices fail when it's not negotiated. This confuses me. If driver is unprivileged then what happens with this flag? It can supply any address it wants. Will that corrupt kernel memory? -- MST ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted

From: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Date: 2019-06-04 1:13 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson

Michael S. Tsirkin <mst@redhat.com> writes:

> On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote:
>> I rephrased it in terms of address translation. What do you think of
>> this version? The flag name is slightly different too:
>>
>> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION [...]
>
> OK, so VIRTIO_F_ACCESS_PLATFORM is designed to allow unprivileged
> drivers. This is why devices fail when it's not negotiated.

Just to clarify, what do you mean by unprivileged drivers? Is it drivers
implemented in guest userspace, such as with VFIO? Or unprivileged in
some other sense, such as needing to use bounce buffers for some reason?

> This confuses me.
> If the driver is unprivileged, then what happens with this flag?
> It can supply any address it wants. Will that corrupt kernel
> memory?

Not needing address translation doesn't necessarily mean that there's no
IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM, but there's
always an IOMMU present. And we also support VFIO drivers: the VFIO API
for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls
to program the IOMMU.

For our use case, we don't need address translation because we set up an
identity mapping in the IOMMU so that the device can use guest physical
addresses.

If the guest kernel is concerned that an unprivileged driver could
jeopardize its integrity, it should not negotiate this feature flag.
Perhaps there should be a note about this in the flag definition? This
concern is platform-dependent though; I don't believe it's an issue on
pseries.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted

From: Michael S. Tsirkin <mst@redhat.com>
Date: 2019-06-04 1:42 UTC (permalink / raw)
To: Thiago Jung Bauermann
Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson

On Mon, Jun 03, 2019 at 10:13:59PM -0300, Thiago Jung Bauermann wrote:
> Michael S. Tsirkin <mst@redhat.com> writes:
>
> > OK, so VIRTIO_F_ACCESS_PLATFORM is designed to allow unprivileged
> > drivers. This is why devices fail when it's not negotiated.
>
> Just to clarify, what do you mean by unprivileged drivers? Is it drivers
> implemented in guest userspace, such as with VFIO? Or unprivileged in
> some other sense, such as needing to use bounce buffers for some reason?

I had drivers in guest userspace in mind.

> > This confuses me.
> > If the driver is unprivileged, then what happens with this flag?
> > It can supply any address it wants. Will that corrupt kernel
> > memory?
>
> Not needing address translation doesn't necessarily mean that there's no
> IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM, but there's
> always an IOMMU present. And we also support VFIO drivers: the VFIO API
> for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls
> to program the IOMMU.
>
> For our use case, we don't need address translation because we set up an
> identity mapping in the IOMMU so that the device can use guest physical
> addresses.

And can it access any guest physical address?

> If the guest kernel is concerned that an unprivileged driver could
> jeopardize its integrity, it should not negotiate this feature flag.

Unfortunately, flag negotiation is done through config space and so can
be overwritten by the driver.

> Perhaps there should be a note about this in the flag definition? This
> concern is platform-dependent though; I don't believe it's an issue on
> pseries.

Again, ACCESS_PLATFORM has a pretty open definition. It does actually
say it's all up to the platform.

Specifically, how will VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION be
implemented portably? virtio has no portable way to know whether the
DMA API bypasses translation.

^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted

From: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Date: 2019-06-28 1:58 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson

Michael S. Tsirkin <mst@redhat.com> writes:

> On Mon, Jun 03, 2019 at 10:13:59PM -0300, Thiago Jung Bauermann wrote:
>> Just to clarify, what do you mean by unprivileged drivers? Is it drivers
>> implemented in guest userspace, such as with VFIO? Or unprivileged in
>> some other sense, such as needing to use bounce buffers for some reason?
>
> I had drivers in guest userspace in mind.

Great, thanks for clarifying. I don't think this flag would work for
guest userspace drivers. Should I add a note about that in the flag
definition?

>> For our use case, we don't need address translation because we set up an
>> identity mapping in the IOMMU so that the device can use guest physical
>> addresses.
>
> And can it access any guest physical address?

Sorry, I was mistaken. We do support VFIO in guests, but not for virtio
devices, only for regular PCI devices. In that case they will use
address translation.

>> If the guest kernel is concerned that an unprivileged driver could
>> jeopardize its integrity, it should not negotiate this feature flag.
>
> Unfortunately, flag negotiation is done through config space
> and so can be overwritten by the driver.

Ok, so the guest kernel has to forbid VFIO access on devices where this
flag is advertised.

> Again, ACCESS_PLATFORM has a pretty open definition. It does actually
> say it's all up to the platform.
>
> Specifically, how will VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION be
> implemented portably? virtio has no portable way to know
> whether the DMA API bypasses translation.

The fact that VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION is set
communicates that knowledge to virtio. There is a shared understanding
between the guest and the host about what this flag being set means.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-06-04 1:42 ` Michael S. Tsirkin (?) @ 2019-06-28 1:58 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-06-28 1:58 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson Michael S. Tsirkin <mst@redhat.com> writes: > On Mon, Jun 03, 2019 at 10:13:59PM -0300, Thiago Jung Bauermann wrote: >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: >> >> I rephrased it in terms of address translation. What do you think of >> >> this version? The flag name is slightly different too: >> >> >> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same >> >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, >> >> with the exception that address translation is guaranteed to be >> >> unnecessary when accessing memory addresses supplied to the device >> >> by the driver. Which is to say, the device will always use physical >> >> addresses matching addresses used by the driver (typically meaning >> >> physical addresses used by the CPU) and not translated further. This >> >> flag should be set by the guest if offered, but to allow for >> >> backward-compatibility device implementations allow for it to be >> >> left unset by the guest. It is an error to set both this flag and >> >> VIRTIO_F_ACCESS_PLATFORM. >> > >> > >> > OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unpriveledged >> > drivers. This is why devices fail when it's not negotiated. >> >> Just to clarify, what do you mean by unprivileged drivers? Is it drivers >> implemented in guest userspace such as with VFIO? 
Or unprivileged in >> some other sense such as needing to use bounce buffers for some reason? > > I had drivers in guest userspace in mind. Great. Thanks for clarifying. I don't think this flag would work for guest userspace drivers. Should I add a note about that in the flag definition? >> > This confuses me. >> > If driver is unpriveledged then what happens with this flag? >> > It can supply any address it wants. Will that corrupt kernel >> > memory? >> >> Not needing address translation doesn't necessarily mean that there's no >> IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM but there's >> always an IOMMU present. And we also support VFIO drivers. The VFIO API >> for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls >> to program the IOMMU. >> >> For our use case, we don't need address translation because we set up an >> identity mapping in the IOMMU so that the device can use guest physical >> addresses. > > And can it access any guest physical address? Sorry, I was mistaken. We do support VFIO in guests but not for virtio devices, only for regular PCI devices. In which case they will use address translation. >> If the guest kernel is concerned that an unprivileged driver could >> jeopardize its integrity it should not negotiate this feature flag. > > Unfortunately flag negotiation is done through config space > and so can be overwritten by the driver. Ok, so the guest kernel has to forbid VFIO access on devices where this flag is advertised. >> Perhaps there should be a note about this in the flag definition? This >> concern is platform-dependant though. I don't believe it's an issue in >> pseries. > > Again ACCESS_PLATFORM has a pretty open definition. It does actually > say it's all up to the platform. > > Specifically how will VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION be > implemented portably? virtio has no portable way to know > whether DMA API bypasses translation. 
The fact that VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION is set communicates that knowledge to virtio. There is a shared understanding between the guest and the host about what this flag being set means. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-06-28 1:58 ` Thiago Jung Bauermann 0 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-06-28 1:58 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Michael Roth, Jean-Philippe Brucker, Jason Wang, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Michael S. Tsirkin <mst@redhat.com> writes: > On Mon, Jun 03, 2019 at 10:13:59PM -0300, Thiago Jung Bauermann wrote: >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: >> >> I rephrased it in terms of address translation. What do you think of >> >> this version? The flag name is slightly different too: >> >> >> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same >> >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, >> >> with the exception that address translation is guaranteed to be >> >> unnecessary when accessing memory addresses supplied to the device >> >> by the driver. Which is to say, the device will always use physical >> >> addresses matching addresses used by the driver (typically meaning >> >> physical addresses used by the CPU) and not translated further. This >> >> flag should be set by the guest if offered, but to allow for >> >> backward-compatibility device implementations allow for it to be >> >> left unset by the guest. It is an error to set both this flag and >> >> VIRTIO_F_ACCESS_PLATFORM. >> > >> > >> > OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unpriveledged >> > drivers. This is why devices fail when it's not negotiated. >> >> Just to clarify, what do you mean by unprivileged drivers? Is it drivers >> implemented in guest userspace such as with VFIO? Or unprivileged in >> some other sense such as needing to use bounce buffers for some reason? 
> > I had drivers in guest userspace in mind. Great. Thanks for clarifying. I don't think this flag would work for guest userspace drivers. Should I add a note about that in the flag definition? >> > This confuses me. >> > If driver is unpriveledged then what happens with this flag? >> > It can supply any address it wants. Will that corrupt kernel >> > memory? >> >> Not needing address translation doesn't necessarily mean that there's no >> IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM but there's >> always an IOMMU present. And we also support VFIO drivers. The VFIO API >> for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls >> to program the IOMMU. >> >> For our use case, we don't need address translation because we set up an >> identity mapping in the IOMMU so that the device can use guest physical >> addresses. > > And can it access any guest physical address? Sorry, I was mistaken. We do support VFIO in guests but not for virtio devices, only for regular PCI devices. In which case they will use address translation. >> If the guest kernel is concerned that an unprivileged driver could >> jeopardize its integrity it should not negotiate this feature flag. > > Unfortunately flag negotiation is done through config space > and so can be overwritten by the driver. Ok, so the guest kernel has to forbid VFIO access on devices where this flag is advertised. >> Perhaps there should be a note about this in the flag definition? This >> concern is platform-dependant though. I don't believe it's an issue in >> pseries. > > Again ACCESS_PLATFORM has a pretty open definition. It does actually > say it's all up to the platform. > > Specifically how will VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION be > implemented portably? virtio has no portable way to know > whether DMA API bypasses translation. The fact that VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION is set communicates that knowledge to virtio. 
There is a shared understanding between the guest and the host about what this flag being set means. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-06-28 1:58 ` Thiago Jung Bauermann (?) (?) @ 2019-07-01 14:17 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-07-01 14:17 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Thu, Jun 27, 2019 at 10:58:40PM -0300, Thiago Jung Bauermann wrote: > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Mon, Jun 03, 2019 at 10:13:59PM -0300, Thiago Jung Bauermann wrote: > >> > >> > >> Michael S. Tsirkin <mst@redhat.com> writes: > >> > >> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: > >> >> I rephrased it in terms of address translation. What do you think of > >> >> this version? The flag name is slightly different too: > >> >> > >> >> > >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same > >> >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, > >> >> with the exception that address translation is guaranteed to be > >> >> unnecessary when accessing memory addresses supplied to the device > >> >> by the driver. Which is to say, the device will always use physical > >> >> addresses matching addresses used by the driver (typically meaning > >> >> physical addresses used by the CPU) and not translated further. This > >> >> flag should be set by the guest if offered, but to allow for > >> >> backward-compatibility device implementations allow for it to be > >> >> left unset by the guest. It is an error to set both this flag and > >> >> VIRTIO_F_ACCESS_PLATFORM. > >> > > >> > > >> > > >> > > >> > OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unpriveledged > >> > drivers. This is why devices fail when it's not negotiated. 
> >> > >> Just to clarify, what do you mean by unprivileged drivers? Is it drivers > >> implemented in guest userspace such as with VFIO? Or unprivileged in > >> some other sense such as needing to use bounce buffers for some reason? > > > > I had drivers in guest userspace in mind. > > Great. Thanks for clarifying. > > I don't think this flag would work for guest userspace drivers. Should I > add a note about that in the flag definition? I think you need to clarify access protection rules. Is it only translation that is bypassed or is any platform-specific protection mechanism bypassed too? > >> > This confuses me. > >> > If driver is unpriveledged then what happens with this flag? > >> > It can supply any address it wants. Will that corrupt kernel > >> > memory? > >> > >> Not needing address translation doesn't necessarily mean that there's no > >> IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM but there's > >> always an IOMMU present. And we also support VFIO drivers. The VFIO API > >> for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls > >> to program the IOMMU. > >> > >> For our use case, we don't need address translation because we set up an > >> identity mapping in the IOMMU so that the device can use guest physical > >> addresses. OK so I think I am beginning to see it in a different light. Right now the specific platform creates an identity mapping. That in turn means DMA API can be fast - it does not need to do anything. What you are looking for is a way to tell host it's an identity mapping - just as an optimization. Is that right? So this is what I would call this option: VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS and the explanation should state that all device addresses are translated by the platform to identical addresses. In fact this option then becomes more, not less restrictive than VIRTIO_F_ACCESS_PLATFORM - it's a promise by guest to only create identity mappings, and only before driver_ok is set. 
This option then would always be negotiated together with VIRTIO_F_ACCESS_PLATFORM. Host then must verify that 1. full 1:1 mappings are created before driver_ok or can we make sure this happens before features_ok? that would be ideal as we could require that features_ok fails 2. mappings are not modified between driver_ok and reset i guess attempts to change them will fail - possibly by causing a guest crash or some other kind of platform-specific error So far so good, but now a question: how are we handling guest address width limitations? Is VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS subject to guest address width limitations? I am guessing we can make them so ... This needs to be documented. > > > > And can it access any guest physical address? > > Sorry, I was mistaken. We do support VFIO in guests but not for virtio > devices, only for regular PCI devices. In which case they will use > address translation. Not sure how this answers the question. > >> If the guest kernel is concerned that an unprivileged driver could > >> jeopardize its integrity it should not negotiate this feature flag. > > > > Unfortunately flag negotiation is done through config space > > and so can be overwritten by the driver. > > Ok, so the guest kernel has to forbid VFIO access on devices where this > flag is advertised. That's possible in theory but in practice we did not yet teach VFIO not to attach to legacy devices without VIRTIO_F_ACCESS_PLATFORM. So all security relies on host denying driver_ok without VIRTIO_F_ACCESS_PLATFORM. New options that bypass guest security are thus tricky as they can create security holes for existing guests. I'm open to ideas about how to do this in a safe way, > >> Perhaps there should be a note about this in the flag definition? This > >> concern is platform-dependant though. I don't believe it's an issue in > >> pseries. > > > > Again ACCESS_PLATFORM has a pretty open definition. It does actually > > say it's all up to the platform. 
> > > > Specifically how will VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION be > > implemented portably? virtio has no portable way to know > > whether DMA API bypasses translation. > > The fact that VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION is set > communicates that knowledge to virtio. There is a shared understanding > between the guest and the host about what this flag being set means. Right but I wonder how are you going to *actually* implement it on Linux? Are you adding a new set of DMA APIs that do everything except translation? > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
  2019-07-01 14:17 ` Michael S. Tsirkin
@ 2019-07-14  5:51 ` Thiago Jung Bauermann
  0 siblings, 0 replies; 198+ messages in thread
From: Thiago Jung Bauermann @ 2019-07-14 5:51 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt,
    Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization,
    Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson

Michael S. Tsirkin <mst@redhat.com> writes:

> On Thu, Jun 27, 2019 at 10:58:40PM -0300, Thiago Jung Bauermann wrote:
>>
>> Michael S. Tsirkin <mst@redhat.com> writes:
>>
>> > On Mon, Jun 03, 2019 at 10:13:59PM -0300, Thiago Jung Bauermann wrote:
>> >>
>> >> Michael S. Tsirkin <mst@redhat.com> writes:
>> >>
>> >> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote:
>> >> >> I rephrased it in terms of address translation. What do you think of
>> >> >> this version? The flag name is slightly different too:
>> >> >>
>> >> >>
>> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same
>> >> >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set,
>> >> >> with the exception that address translation is guaranteed to be
>> >> >> unnecessary when accessing memory addresses supplied to the device
>> >> >> by the driver. Which is to say, the device will always use physical
>> >> >> addresses matching addresses used by the driver (typically meaning
>> >> >> physical addresses used by the CPU) and not translated further. This
>> >> >> flag should be set by the guest if offered, but to allow for
>> >> >> backward-compatibility device implementations allow for it to be
>> >> >> left unset by the guest. It is an error to set both this flag and
>> >> >> VIRTIO_F_ACCESS_PLATFORM.
>> >> >
>> >> > OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unpriveledged
>> >> > drivers. This is why devices fail when it's not negotiated.
>> >>
>> >> Just to clarify, what do you mean by unprivileged drivers? Is it drivers
>> >> implemented in guest userspace such as with VFIO? Or unprivileged in
>> >> some other sense such as needing to use bounce buffers for some reason?
>> >
>> > I had drivers in guest userspace in mind.
>>
>> Great. Thanks for clarifying.
>>
>> I don't think this flag would work for guest userspace drivers. Should I
>> add a note about that in the flag definition?
>
> I think you need to clarify access protection rules. Is it only
> translation that is bypassed or is any platform-specific
> protection mechanism bypassed too?

It is only translation. In a secure guest, if the device tries to access
a memory address that wasn't provided by the driver then the
architecture will deny that access. If the device accesses addresses
provided to it by the driver, then there's no protection mechanism or
translation to get in the way.

>> >> > This confuses me.
>> >> > If driver is unpriveledged then what happens with this flag?
>> >> > It can supply any address it wants. Will that corrupt kernel
>> >> > memory?
>> >>
>> >> Not needing address translation doesn't necessarily mean that there's no
>> >> IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM but there's
>> >> always an IOMMU present. And we also support VFIO drivers. The VFIO API
>> >> for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls
>> >> to program the IOMMU.
>> >>
>> >> For our use case, we don't need address translation because we set up an
>> >> identity mapping in the IOMMU so that the device can use guest physical
>> >> addresses.
>
> OK so I think I am beginning to see it in a different light. Right now
> the specific platform creates an identity mapping. That in turn means
> DMA API can be fast - it does not need to do anything. What you are
> looking for is a way to tell host it's an identity mapping - just as an
> optimization.
>
> Is that right?

Almost. Theoretically it is just an optimization. But in practice the
pseries boot firmware (SLOF) doesn't support IOMMU_PLATFORM so it's not
possible to boot a guest from a device with that flag set.

> So this is what I would call this option:
>
> VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS
>
> and the explanation should state that all device
> addresses are translated by the platform to identical
> addresses.
>
> In fact this option then becomes more, not less restrictive
> than VIRTIO_F_ACCESS_PLATFORM - it's a promise
> by guest to only create identity mappings,
> and only before driver_ok is set.
> This option then would always be negotiated together with
> VIRTIO_F_ACCESS_PLATFORM.
>
> Host then must verify that
> 1. full 1:1 mappings are created before driver_ok
>    or can we make sure this happens before features_ok?
>    that would be ideal as we could require that features_ok fails
> 2. mappings are not modified between driver_ok and reset
>    i guess attempts to change them will fail -
>    possibly by causing a guest crash
>    or some other kind of platform-specific error

I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring
it to be accompanied by ACCESS_PLATFORM can be a problem. One reason is
SLOF as I mentioned above, another is that we would be requiring all
guests running on the machine (secure guests or not, since we would use
the same configuration for all guests) to support it. But
ACCESS_PLATFORM is relatively recent so it's a bit early for that. For
instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about
it and wouldn't be able to use the device.

> So far so good, but now a question:
>
> how are we handling guest address width limitations?
> Is VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS subject to
> guest address width limitations?
> I am guessing we can make them so ...
> This needs to be documented.

I'm not sure. I will get back to you on this.

>> > And can it access any guest physical address?
>>
>> Sorry, I was mistaken. We do support VFIO in guests but not for virtio
>> devices, only for regular PCI devices. In which case they will use
>> address translation.
>
> Not sure how this answers the question.

Because I had said that we had VFIO virtio drivers, you asked:

> >> > This confuses me.
> >> > If driver is unpriveledged then what happens with this flag?
> >> > It can supply any address it wants. Will that corrupt kernel
> >> > memory?

Since we can't actually have VFIO virtio drivers, there's nothing to
corrupt the kernel memory.

>> >> If the guest kernel is concerned that an unprivileged driver could
>> >> jeopardize its integrity it should not negotiate this feature flag.
>> >
>> > Unfortunately flag negotiation is done through config space
>> > and so can be overwritten by the driver.
>>
>> Ok, so the guest kernel has to forbid VFIO access on devices where this
>> flag is advertised.
>
> That's possible in theory but in practice we did not yet teach VFIO not
> to attach to legacy devices without VIRTIO_F_ACCESS_PLATFORM. So all
> security relies on host denying driver_ok without
> VIRTIO_F_ACCESS_PLATFORM. New options that bypass guest security are
> thus tricky as they can create security holes for existing guests.
> I'm open to ideas about how to do this in a safe way,

If the new flag isn't coupled with ACCESS_PLATFORM then the existing
mechanism of the host denying driver_ok when ACCESS_PLATFORM isn't set
will be enough.

>> >> Perhaps there should be a note about this in the flag definition? This
>> >> concern is platform-dependant though. I don't believe it's an issue in
>> >> pseries.
>> >
>> > Again ACCESS_PLATFORM has a pretty open definition. It does actually
>> > say it's all up to the platform.
>> >
>> > Specifically how will VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION be
>> > implemented portably? virtio has no portable way to know
>> > whether DMA API bypasses translation.
>>
>> The fact that VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION is set
>> communicates that knowledge to virtio. There is a shared understanding
>> between the guest and the host about what this flag being set means.
>
> Right but I wonder how are you going to *actually* implement it on Linux?
> Are you adding a new set of DMA APIs that do everything except
> translation?

Actually it's the opposite. There's nothing to do in the guest besides
setting up SWIOTLB and sharing its buffer with the host.

Normally on pseries, devices use the dma_iommu_ops defined in
arch/powerpc/kernel/dma-iommu.c. I have a patch which changes the
device's dma_ops to NULL so that the default DMA path will be used:

https://lore.kernel.org/linuxppc-dev/20190713060023.8479-12-bauerman@linux.ibm.com/

Then another patch forces use of SWIOTLB and defines the
set_memory_{encrypted,decrypted} functions so that SWIOTLB can make its
buffer be shared with the host:

https://lore.kernel.org/linuxppc-dev/20190713060023.8479-13-bauerman@linux.ibm.com/

--
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-07-01 14:17 ` Michael S. Tsirkin (?) @ 2019-07-14 5:51 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-07-14 5:51 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson Michael S. Tsirkin <mst@redhat.com> writes: > On Thu, Jun 27, 2019 at 10:58:40PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Mon, Jun 03, 2019 at 10:13:59PM -0300, Thiago Jung Bauermann wrote: >> >> >> >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> >> >> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: >> >> >> I rephrased it in terms of address translation. What do you think of >> >> >> this version? The flag name is slightly different too: >> >> >> >> >> >> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same >> >> >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, >> >> >> with the exception that address translation is guaranteed to be >> >> >> unnecessary when accessing memory addresses supplied to the device >> >> >> by the driver. Which is to say, the device will always use physical >> >> >> addresses matching addresses used by the driver (typically meaning >> >> >> physical addresses used by the CPU) and not translated further. This >> >> >> flag should be set by the guest if offered, but to allow for >> >> >> backward-compatibility device implementations allow for it to be >> >> >> left unset by the guest. It is an error to set both this flag and >> >> >> VIRTIO_F_ACCESS_PLATFORM. >> >> > >> >> > >> >> > >> >> > >> >> > OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unpriveledged >> >> > drivers. 
This is why devices fail when it's not negotiated. >> >> >> >> Just to clarify, what do you mean by unprivileged drivers? Is it drivers >> >> implemented in guest userspace such as with VFIO? Or unprivileged in >> >> some other sense such as needing to use bounce buffers for some reason? >> > >> > I had drivers in guest userspace in mind. >> >> Great. Thanks for clarifying. >> >> I don't think this flag would work for guest userspace drivers. Should I >> add a note about that in the flag definition? > > I think you need to clarify access protection rules. Is it only > translation that is bypassed or is any platform-specific > protection mechanism bypassed too? It is only translation. In a secure guest, if the device tries to access a memory address that wasn't provided by the driver then the architecture will deny that access. If the device accesses addresses provided to it by the driver, then there's no protection mechanism or translation to get in the way. >> >> > This confuses me. >> >> > If driver is unpriveledged then what happens with this flag? >> >> > It can supply any address it wants. Will that corrupt kernel >> >> > memory? >> >> >> >> Not needing address translation doesn't necessarily mean that there's no >> >> IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM but there's >> >> always an IOMMU present. And we also support VFIO drivers. The VFIO API >> >> for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls >> >> to program the IOMMU. >> >> >> >> For our use case, we don't need address translation because we set up an >> >> identity mapping in the IOMMU so that the device can use guest physical >> >> addresses. > > OK so I think I am beginning to see it in a different light. Right now the specific > platform creates an identity mapping. That in turn means DMA API can be > fast - it does not need to do anything. What you are looking for is a > way to tell host it's an identity mapping - just as an optimization. > > Is that right? 
Almost. Theoretically it is just an optimization. But in practice the pseries boot firmware (SLOF) doesn't support IOMMU_PLATFORM so it's not possible to boot a guest from a device with that flag set. > So this is what I would call this option: > > VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS > > and the explanation should state that all device > addresses are translated by the platform to identical > addresses. > > In fact this option then becomes more, not less restrictive > than VIRTIO_F_ACCESS_PLATFORM - it's a promise > by guest to only create identity mappings, > and only before driver_ok is set. > This option then would always be negotiated together with > VIRTIO_F_ACCESS_PLATFORM. > > Host then must verify that > 1. full 1:1 mappings are created before driver_ok > or can we make sure this happens before features_ok? > that would be ideal as we could require that features_ok fails > 2. mappings are not modified between driver_ok and reset > i guess attempts to change them will fail - > possibly by causing a guest crash > or some other kind of platform-specific error I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring it to be accompanied by ACCESS_PLATFORM can be a problem. One reason is SLOF as I mentioned above, another is that we would be requiring all guests running on the machine (secure guests or not, since we would use the same configuration for all guests) to support it. But ACCESS_PLATFORM is relatively recent so it's a bit early for that. For instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about it and wouldn't be able to use the device. > So far so good, but now a question: > > how are we handling guest address width limitations? > Is VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS subject to > guest address width limitations? > I am guessing we can make them so ... > This needs to be documented. I'm not sure. I will get back to you on this. >> > And can it access any guest physical address? 
>> >> Sorry, I was mistaken. We do support VFIO in guests but not for virtio >> devices, only for regular PCI devices. In which case they will use >> address translation. > > Not sure how this answers the question. Because I had said that we had VFIO virtio drivers, you asked: > >> > This confuses me. > >> > If driver is unpriveledged then what happens with this flag? > >> > It can supply any address it wants. Will that corrupt kernel > >> > memory? Since we can't actually have VFIO virtio drivers, there's nothing to corrupt the kernel memory. >> >> If the guest kernel is concerned that an unprivileged driver could >> >> jeopardize its integrity it should not negotiate this feature flag. >> > >> > Unfortunately flag negotiation is done through config space >> > and so can be overwritten by the driver. >> >> Ok, so the guest kernel has to forbid VFIO access on devices where this >> flag is advertised. > > That's possible in theory but in practice we did not yet teach VFIO not > to attach to legacy devices without VIRTIO_F_ACCESS_PLATFORM. So all > security relies on host denying driver_ok without > VIRTIO_F_ACCESS_PLATFORM. New options that bypass guest security are > thus tricky as they can create security holes for existing guests. > I'm open to ideas about how to do this in a safe way, If the new flag isn't coupled with ACCESS_PLATFORM then the existing mechanism of the host denying driver_ok when ACCESS_PLATFORM isn't set will be enough. >> >> Perhaps there should be a note about this in the flag definition? This >> >> concern is platform-dependant though. I don't believe it's an issue in >> >> pseries. >> > >> > Again ACCESS_PLATFORM has a pretty open definition. It does actually >> > say it's all up to the platform. >> > >> > Specifically how will VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION be >> > implemented portably? virtio has no portable way to know >> > whether DMA API bypasses translation. 
>> >> The fact that VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION is set >> communicates that knowledge to virtio. There is a shared understanding >> between the guest and the host about what this flag being set means. > > Right but I wonder how are you going to *actually* implement it on Linux? > Are you adding a new set of DMA APIs that do everything except > translation? Actually it's the opposite. There's nothing to do in the guest besides setting up SWIOTLB and sharing its buffer with the host. Normally on pseries, devices use the dma_iommu_ops defined in arch/powerpc/kernel/dma-iommu.c. I have a patch which changes the device's dma_ops to NULL so that the default DMA path will be used: https://lore.kernel.org/linuxppc-dev/20190713060023.8479-12-bauerman@linux.ibm.com/ Then another patch forces use of SWIOTLB and defines the set_memory_{encrypted,decrypted} functions so that SWIOTLB can make its buffer be shared with the host: https://lore.kernel.org/linuxppc-dev/20190713060023.8479-13-bauerman@linux.ibm.com/ -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-07-14 5:51 ` Thiago Jung Bauermann 0 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-07-14 5:51 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Michael Roth, Jean-Philippe Brucker, Benjamin Herrenschmidt, Jason Wang, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Michael S. Tsirkin <mst@redhat.com> writes: > On Thu, Jun 27, 2019 at 10:58:40PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Mon, Jun 03, 2019 at 10:13:59PM -0300, Thiago Jung Bauermann wrote: >> >> >> >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> >> >> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: >> >> >> I rephrased it in terms of address translation. What do you think of >> >> >> this version? The flag name is slightly different too: >> >> >> >> >> >> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same >> >> >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, >> >> >> with the exception that address translation is guaranteed to be >> >> >> unnecessary when accessing memory addresses supplied to the device >> >> >> by the driver. Which is to say, the device will always use physical >> >> >> addresses matching addresses used by the driver (typically meaning >> >> >> physical addresses used by the CPU) and not translated further. This >> >> >> flag should be set by the guest if offered, but to allow for >> >> >> backward-compatibility device implementations allow for it to be >> >> >> left unset by the guest. It is an error to set both this flag and >> >> >> VIRTIO_F_ACCESS_PLATFORM. >> >> > >> >> > >> >> > >> >> > >> >> > OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unpriveledged >> >> > drivers. This is why devices fail when it's not negotiated. 
>> >> >> >> Just to clarify, what do you mean by unprivileged drivers? Is it drivers >> >> implemented in guest userspace such as with VFIO? Or unprivileged in >> >> some other sense such as needing to use bounce buffers for some reason? >> > >> > I had drivers in guest userspace in mind. >> >> Great. Thanks for clarifying. >> >> I don't think this flag would work for guest userspace drivers. Should I >> add a note about that in the flag definition? > > I think you need to clarify access protection rules. Is it only > translation that is bypassed or is any platform-specific > protection mechanism bypassed too? It is only translation. In a secure guest, if the device tries to access a memory address that wasn't provided by the driver then the architecture will deny that access. If the device accesses addresses provided to it by the driver, then there's no protection mechanism or translation to get in the way. >> >> > This confuses me. >> >> > If driver is unpriveledged then what happens with this flag? >> >> > It can supply any address it wants. Will that corrupt kernel >> >> > memory? >> >> >> >> Not needing address translation doesn't necessarily mean that there's no >> >> IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM but there's >> >> always an IOMMU present. And we also support VFIO drivers. The VFIO API >> >> for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls >> >> to program the IOMMU. >> >> >> >> For our use case, we don't need address translation because we set up an >> >> identity mapping in the IOMMU so that the device can use guest physical >> >> addresses. > > OK so I think I am beginning to see it in a different light. Right now the specific > platform creates an identity mapping. That in turn means DMA API can be > fast - it does not need to do anything. What you are looking for is a > way to tell host it's an identity mapping - just as an optimization. > > Is that right? Almost. Theoretically it is just an optimization. 
But in practice the pseries boot firmware (SLOF) doesn't support IOMMU_PLATFORM so it's not possible to boot a guest from a device with that flag set. > So this is what I would call this option: > > VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS > > and the explanation should state that all device > addresses are translated by the platform to identical > addresses. > > In fact this option then becomes more, not less restrictive > than VIRTIO_F_ACCESS_PLATFORM - it's a promise > by guest to only create identity mappings, > and only before driver_ok is set. > This option then would always be negotiated together with > VIRTIO_F_ACCESS_PLATFORM. > > Host then must verify that > 1. full 1:1 mappings are created before driver_ok > or can we make sure this happens before features_ok? > that would be ideal as we could require that features_ok fails > 2. mappings are not modified between driver_ok and reset > i guess attempts to change them will fail - > possibly by causing a guest crash > or some other kind of platform-specific error I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring it to be accompanied by ACCESS_PLATFORM can be a problem. One reason is SLOF as I mentioned above, another is that we would be requiring all guests running on the machine (secure guests or not, since we would use the same configuration for all guests) to support it. But ACCESS_PLATFORM is relatively recent so it's a bit early for that. For instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about it and wouldn't be able to use the device. > So far so good, but now a question: > > how are we handling guest address width limitations? > Is VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS subject to > guest address width limitations? > I am guessing we can make them so ... > This needs to be documented. I'm not sure. I will get back to you on this. >> > And can it access any guest physical address? >> >> Sorry, I was mistaken. 
We do support VFIO in guests but not for virtio >> devices, only for regular PCI devices. In which case they will use >> address translation. > > Not sure how this answers the question. Because I had said that we had VFIO virtio drivers, you asked: > >> > This confuses me. > >> > If driver is unpriveledged then what happens with this flag? > >> > It can supply any address it wants. Will that corrupt kernel > >> > memory? Since we can't actually have VFIO virtio drivers, there's nothing to corrupt the kernel memory. >> >> If the guest kernel is concerned that an unprivileged driver could >> >> jeopardize its integrity it should not negotiate this feature flag. >> > >> > Unfortunately flag negotiation is done through config space >> > and so can be overwritten by the driver. >> >> Ok, so the guest kernel has to forbid VFIO access on devices where this >> flag is advertised. > > That's possible in theory but in practice we did not yet teach VFIO not > to attach to legacy devices without VIRTIO_F_ACCESS_PLATFORM. So all > security relies on host denying driver_ok without > VIRTIO_F_ACCESS_PLATFORM. New options that bypass guest security are > thus tricky as they can create security holes for existing guests. > I'm open to ideas about how to do this in a safe way, If the new flag isn't coupled with ACCESS_PLATFORM then the existing mechanism of the host denying driver_ok when ACCESS_PLATFORM isn't set will be enough. >> >> Perhaps there should be a note about this in the flag definition? This >> >> concern is platform-dependant though. I don't believe it's an issue in >> >> pseries. >> > >> > Again ACCESS_PLATFORM has a pretty open definition. It does actually >> > say it's all up to the platform. >> > >> > Specifically how will VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION be >> > implemented portably? virtio has no portable way to know >> > whether DMA API bypasses translation. 
>> >> The fact that VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION is set >> communicates that knowledge to virtio. There is a shared understanding >> between the guest and the host about what this flag being set means. > > Right but I wonder how are you going to *actually* implement it on Linux? > Are you adding a new set of DMA APIs that do everything except > translation? Actually it's the opposite. There's nothing to do in the guest besides setting up SWIOTLB and sharing its buffer with the host. Normally on pseries, devices use the dma_iommu_ops defined in arch/powerpc/kernel/dma-iommu.c. I have a patch which changes the device's dma_ops to NULL so that the default DMA path will be used: https://lore.kernel.org/linuxppc-dev/20190713060023.8479-12-bauerman@linux.ibm.com/ Then another patch forces use of SWIOTLB and defines the set_memory_{encrypted,decrypted} functions so that SWIOTLB can make its buffer be shared with the host: https://lore.kernel.org/linuxppc-dev/20190713060023.8479-13-bauerman@linux.ibm.com/ -- Thiago Jung Bauermann IBM Linux Technology Center _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-07-14 5:51 ` Thiago Jung Bauermann 0 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-07-14 5:51 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Michael Roth, Jean-Philippe Brucker, Jason Wang, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Michael S. Tsirkin <mst@redhat.com> writes: > On Thu, Jun 27, 2019 at 10:58:40PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Mon, Jun 03, 2019 at 10:13:59PM -0300, Thiago Jung Bauermann wrote: >> >> >> >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> >> >> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: >> >> >> I rephrased it in terms of address translation. What do you think of >> >> >> this version? The flag name is slightly different too: >> >> >> >> >> >> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same >> >> >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, >> >> >> with the exception that address translation is guaranteed to be >> >> >> unnecessary when accessing memory addresses supplied to the device >> >> >> by the driver. Which is to say, the device will always use physical >> >> >> addresses matching addresses used by the driver (typically meaning >> >> >> physical addresses used by the CPU) and not translated further. This >> >> >> flag should be set by the guest if offered, but to allow for >> >> >> backward-compatibility device implementations allow for it to be >> >> >> left unset by the guest. It is an error to set both this flag and >> >> >> VIRTIO_F_ACCESS_PLATFORM. >> >> > >> >> > >> >> > >> >> > >> >> > OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unpriveledged >> >> > drivers. This is why devices fail when it's not negotiated. 
>> >> >> >> Just to clarify, what do you mean by unprivileged drivers? Is it drivers >> >> implemented in guest userspace such as with VFIO? Or unprivileged in >> >> some other sense such as needing to use bounce buffers for some reason? >> > >> > I had drivers in guest userspace in mind. >> >> Great. Thanks for clarifying. >> >> I don't think this flag would work for guest userspace drivers. Should I >> add a note about that in the flag definition? > > I think you need to clarify access protection rules. Is it only > translation that is bypassed or is any platform-specific > protection mechanism bypassed too? It is only translation. In a secure guest, if the device tries to access a memory address that wasn't provided by the driver then the architecture will deny that access. If the device accesses addresses provided to it by the driver, then there's no protection mechanism or translation to get in the way. >> >> > This confuses me. >> >> > If driver is unpriveledged then what happens with this flag? >> >> > It can supply any address it wants. Will that corrupt kernel >> >> > memory? >> >> >> >> Not needing address translation doesn't necessarily mean that there's no >> >> IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM but there's >> >> always an IOMMU present. And we also support VFIO drivers. The VFIO API >> >> for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls >> >> to program the IOMMU. >> >> >> >> For our use case, we don't need address translation because we set up an >> >> identity mapping in the IOMMU so that the device can use guest physical >> >> addresses. > > OK so I think I am beginning to see it in a different light. Right now the specific > platform creates an identity mapping. That in turn means DMA API can be > fast - it does not need to do anything. What you are looking for is a > way to tell host it's an identity mapping - just as an optimization. > > Is that right? Almost. Theoretically it is just an optimization. 
But in practice the pseries boot firmware (SLOF) doesn't support IOMMU_PLATFORM so it's not possible to boot a guest from a device with that flag set. > So this is what I would call this option: > > VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS > > and the explanation should state that all device > addresses are translated by the platform to identical > addresses. > > In fact this option then becomes more, not less restrictive > than VIRTIO_F_ACCESS_PLATFORM - it's a promise > by guest to only create identity mappings, > and only before driver_ok is set. > This option then would always be negotiated together with > VIRTIO_F_ACCESS_PLATFORM. > > Host then must verify that > 1. full 1:1 mappings are created before driver_ok > or can we make sure this happens before features_ok? > that would be ideal as we could require that features_ok fails > 2. mappings are not modified between driver_ok and reset > i guess attempts to change them will fail - > possibly by causing a guest crash > or some other kind of platform-specific error I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring it to be accompanied by ACCESS_PLATFORM can be a problem. One reason is SLOF as I mentioned above, another is that we would be requiring all guests running on the machine (secure guests or not, since we would use the same configuration for all guests) to support it. But ACCESS_PLATFORM is relatively recent so it's a bit early for that. For instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about it and wouldn't be able to use the device. > So far so good, but now a question: > > how are we handling guest address width limitations? > Is VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS subject to > guest address width limitations? > I am guessing we can make them so ... > This needs to be documented. I'm not sure. I will get back to you on this. >> > And can it access any guest physical address? >> >> Sorry, I was mistaken. 
We do support VFIO in guests but not for virtio >> devices, only for regular PCI devices. In which case they will use >> address translation. > > Not sure how this answers the question. Because I had said that we had VFIO virtio drivers, you asked: > >> > This confuses me. > >> > If driver is unprivileged then what happens with this flag? > >> > It can supply any address it wants. Will that corrupt kernel > >> > memory? Since we can't actually have VFIO virtio drivers, there's nothing to corrupt the kernel memory. >> >> If the guest kernel is concerned that an unprivileged driver could >> >> jeopardize its integrity it should not negotiate this feature flag. >> > >> > Unfortunately flag negotiation is done through config space >> > and so can be overwritten by the driver. >> >> Ok, so the guest kernel has to forbid VFIO access on devices where this >> flag is advertised. > > That's possible in theory but in practice we did not yet teach VFIO not > to attach to legacy devices without VIRTIO_F_ACCESS_PLATFORM. So all > security relies on host denying driver_ok without > VIRTIO_F_ACCESS_PLATFORM. New options that bypass guest security are > thus tricky as they can create security holes for existing guests. > I'm open to ideas about how to do this in a safe way, If the new flag isn't coupled with ACCESS_PLATFORM then the existing mechanism of the host denying driver_ok when ACCESS_PLATFORM isn't set will be enough. >> >> Perhaps there should be a note about this in the flag definition? This >> >> concern is platform-dependent though. I don't believe it's an issue in >> >> pseries. >> > >> > Again ACCESS_PLATFORM has a pretty open definition. It does actually >> > say it's all up to the platform. >> > >> > Specifically how will VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION be >> > implemented portably? virtio has no portable way to know >> > whether DMA API bypasses translation.
>> >> The fact that VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION is set >> communicates that knowledge to virtio. There is a shared understanding >> between the guest and the host about what this flag being set means. > > Right but I wonder how are you going to *actually* implement it on Linux? > Are you adding a new set of DMA APIs that do everything except > translation? Actually it's the opposite. There's nothing to do in the guest besides setting up SWIOTLB and sharing its buffer with the host. Normally on pseries, devices use the dma_iommu_ops defined in arch/powerpc/kernel/dma-iommu.c. I have a patch which changes the device's dma_ops to NULL so that the default DMA path will be used: https://lore.kernel.org/linuxppc-dev/20190713060023.8479-12-bauerman@linux.ibm.com/ Then another patch forces use of SWIOTLB and defines the set_memory_{encrypted,decrypted} functions so that SWIOTLB can make its buffer be shared with the host: https://lore.kernel.org/linuxppc-dev/20190713060023.8479-13-bauerman@linux.ibm.com/ -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
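[Editorial note: the "dma_ops set to NULL" mechanism described in the message above relies on the kernel's DMA dispatch falling through to the direct path (which for a secure guest bounces through SWIOTLB) when a device has no ops table. A rough user-space model of that dispatch follows; the structures and function names are simplified stand-ins for the kernel's `struct dma_map_ops` and `get_dma_ops()`, not the real implementation.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;

/* Simplified stand-in for the kernel's per-device DMA ops table
 * (e.g. dma_iommu_ops on pseries). */
struct dma_map_ops {
    dma_addr_t (*map)(void *vaddr);
};

struct device {
    const struct dma_map_ops *dma_ops; /* NULL => dma-direct path */
};

/* Placeholder for the direct path: identity-map the address
 * (the real path may bounce through the SWIOTLB buffer first). */
static dma_addr_t dma_direct_map(void *vaddr)
{
    return (dma_addr_t)(uintptr_t)vaddr;
}

/* Model of the dispatch: a non-NULL ops table takes precedence;
 * setting dma_ops to NULL makes every mapping go direct. */
static dma_addr_t model_dma_map(struct device *dev, void *vaddr)
{
    if (dev->dma_ops && dev->dma_ops->map)
        return dev->dma_ops->map(vaddr);
    return dma_direct_map(vaddr);
}
```

This is why the patch series needs no new DMA API on the guest side: clearing the device's ops pointer is enough to route all mappings through the default path.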
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-07-14 5:51 ` Thiago Jung Bauermann (?) (?) @ 2019-07-15 14:35 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-07-15 14:35 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Sun, Jul 14, 2019 at 02:51:18AM -0300, Thiago Jung Bauermann wrote: > > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Thu, Jun 27, 2019 at 10:58:40PM -0300, Thiago Jung Bauermann wrote: > >> > >> Michael S. Tsirkin <mst@redhat.com> writes: > >> > >> > On Mon, Jun 03, 2019 at 10:13:59PM -0300, Thiago Jung Bauermann wrote: > >> >> > >> >> > >> >> Michael S. Tsirkin <mst@redhat.com> writes: > >> >> > >> >> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: > >> >> >> I rephrased it in terms of address translation. What do you think of > >> >> >> this version? The flag name is slightly different too: > >> >> >> > >> >> >> > >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same > >> >> >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, > >> >> >> with the exception that address translation is guaranteed to be > >> >> >> unnecessary when accessing memory addresses supplied to the device > >> >> >> by the driver. Which is to say, the device will always use physical > >> >> >> addresses matching addresses used by the driver (typically meaning > >> >> >> physical addresses used by the CPU) and not translated further. This > >> >> >> flag should be set by the guest if offered, but to allow for > >> >> >> backward-compatibility device implementations allow for it to be > >> >> >> left unset by the guest. It is an error to set both this flag and > >> >> >> VIRTIO_F_ACCESS_PLATFORM. 
> >> >> > > >> >> > > >> >> > > >> >> > > >> >> > OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unpriveledged > >> >> > drivers. This is why devices fail when it's not negotiated. > >> >> > >> >> Just to clarify, what do you mean by unprivileged drivers? Is it drivers > >> >> implemented in guest userspace such as with VFIO? Or unprivileged in > >> >> some other sense such as needing to use bounce buffers for some reason? > >> > > >> > I had drivers in guest userspace in mind. > >> > >> Great. Thanks for clarifying. > >> > >> I don't think this flag would work for guest userspace drivers. Should I > >> add a note about that in the flag definition? > > > > I think you need to clarify access protection rules. Is it only > > translation that is bypassed or is any platform-specific > > protection mechanism bypassed too? > > It is only translation. In a secure guest, if the device tries to access > a memory address that wasn't provided by the driver then the > architecture will deny that access. If the device accesses addresses > provided to it by the driver, then there's no protection mechanism or > translation to get in the way. > > >> >> > This confuses me. > >> >> > If driver is unpriveledged then what happens with this flag? > >> >> > It can supply any address it wants. Will that corrupt kernel > >> >> > memory? > >> >> > >> >> Not needing address translation doesn't necessarily mean that there's no > >> >> IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM but there's > >> >> always an IOMMU present. And we also support VFIO drivers. The VFIO API > >> >> for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls > >> >> to program the IOMMU. > >> >> > >> >> For our use case, we don't need address translation because we set up an > >> >> identity mapping in the IOMMU so that the device can use guest physical > >> >> addresses. > > > > OK so I think I am beginning to see it in a different light. 
Right now the specific > > platform creates an identity mapping. That in turn means DMA API can be > > fast - it does not need to do anything. What you are looking for is a > > way to tell host it's an identity mapping - just as an optimization. > > > > Is that right? > > Almost. Theoretically it is just an optimization. But in practice the > pseries boot firmware (SLOF) doesn't support IOMMU_PLATFORM so it's not > possible to boot a guest from a device with that flag set. > > > So this is what I would call this option: > > > > VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS > > > > and the explanation should state that all device > > addresses are translated by the platform to identical > > addresses. > > > > In fact this option then becomes more, not less restrictive > > than VIRTIO_F_ACCESS_PLATFORM - it's a promise > > by guest to only create identity mappings, > > and only before driver_ok is set. > > This option then would always be negotiated together with > > VIRTIO_F_ACCESS_PLATFORM. > > > > Host then must verify that > > 1. full 1:1 mappings are created before driver_ok > > or can we make sure this happens before features_ok? > > that would be ideal as we could require that features_ok fails > > 2. mappings are not modified between driver_ok and reset > > i guess attempts to change them will fail - > > possibly by causing a guest crash > > or some other kind of platform-specific error > > I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring > it to be accompanied by ACCESS_PLATFORM can be a problem. One reason is > SLOF as I mentioned above, another is that we would be requiring all > guests running on the machine (secure guests or not, since we would use > the same configuration for all guests) to support it. But > ACCESS_PLATFORM is relatively recent so it's a bit early for that. For > instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about > it and wouldn't be able to use the device. 
OK and your target is to enable use with kernel drivers within guests, right? My question is, we are defining a new flag here, I guess old guests then do not set it. How does it help old guests? Or maybe it's not designed to ... > > So far so good, but now a question: > > > > how are we handling guest address width limitations? > > Is VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS subject to > > guest address width limitations? > > I am guessing we can make them so ... > > This needs to be documented. > > I'm not sure. I will get back to you on this. > > >> > And can it access any guest physical address? > >> > >> Sorry, I was mistaken. We do support VFIO in guests but not for virtio > >> devices, only for regular PCI devices. In which case they will use > >> address translation. > > > > Not sure how this answers the question. > > Because I had said that we had VFIO virtio drivers, you asked: > > > >> > This confuses me. > > >> > If driver is unpriveledged then what happens with this flag? > > >> > It can supply any address it wants. Will that corrupt kernel > > >> > memory? > > Since we can't actually have VFIO virtio drivers, there's nothing to > corrupt the kernel memory. > > >> >> If the guest kernel is concerned that an unprivileged driver could > >> >> jeopardize its integrity it should not negotiate this feature flag. > >> > > >> > Unfortunately flag negotiation is done through config space > >> > and so can be overwritten by the driver. > >> > >> Ok, so the guest kernel has to forbid VFIO access on devices where this > >> flag is advertised. > > > > That's possible in theory but in practice we did not yet teach VFIO not > > to attach to legacy devices without VIRTIO_F_ACCESS_PLATFORM. So all > > security relies on host denying driver_ok without > > VIRTIO_F_ACCESS_PLATFORM. New options that bypass guest security are > > thus tricky as they can create security holes for existing guests. 
> > I'm open to ideas about how to do this in a safe way, > > If the new flag isn't coupled with ACCESS_PLATFORM then the existing > mechanism of the host denying driver_ok when ACCESS_PLATFORM isn't set > will be enough. > > >> >> Perhaps there should be a note about this in the flag definition? This > >> >> concern is platform-dependant though. I don't believe it's an issue in > >> >> pseries. > >> > > >> > Again ACCESS_PLATFORM has a pretty open definition. It does actually > >> > say it's all up to the platform. > >> > > >> > Specifically how will VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION be > >> > implemented portably? virtio has no portable way to know > >> > whether DMA API bypasses translation. > >> > >> The fact that VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION is set > >> communicates that knowledge to virtio. There is a shared understanding > >> between the guest and the host about what this flag being set means. > > > > Right but I wonder how are you going to *actually* implement it on Linux? > > Are you adding a new set of DMA APIs that do everything except > > translation? > > Actually it's the opposite. There's nothing to do in the guest besides > setting up SWIOTLB and sharing its buffer with the host. > > Normally on pseries, devices use the dma_iommu_ops defined in > arch/powerpc/kernel/dma-iommu.c. I have a patch which changes the > device's dma_ops to NULL so that the default DMA path will be used: > > https://lore.kernel.org/linuxppc-dev/20190713060023.8479-12-bauerman@linux.ibm.com/ > > Then another patch forces use of SWIOTLB and defines the > set_memory_{encrypted,decrypted} functions so that SWIOTLB can make its > buffer be shared with the host: > > https://lore.kernel.org/linuxppc-dev/20190713060023.8479-13-bauerman@linux.ibm.com/ > > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
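[Editorial note: the negotiation rule Tsirkin proposes in the thread above (the identity-address flag is only valid together with ACCESS_PLATFORM, and a secure host denies driver_ok without ACCESS_PLATFORM) can be sketched as a pure check. VIRTIO_F_ACCESS_PLATFORM is feature bit 33 per the virtio spec; the identity-address bit below is hypothetical, since that flag was never allocated.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_ACCESS_PLATFORM (1ULL << 33)
/* Hypothetical bit for the proposed flag; not part of the spec. */
#define VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS (1ULL << 34)

/*
 * Host-side sketch: accept driver_ok only if the identity-address
 * flag rides on ACCESS_PLATFORM, and (when the host requires it,
 * e.g. for an encrypted guest) ACCESS_PLATFORM was negotiated.
 */
static bool host_accepts_driver_ok(uint64_t features,
                                   bool host_requires_access_platform)
{
    bool ap  = (features & VIRTIO_F_ACCESS_PLATFORM) != 0;
    bool ida = (features & VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS) != 0;

    if (ida && !ap)
        return false; /* identity flag must accompany ACCESS_PLATFORM */
    if (host_requires_access_platform && !ap)
        return false; /* host denies driver_ok without ACCESS_PLATFORM */
    return true;
}
```

Bauermann's objection in the thread is precisely to the first rule: coupling the new flag to ACCESS_PLATFORM would exclude guests (and firmware like SLOF) that predate ACCESS_PLATFORM support.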
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-07-15 14:35 ` Michael S. Tsirkin (?) @ 2019-07-15 20:29 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-07-15 20:29 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson Michael S. Tsirkin <mst@redhat.com> writes: > On Sun, Jul 14, 2019 at 02:51:18AM -0300, Thiago Jung Bauermann wrote: >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > So this is what I would call this option: >> > >> > VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS >> > >> > and the explanation should state that all device >> > addresses are translated by the platform to identical >> > addresses. >> > >> > In fact this option then becomes more, not less restrictive >> > than VIRTIO_F_ACCESS_PLATFORM - it's a promise >> > by guest to only create identity mappings, >> > and only before driver_ok is set. >> > This option then would always be negotiated together with >> > VIRTIO_F_ACCESS_PLATFORM. >> > >> > Host then must verify that >> > 1. full 1:1 mappings are created before driver_ok >> > or can we make sure this happens before features_ok? >> > that would be ideal as we could require that features_ok fails >> > 2. mappings are not modified between driver_ok and reset >> > i guess attempts to change them will fail - >> > possibly by causing a guest crash >> > or some other kind of platform-specific error >> >> I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring >> it to be accompanied by ACCESS_PLATFORM can be a problem. 
One reason is >> SLOF as I mentioned above, another is that we would be requiring all >> guests running on the machine (secure guests or not, since we would use >> the same configuration for all guests) to support it. But >> ACCESS_PLATFORM is relatively recent so it's a bit early for that. For >> instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about >> it and wouldn't be able to use the device. > > OK and your target is to enable use with kernel drivers within > guests, right? Right. > My question is, we are defining a new flag here, I guess old guests > then do not set it. How does it help old guests? Or maybe it's > not designed to ... Indeed. The idea is that QEMU can offer the flag, old guests can reject it (or even new guests can reject it, if they decide not to convert into secure VMs) and the feature negotiation will succeed with the flag unset. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-07-15 20:29 ` Thiago Jung Bauermann (?) (?) @ 2019-07-15 20:36 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-07-15 20:36 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Mon, Jul 15, 2019 at 05:29:06PM -0300, Thiago Jung Bauermann wrote: > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Sun, Jul 14, 2019 at 02:51:18AM -0300, Thiago Jung Bauermann wrote: > >> > >> > >> Michael S. Tsirkin <mst@redhat.com> writes: > >> > >> > So this is what I would call this option: > >> > > >> > VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS > >> > > >> > and the explanation should state that all device > >> > addresses are translated by the platform to identical > >> > addresses. > >> > > >> > In fact this option then becomes more, not less restrictive > >> > than VIRTIO_F_ACCESS_PLATFORM - it's a promise > >> > by guest to only create identity mappings, > >> > and only before driver_ok is set. > >> > This option then would always be negotiated together with > >> > VIRTIO_F_ACCESS_PLATFORM. > >> > > >> > Host then must verify that > >> > 1. full 1:1 mappings are created before driver_ok > >> > or can we make sure this happens before features_ok? > >> > that would be ideal as we could require that features_ok fails > >> > 2. mappings are not modified between driver_ok and reset > >> > i guess attempts to change them will fail - > >> > possibly by causing a guest crash > >> > or some other kind of platform-specific error > >> > >> I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring > >> it to be accompanied by ACCESS_PLATFORM can be a problem. 
One reason is > >> SLOF as I mentioned above, another is that we would be requiring all > >> guests running on the machine (secure guests or not, since we would use > >> the same configuration for all guests) to support it. But > >> ACCESS_PLATFORM is relatively recent so it's a bit early for that. For > >> instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about > >> it and wouldn't be able to use the device. > > > > OK and your target is to enable use with kernel drivers within > > guests, right? > > Right. > > > My question is, we are defining a new flag here, I guess old guests > > then do not set it. How does it help old guests? Or maybe it's > > not designed to ... > > Indeed. The idea is that QEMU can offer the flag, old guests can reject > it (or even new guests can reject it, if they decide not to convert into > secure VMs) and the feature negotiation will succeed with the flag > unset. OK. And then what does QEMU do? Assume guest is not encrypted I guess? > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-07-15 20:36 ` Michael S. Tsirkin (?) (?) @ 2019-07-15 22:03 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-07-15 22:03 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson Michael S. Tsirkin <mst@redhat.com> writes: > On Mon, Jul 15, 2019 at 05:29:06PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Sun, Jul 14, 2019 at 02:51:18AM -0300, Thiago Jung Bauermann wrote: >> >> >> >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> >> >> > So this is what I would call this option: >> >> > >> >> > VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS >> >> > >> >> > and the explanation should state that all device >> >> > addresses are translated by the platform to identical >> >> > addresses. >> >> > >> >> > In fact this option then becomes more, not less restrictive >> >> > than VIRTIO_F_ACCESS_PLATFORM - it's a promise >> >> > by guest to only create identity mappings, >> >> > and only before driver_ok is set. >> >> > This option then would always be negotiated together with >> >> > VIRTIO_F_ACCESS_PLATFORM. >> >> > >> >> > Host then must verify that >> >> > 1. full 1:1 mappings are created before driver_ok >> >> > or can we make sure this happens before features_ok? >> >> > that would be ideal as we could require that features_ok fails >> >> > 2. 
mappings are not modified between driver_ok and reset >> >> > i guess attempts to change them will fail - >> >> > possibly by causing a guest crash >> >> > or some other kind of platform-specific error >> >> >> >> I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring >> >> it to be accompanied by ACCESS_PLATFORM can be a problem. One reason is >> >> SLOF as I mentioned above, another is that we would be requiring all >> >> guests running on the machine (secure guests or not, since we would use >> >> the same configuration for all guests) to support it. But >> >> ACCESS_PLATFORM is relatively recent so it's a bit early for that. For >> >> instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about >> >> it and wouldn't be able to use the device. >> > >> > OK and your target is to enable use with kernel drivers within >> > guests, right? >> >> Right. >> >> > My question is, we are defining a new flag here, I guess old guests >> > then do not set it. How does it help old guests? Or maybe it's >> > not designed to ... >> >> Indeed. The idea is that QEMU can offer the flag, old guests can reject >> it (or even new guests can reject it, if they decide not to convert into >> secure VMs) and the feature negotiation will succeed with the flag >> unset. > > OK. And then what does QEMU do? Assume guest is not encrypted I guess? There's nothing different that QEMU needs to do, with or without the flag. From the perspective of the host, a secure guest and a regular guest work the same way with respect to virtio. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-07-15 22:03 ` Thiago Jung Bauermann ` (2 preceding siblings ...) (?) @ 2019-07-15 22:16 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-07-15 22:16 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Mon, Jul 15, 2019 at 07:03:03PM -0300, Thiago Jung Bauermann wrote: > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Mon, Jul 15, 2019 at 05:29:06PM -0300, Thiago Jung Bauermann wrote: > >> > >> Michael S. Tsirkin <mst@redhat.com> writes: > >> > >> > On Sun, Jul 14, 2019 at 02:51:18AM -0300, Thiago Jung Bauermann wrote: > >> >> > >> >> > >> >> Michael S. Tsirkin <mst@redhat.com> writes: > >> >> > >> >> > So this is what I would call this option: > >> >> > > >> >> > VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS > >> >> > > >> >> > and the explanation should state that all device > >> >> > addresses are translated by the platform to identical > >> >> > addresses. > >> >> > > >> >> > In fact this option then becomes more, not less restrictive > >> >> > than VIRTIO_F_ACCESS_PLATFORM - it's a promise > >> >> > by guest to only create identity mappings, > >> >> > and only before driver_ok is set. > >> >> > This option then would always be negotiated together with > >> >> > VIRTIO_F_ACCESS_PLATFORM. > >> >> > > >> >> > Host then must verify that > >> >> > 1. full 1:1 mappings are created before driver_ok > >> >> > or can we make sure this happens before features_ok? > >> >> > that would be ideal as we could require that features_ok fails > >> >> > 2. 
mappings are not modified between driver_ok and reset > >> >> > i guess attempts to change them will fail - > >> >> > possibly by causing a guest crash > >> >> > or some other kind of platform-specific error > >> >> > >> >> I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring > >> >> it to be accompanied by ACCESS_PLATFORM can be a problem. One reason is > >> >> SLOF as I mentioned above, another is that we would be requiring all > >> >> guests running on the machine (secure guests or not, since we would use > >> >> the same configuration for all guests) to support it. But > >> >> ACCESS_PLATFORM is relatively recent so it's a bit early for that. For > >> >> instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about > >> >> it and wouldn't be able to use the device. > >> > > >> > OK and your target is to enable use with kernel drivers within > >> > guests, right? > >> > >> Right. > >> > >> > My question is, we are defining a new flag here, I guess old guests > >> > then do not set it. How does it help old guests? Or maybe it's > >> > not designed to ... > >> > >> Indeed. The idea is that QEMU can offer the flag, old guests can reject > >> it (or even new guests can reject it, if they decide not to convert into > >> secure VMs) and the feature negotiation will succeed with the flag > >> unset. > > > > OK. And then what does QEMU do? Assume guest is not encrypted I guess? > > There's nothing different that QEMU needs to do, with or without the > flag. the perspective of the host, a secure guest and a regular guest > work the same way with respect to virtio. OK. So now let's get back to implementation. What will Linux guest driver do? It can't activate DMA API blindly since that will assume translation also works, right? Or do we somehow limit it to just a specific platform? > -- > Thiago Jung Bauermann > IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-07-15 22:16 ` Michael S. Tsirkin (?) @ 2019-07-15 23:05 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-07-15 23:05 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtualization, linuxppc-dev, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker, Michael Roth, Mike Anderson Michael S. Tsirkin <mst@redhat.com> writes: > On Mon, Jul 15, 2019 at 07:03:03PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Mon, Jul 15, 2019 at 05:29:06PM -0300, Thiago Jung Bauermann wrote: >> >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> >> >> > On Sun, Jul 14, 2019 at 02:51:18AM -0300, Thiago Jung Bauermann wrote: >> >> >> >> >> >> >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> >> >> >> >> > So this is what I would call this option: >> >> >> > >> >> >> > VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS >> >> >> > >> >> >> > and the explanation should state that all device >> >> >> > addresses are translated by the platform to identical >> >> >> > addresses. >> >> >> > >> >> >> > In fact this option then becomes more, not less restrictive >> >> >> > than VIRTIO_F_ACCESS_PLATFORM - it's a promise >> >> >> > by guest to only create identity mappings, >> >> >> > and only before driver_ok is set. >> >> >> > This option then would always be negotiated together with >> >> >> > VIRTIO_F_ACCESS_PLATFORM. >> >> >> > >> >> >> > Host then must verify that >> >> >> > 1. full 1:1 mappings are created before driver_ok >> >> >> > or can we make sure this happens before features_ok? >> >> >> > that would be ideal as we could require that features_ok fails >> >> >> > 2. 
mappings are not modified between driver_ok and reset >> >> >> > i guess attempts to change them will fail - >> >> >> > possibly by causing a guest crash >> >> >> > or some other kind of platform-specific error >> >> >> >> >> >> I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring >> >> >> it to be accompanied by ACCESS_PLATFORM can be a problem. One reason is >> >> >> SLOF as I mentioned above, another is that we would be requiring all >> >> >> guests running on the machine (secure guests or not, since we would use >> >> >> the same configuration for all guests) to support it. But >> >> >> ACCESS_PLATFORM is relatively recent so it's a bit early for that. For >> >> >> instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about >> >> >> it and wouldn't be able to use the device. >> >> > >> >> > OK and your target is to enable use with kernel drivers within >> >> > guests, right? >> >> >> >> Right. >> >> >> >> > My question is, we are defining a new flag here, I guess old guests >> >> > then do not set it. How does it help old guests? Or maybe it's >> >> > not designed to ... >> >> >> >> Indeed. The idea is that QEMU can offer the flag, old guests can reject >> >> it (or even new guests can reject it, if they decide not to convert into >> >> secure VMs) and the feature negotiation will succeed with the flag >> >> unset. >> > >> > OK. And then what does QEMU do? Assume guest is not encrypted I guess? >> >> There's nothing different that QEMU needs to do, with or without the >> flag. the perspective of the host, a secure guest and a regular guest >> work the same way with respect to virtio. > > OK. So now let's get back to implementation. What will > Linux guest driver do? It can't activate DMA API blindly since that > will assume translation also works, right? It can on pseries, because we always have a 1:1 window mapping the whole guest memory. > Or do we somehow limit it to just a specific platform? 
Yes, we want to accept the new flag only on secure pseries guests. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-07-15 23:05 ` Thiago Jung Bauermann 0 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-07-15 23:05 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Michael Roth, Jean-Philippe Brucker, Benjamin Herrenschmidt, Jason Wang, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Michael S. Tsirkin <mst@redhat.com> writes: > On Mon, Jul 15, 2019 at 07:03:03PM -0300, Thiago Jung Bauermann wrote: >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Mon, Jul 15, 2019 at 05:29:06PM -0300, Thiago Jung Bauermann wrote: >> >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> >> >> > On Sun, Jul 14, 2019 at 02:51:18AM -0300, Thiago Jung Bauermann wrote: >> >> >> >> >> >> >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> >> >> >> >> > So this is what I would call this option: >> >> >> > >> >> >> > VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS >> >> >> > >> >> >> > and the explanation should state that all device >> >> >> > addresses are translated by the platform to identical >> >> >> > addresses. >> >> >> > >> >> >> > In fact this option then becomes more, not less restrictive >> >> >> > than VIRTIO_F_ACCESS_PLATFORM - it's a promise >> >> >> > by guest to only create identity mappings, >> >> >> > and only before driver_ok is set. >> >> >> > This option then would always be negotiated together with >> >> >> > VIRTIO_F_ACCESS_PLATFORM. >> >> >> > >> >> >> > Host then must verify that >> >> >> > 1. full 1:1 mappings are created before driver_ok >> >> >> > or can we make sure this happens before features_ok? >> >> >> > that would be ideal as we could require that features_ok fails >> >> >> > 2. 
mappings are not modified between driver_ok and reset >> >> >> > i guess attempts to change them will fail - >> >> >> > possibly by causing a guest crash >> >> >> > or some other kind of platform-specific error >> >> >> >> >> >> I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring >> >> >> it to be accompanied by ACCESS_PLATFORM can be a problem. One reason is >> >> >> SLOF as I mentioned above, another is that we would be requiring all >> >> >> guests running on the machine (secure guests or not, since we would use >> >> >> the same configuration for all guests) to support it. But >> >> >> ACCESS_PLATFORM is relatively recent so it's a bit early for that. For >> >> >> instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about >> >> >> it and wouldn't be able to use the device. >> >> > >> >> > OK and your target is to enable use with kernel drivers within >> >> > guests, right? >> >> >> >> Right. >> >> >> >> > My question is, we are defining a new flag here, I guess old guests >> >> > then do not set it. How does it help old guests? Or maybe it's >> >> > not designed to ... >> >> >> >> Indeed. The idea is that QEMU can offer the flag, old guests can reject >> >> it (or even new guests can reject it, if they decide not to convert into >> >> secure VMs) and the feature negotiation will succeed with the flag >> >> unset. >> > >> > OK. And then what does QEMU do? Assume guest is not encrypted I guess? >> >> There's nothing different that QEMU needs to do, with or without the >> flag. the perspective of the host, a secure guest and a regular guest >> work the same way with respect to virtio. > > OK. So now let's get back to implementation. What will > Linux guest driver do? It can't activate DMA API blindly since that > will assume translation also works, right? It can on pseries, because we always have a 1:1 window mapping the whole guest memory. > Or do we somehow limit it to just a specific platform? 
Yes, we want to accept the new flag only on secure pseries guests. -- Thiago Jung Bauermann IBM Linux Technology Center _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 198+ messages in thread
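[Editor's note: the decision the thread converges on here — use the DMA API on Xen guests, and on secure pseries guests where the 1:1 window makes translation a no-op — can be sketched in miniature. This is an illustrative model only: the boolean parameters stand in for xen_domain() and a hypothetical platform secure-guest check, and nothing below is actual kernel code.]

```c
#include <stdbool.h>

/* Illustrative sketch of the logic discussed above. In a real driver the
 * inputs would come from xen_domain() and a platform-provided secure-guest
 * check; here they are plain parameters so the shape is visible. */
static bool vring_use_dma_api(bool xen_guest, bool secure_guest)
{
    /* Xen grant mappings require the DMA API regardless of feature bits. */
    if (xen_guest)
        return true;

    /* A secure guest's memory is inaccessible to the host, so buffers must
     * be bounced through swiotlb via the DMA API. On pseries the 1:1 IOMMU
     * window guarantees the extra "translation" is an identity map. */
    if (secure_guest)
        return true;

    return false;
}
```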
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-07-15 22:03 ` Thiago Jung Bauermann ` (4 preceding siblings ...) (?) @ 2019-07-15 23:24 ` Benjamin Herrenschmidt -1 siblings, 0 replies; 198+ messages in thread From: Benjamin Herrenschmidt @ 2019-07-15 23:24 UTC (permalink / raw) To: Thiago Jung Bauermann, Michael S. Tsirkin Cc: Mike Anderson, Jean-Philippe Brucker, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Mon, 2019-07-15 at 19:03 -0300, Thiago Jung Bauermann wrote: > > > Indeed. The idea is that QEMU can offer the flag, old guests can > > > reject > > > it (or even new guests can reject it, if they decide not to > > > convert into > > > secure VMs) and the feature negotiation will succeed with the > > > flag > > > unset. > > > > OK. And then what does QEMU do? Assume guest is not encrypted I > > guess? > > There's nothing different that QEMU needs to do, with or without the > flag. the perspective of the host, a secure guest and a regular guest > work the same way with respect to virtio. This is *precisely* why I was against adding a flag and touching the protocol negotiation with qemu in the first place, back when I cared about that stuff... Guys, this has gone in circles over and over again. This has nothing to do with qemu. Qemu doesn't need to know about this. It's entirely guest local. This is why the one-liner in virtio was a far better and simpler solution. This is something the guest does to itself (with the participation of an ultravisor but that's not something qemu cares about at this stage, at least not as far as virtio is concerned). Basically, the guest "hides" its memory from the host using a HW secure memory facility. As a result, it needs to ensure that all of its DMA pages are bounced through insecure pages that aren't hidden. That's it, it's all guest side. Qemu shouldn't have to care about it at all. Cheers, Ben. 
^ permalink raw reply [flat|nested] 198+ messages in thread
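[Editor's note: the bounce-buffering arrangement Ben describes — private pages are never handed to the device; payloads are staged through shared "insecure" pages — is essentially what swiotlb does. A toy model follows; none of these names are kernel API, and the fixed-size single bounce page is an invented simplification.]

```c
#include <string.h>

enum { BOUNCE_SIZE = 256 };

/* Shared, host-visible staging page -- the "insecure" memory in Ben's
 * description. Everything else in the guest is assumed hidden from the
 * host by the secure memory facility. */
static char bounce_page[BOUNCE_SIZE];

/* Before a device-bound transfer, the guest copies the payload out of its
 * hidden memory into the shared page and hands the device the bounce
 * address. This is the swiotlb idea in miniature. */
static char *map_for_device(const char *hidden_src, size_t len)
{
    if (len > BOUNCE_SIZE)
        len = BOUNCE_SIZE;  /* a real implementation would split the transfer */
    memcpy(bounce_page, hidden_src, len);
    return bounce_page;
}
```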
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-07-15 14:35 ` Michael S. Tsirkin ` (2 preceding siblings ...) (?) @ 2019-07-15 20:29 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-07-15 20:29 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Michael S. Tsirkin <mst@redhat.com> writes: > On Sun, Jul 14, 2019 at 02:51:18AM -0300, Thiago Jung Bauermann wrote: >> >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > So this is what I would call this option: >> > >> > VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS >> > >> > and the explanation should state that all device >> > addresses are translated by the platform to identical >> > addresses. >> > >> > In fact this option then becomes more, not less restrictive >> > than VIRTIO_F_ACCESS_PLATFORM - it's a promise >> > by guest to only create identity mappings, >> > and only before driver_ok is set. >> > This option then would always be negotiated together with >> > VIRTIO_F_ACCESS_PLATFORM. >> > >> > Host then must verify that >> > 1. full 1:1 mappings are created before driver_ok >> > or can we make sure this happens before features_ok? >> > that would be ideal as we could require that features_ok fails >> > 2. mappings are not modified between driver_ok and reset >> > i guess attempts to change them will fail - >> > possibly by causing a guest crash >> > or some other kind of platform-specific error >> >> I think VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS is good, but requiring >> it to be accompanied by ACCESS_PLATFORM can be a problem. 
One reason is >> SLOF as I mentioned above, another is that we would be requiring all >> guests running on the machine (secure guests or not, since we would use >> the same configuration for all guests) to support it. But >> ACCESS_PLATFORM is relatively recent so it's a bit early for that. For >> instance, Ubuntu 16.04 LTS (which is still supported) doesn't know about >> it and wouldn't be able to use the device. > > OK and your target is to enable use with kernel drivers within > guests, right? Right. > My question is, we are defining a new flag here, I guess old guests > then do not set it. How does it help old guests? Or maybe it's > not designed to ... Indeed. The idea is that QEMU can offer the flag, old guests can reject it (or even new guests can reject it, if they decide not to convert into secure VMs) and the feature negotiation will succeed with the flag unset. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
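[Editor's note: the negotiation outcome described here — QEMU offers the bit, an old guest simply never acknowledges it, and the handshake still succeeds with the flag unset — follows from virtio feature negotiation being an intersection of offered and acknowledged bits. A minimal sketch; the bit number is invented, since the proposed flag has no value assigned by the virtio specification.]

```c
#include <stdint.h>

/* Invented bit position for the proposed flag, for illustration only. */
#define F_NEW_FLAG (1ULL << 34)

/* Virtio feature negotiation in miniature: the driver may only acknowledge
 * bits the device offered, so the agreed feature set is the intersection.
 * An old guest that knows nothing about F_NEW_FLAG never sets it, and
 * negotiation still succeeds -- just with the flag unset. */
static uint64_t negotiate(uint64_t device_offered, uint64_t driver_acked)
{
    return device_offered & driver_acked;
}
```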
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-07-14 5:51 ` Thiago Jung Bauermann @ 2019-07-18 3:39 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-07-18 3:39 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Michael Roth, Jean-Philippe Brucker, Jason Wang, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, iommu, Christoph Hellwig, David Gibson Hello, Just going back to this question which I wasn't able to answer. Thiago Jung Bauermann <bauerman@linux.ibm.com> writes: > Michael S. Tsirkin <mst@redhat.com> writes: > >> So far so good, but now a question: >> >> how are we handling guest address width limitations? >> Is VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS subject to >> guest address width limitations? >> I am guessing we can make them so ... >> This needs to be documented. > > I'm not sure. I will get back to you on this. We don't have address width limitations between host and guest. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-05-20 13:16 ` Michael S. Tsirkin ` (2 preceding siblings ...) (?) @ 2019-06-04 1:13 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-06-04 1:13 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Michael S. Tsirkin <mst@redhat.com> writes: > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: >> I rephrased it in terms of address translation. What do you think of >> this version? The flag name is slightly different too: >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, >> with the exception that address translation is guaranteed to be >> unnecessary when accessing memory addresses supplied to the device >> by the driver. Which is to say, the device will always use physical >> addresses matching addresses used by the driver (typically meaning >> physical addresses used by the CPU) and not translated further. This >> flag should be set by the guest if offered, but to allow for >> backward-compatibility device implementations allow for it to be >> left unset by the guest. It is an error to set both this flag and >> VIRTIO_F_ACCESS_PLATFORM. > > > OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unpriveledged > drivers. This is why devices fail when it's not negotiated. Just to clarify, what do you mean by unprivileged drivers? Is it drivers implemented in guest userspace such as with VFIO? Or unprivileged in some other sense such as needing to use bounce buffers for some reason? > This confuses me. > If driver is unpriveledged then what happens with this flag? > It can supply any address it wants. Will that corrupt kernel > memory? 
Not needing address translation doesn't necessarily mean that there's no IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM but there's always an IOMMU present. And we also support VFIO drivers. The VFIO API for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls to program the IOMMU. For our use case, we don't need address translation because we set up an identity mapping in the IOMMU so that the device can use guest physical addresses. If the guest kernel is concerned that an unprivileged driver could jeopardize its integrity, it should not negotiate this feature flag. Perhaps there should be a note about this in the flag definition? This concern is platform-dependent though. I don't believe it's an issue in pseries. -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
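[Editor's note: the pseries arrangement described above — an IOMMU is always present, but its window is programmed as an identity map over guest memory — can be modeled in a few lines. The window bounds and return convention below are invented for illustration; a real pseries window would cover all guest RAM.]

```c
#include <stdint.h>

#define IOMMU_MISS ((uint64_t)-1)

/* Toy model of an identity-mapped IOMMU window: "translation" returns the
 * input address unchanged whenever it falls inside the window, so the
 * device address equals the guest physical address. */
static uint64_t identity_translate(uint64_t win_start, uint64_t win_size,
                                   uint64_t iova)
{
    if (iova < win_start || iova - win_start >= win_size)
        return IOMMU_MISS;  /* address outside the programmed window */
    return iova;            /* identity: no translation actually occurs */
}
```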
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-02-04 20:23 ` Michael S. Tsirkin (?) (?) @ 2019-03-20 16:13 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-03-20 16:13 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Mike Anderson, Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Hello Michael, Sorry for the delay in responding. We had some internal discussions on this. Michael S. Tsirkin <mst@redhat.com> writes: > On Mon, Feb 04, 2019 at 04:14:20PM -0200, Thiago Jung Bauermann wrote: >> >> Hello Michael, >> >> Michael S. Tsirkin <mst@redhat.com> writes: >> >> > On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote: >> So while ACCESS_PLATFORM solves our problems for secure guests, we can't >> turn it on by default because we can't affect legacy systems. Doing so >> would penalize existing systems that can access all memory. They would >> all have to unnecessarily go through address translations, and take a >> performance hit. > > So as step one, you just give hypervisor admin an option to run legacy > systems faster by blocking secure mode. I don't see why that is > so terrible. There are a few reasons why: 1. It's bad user experience to require people to fiddle with knobs for obscure reasons if it's possible to design things such that they Just Work. 2. "User" in this case can be a human directly calling QEMU, but could also be libvirt or one of its users, or some other framework. This means having to adjust and/or educate an open-ended number of people and software. It's best avoided if possible. 3. The hypervisor admin and the admin of the guest system don't necessarily belong to the same organization (e.g., cloud provider and cloud customer), so there may be some friction when they need to coordinate to get this right. 4. 
A feature of our design is that the guest may or may not decide to "go secure" at boot time, so it's best not to depend on flags that may or may not have been set at the time QEMU was started. >> The semantics of ACCESS_PLATFORM assume that the hypervisor/QEMU knows >> in advance - right when the VM is instantiated - that it will not have >> access to all guest memory. > > Not quite. It just means that hypervisor can live with not having > access to all memory. If platform wants to give it access > to all memory that is quite all right. Except that on powerpc it also means "there's an IOMMU present" and there's no way to say "bypass IOMMU translation". :-/ >> Another way of looking at this issue which also explains our reluctance >> is that the only difference between a secure guest and a regular guest >> (at least regarding virtio) is that the former uses swiotlb while the >> latter doens't. > > But swiotlb is just one implementation. It's a guest internal thing. The > issue is that memory isn't host accessible. From what I understand of the ACCESS_PLATFORM definition, the host will only ever try to access memory addresses that are supplied to it by the guest, so all of the secure guest memory that the host cares about is accessible: If this feature bit is set to 0, then the device has same access to memory addresses supplied to it as the driver has. In particular, the device will always use physical addresses matching addresses used by the driver (typically meaning physical addresses used by the CPU) and not translated further, and can access any address supplied to it by the driver. When clear, this overrides any platform-specific description of whether device access is limited or translated in any way, e.g. whether an IOMMU may be present. All of the above is true for POWER guests, whether they are secure guests or not. Or are you saying that a virtio device may want to access memory addresses that weren't supplied to it by the driver? 
>> And from the device's point of view they're >> indistinguishable. It can't tell one guest that is using swiotlb from >> one that isn't. And that implies that secure guest vs regular guest >> isn't a virtio interface issue, it's "guest internal affairs". So >> there's no reason to reflect that in the feature flags. > > So don't. The way not to reflect that in the feature flags is > to set ACCESS_PLATFORM. Then you say *I don't care let platform device*. > > > Without ACCESS_PLATFORM > virtio has a very specific opinion about the security of the > device, and that opinion is that device is part of the guest > supervisor security domain. Sorry for being a bit dense, but not sure what "the device is part of the guest supervisor security domain" means. In powerpc-speak, "supervisor" is the operating system so perhaps that explains my confusion. Are you saying that without ACCESS_PLATFORM, the guest considers the host to be part of the guest operating system's security domain? If so, does that have any other implication besides "the host can access any address supplied to it by the driver"? If that is the case, perhaps the definition of ACCESS_PLATFORM needs to be amended to include that information because it's not part of the current definition. >> That said, we still would like to arrive at a proper design for this >> rather than add yet another hack if we can avoid it. So here's another >> proposal: considering that the dma-direct code (in kernel/dma/direct.c) >> automatically uses swiotlb when necessary (thanks to Christoph's recent >> DMA work), would it be ok to replace virtio's own direct-memory code >> that is used in the !ACCESS_PLATFORM case with the dma-direct code? That >> way we'll get swiotlb even with !ACCESS_PLATFORM, and virtio will get a >> code cleanup (replace open-coded stuff with calls to existing >> infrastructure). > > Let's say I have some doubts that there's an API that > matches what virtio with its bag of legacy compatibility exactly. Ok. 
>> > But the name "sev_active" makes me scared because at least AMD guys who >> > were doing the sensible thing and setting ACCESS_PLATFORM >> >> My understanding is, AMD guest-platform knows in advance that their >> guest will run in secure mode and hence sets the flag at the time of VM >> instantiation. Unfortunately we dont have that luxury on our platforms. > > Well you do have that luxury. It looks like that there are existing > guests that already acknowledge ACCESS_PLATFORM and you are not happy > with how that path is slow. So you are trying to optimize for > them by clearing ACCESS_PLATFORM and then you have lost ability > to invoke DMA API. > > For example if there was another flag just like ACCESS_PLATFORM > just not yet used by anyone, you would be all fine using that right? Yes, a new flag sounds like a great idea. What about the definition below? VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the exception that the IOMMU is explicitly defined to be off or bypassed when accessing memory addresses supplied to the device by the driver. This flag should be set by the guest if offered, but to allow for backward-compatibility device implementations allow for it to be left unset by the guest. It is an error to set both this flag and VIRTIO_F_ACCESS_PLATFORM. > Is there any justification to doing that beyond someone putting > out slow code in the past? The definition of the ACCESS_PLATFORM flag is generic and captures the notion of memory access restrictions for the device. Unfortunately, on powerpc pSeries guests it also implies that the IOMMU is turned on even though pSeries guests have never used IOMMU for virtio devices. 
Combined with the lack of a way to turn off or bypass the IOMMU for
virtio devices, this means that existing guests in the field would be
compelled to use the IOMMU even though that was never the case before,
with no mechanism to turn it off.

Therefore, we need a new flag to signal the memory access restriction
present in secure guests which doesn't also imply turning on the IOMMU.

--
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-02-04 18:14 ` Thiago Jung Bauermann (?) (?) @ 2019-02-04 20:23 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-02-04 20:23 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Mon, Feb 04, 2019 at 04:14:20PM -0200, Thiago Jung Bauermann wrote: > > Hello Michael, > > Michael S. Tsirkin <mst@redhat.com> writes: > > > On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote: > >> > >> Fixing address of powerpc mailing list. > >> > >> Thiago Jung Bauermann <bauerman@linux.ibm.com> writes: > >> > >> > Hello, > >> > > >> > With Christoph's rework of the DMA API that recently landed, the patch > >> > below is the only change needed in virtio to make it work in a POWER > >> > secure guest under the ultravisor. > >> > > >> > The other change we need (making sure the device's dma_map_ops is NULL > >> > so that the dma-direct/swiotlb code is used) can be made in > >> > powerpc-specific code. > >> > > >> > Of course, I also have patches (soon to be posted as RFC) which hook up > >> > <linux/mem_encrypt.h> to the powerpc secure guest support code. > >> > > >> > What do you think? > >> > > >> > From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 > >> > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> > >> > Date: Thu, 24 Jan 2019 22:08:02 -0200 > >> > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted > >> > > >> > The host can't access the guest memory when it's encrypted, so using > >> > regular memory pages for the ring isn't an option. Go through the DMA API. 
> >> > > >> > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > > > Well I think this will come back to bite us (witness xen which is now > > reworking precisely this path - but at least they aren't to blame, xen > > came before ACCESS_PLATFORM). > > > > I also still think the right thing would have been to set > > ACCESS_PLATFORM for all systems where device can't access all memory. > > I understand. The problem with that approach for us is that because we > don't know which guests will become secure guests and which will remain > regular guests, QEMU would need to offer ACCESS_PLATFORM to all guests. > > And the problem with that is that for QEMU on POWER, having > ACCESS_PLATFORM turned off means that it can bypass the IOMMU for the > device (which makes sense considering that the name of the flag was > IOMMU_PLATFORM). And we need that for regular guests to avoid > performance degradation. You don't really, ACCESS_PLATFORM means just that, platform decides. > So while ACCESS_PLATFORM solves our problems for secure guests, we can't > turn it on by default because we can't affect legacy systems. Doing so > would penalize existing systems that can access all memory. They would > all have to unnecessarily go through address translations, and take a > performance hit. So as step one, you just give hypervisor admin an option to run legacy systems faster by blocking secure mode. I don't see why that is so terrible. But as step two, assuming you use above step one to make legacy guests go fast - maybe there is a point in detecting such a hypervisor and doing something smarter with it. By all means let's have a discussion around this but that is no longer "to make it work" as the commit log says it's more a performance optimization. > The semantics of ACCESS_PLATFORM assume that the hypervisor/QEMU knows > in advance - right when the VM is instantiated - that it will not have > access to all guest memory. Not quite. 
It just means that hypervisor can live with not having access to all memory. If platform wants to give it access to all memory that is quite all right. > Unfortunately that assumption is subtly > broken on our secure-platform. The hypervisor/QEMU realizes that the > platform is going secure only *after the VM is instantiated*. It's the > kernel running in the VM that determines that it wants to switch the > platform to secure-mode. ACCESS_PLATFORM is there so guests can detect legacy hypervisors which always assumed it's another CPU. > Another way of looking at this issue which also explains our reluctance > is that the only difference between a secure guest and a regular guest > (at least regarding virtio) is that the former uses swiotlb while the > latter doens't. But swiotlb is just one implementation. It's a guest internal thing. The issue is that memory isn't host accessible. Yes linux does not use that info too much right now but it already begins to seep out of the abstraction. For example as you are doing data copies you should maybe calculate the packet checksum just as well. Not something DMA API will let you know right now, but that's because any bounce buffer users so far weren't terribly fast anyway - it was all for 16 bit hardware and such. > And from the device's point of view they're > indistinguishable. It can't tell one guest that is using swiotlb from > one that isn't. And that implies that secure guest vs regular guest > isn't a virtio interface issue, it's "guest internal affairs". So > there's no reason to reflect that in the feature flags. So don't. The way not to reflect that in the feature flags is to set ACCESS_PLATFORM. Then you say *I don't care let platform device*. Without ACCESS_PLATFORM virtio has a very specific opinion about the security of the device, and that opinion is that device is part of the guest supervisor security domain. 
> That said, we still would like to arrive at a proper design for this
> rather than add yet another hack if we can avoid it. So here's another
> proposal: considering that the dma-direct code (in kernel/dma/direct.c)
> automatically uses swiotlb when necessary (thanks to Christoph's recent
> DMA work), would it be ok to replace virtio's own direct-memory code
> that is used in the !ACCESS_PLATFORM case with the dma-direct code? That
> way we'll get swiotlb even with !ACCESS_PLATFORM, and virtio will get a
> code cleanup (replace open-coded stuff with calls to existing
> infrastructure).

Let's say I have some doubts that there's an API that matches exactly
what virtio, with its bag of legacy compatibility, needs.

But taking a step back, you seem to keep looking at it at the code
level. And I think that's not necessarily right. If ACCESS_PLATFORM
isn't what you are looking for then maybe you need another feature bit.
But you/we need to figure out what it means first.

> > But I also think I don't have the energy to argue about power secure
> > guest anymore. So be it for power secure guest since the involved
> > engineers disagree with me. Hey I've been wrong in the past ;).
>
> Yeah, it's been a difficult discussion. Thanks for still engaging!
>
> I honestly thought that this patch was a good solution (if the guest has
> encrypted memory it means that the DMA API needs to be used), but I can
> see where you are coming from. As I said, we'd like to arrive at a good
> solution if possible.
>
> > But the name "sev_active" makes me scared because at least AMD guys who
> > were doing the sensible thing and setting ACCESS_PLATFORM
>
> My understanding is, AMD guest-platform knows in advance that their
> guest will run in secure mode and hence sets the flag at the time of VM
> instantiation. Unfortunately we dont have that luxury on our platforms.

Well you do have that luxury.
It looks like there are existing guests that already acknowledge
ACCESS_PLATFORM and you are not happy with how that path is slow. So you
are trying to optimize for them by clearing ACCESS_PLATFORM, and then
you have lost the ability to invoke the DMA API.

For example if there was another flag just like ACCESS_PLATFORM
just not yet used by anyone, you would be all fine using that right?

Is there any justification for doing that beyond someone putting
out slow code in the past?

> > (unless I'm
> > wrong? I reemember distinctly that's so) will likely be affected too.
> > We don't want that.
> >
> > So let's find a way to make sure it's just power secure guest for now
> > pls.
>
> Yes, my understanding is that they turn ACCESS_PLATFORM on. And because
> of that, IIUC this patch wouldn't affect them because in their platform
> vring_use_dma_api() returns true earlier in the
> "if !virtio_has_iommu_quirk(vdev)" condition.

Let's just say I don't think we should assume how the specific
hypervisor behaves. It seems to follow the spec and so should Linux.

> > I also think we should add a dma_api near features under virtio_device
> > such that these hacks can move off data path.
>
> Sorry, I don't understand this.

I mean we can set a flag within struct virtio_device instead of poking
at features, checking xen, etc. etc.

> > By the way could you please respond about virtio-iommu and
> > why there's no support for ACCESS_PLATFORM on power?
>
> There is support for ACCESS_PLATFORM on POWER. We don't enable it
> because it causes a performance hit.

For legacy guests.

> I have Cc'd you on these discussions.
>
> I'm having a look at the spec and the patches, but to be honest I'm not
> the best powerpc guy for this. I'll see if I can get others to have a
> look.
>
> > Thanks!
>
> Thanks as well!
>
> --
> Thiago Jung Bauermann
> IBM Linux Technology Center

^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-01-29 17:42 ` Thiago Jung Bauermann (?) (?) @ 2019-01-29 19:02 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-01-29 19:02 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: Jean-Philippe Brucker, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson On Tue, Jan 29, 2019 at 03:42:44PM -0200, Thiago Jung Bauermann wrote: > > Fixing address of powerpc mailing list. > > Thiago Jung Bauermann <bauerman@linux.ibm.com> writes: > > > Hello, > > > > With Christoph's rework of the DMA API that recently landed, the patch > > below is the only change needed in virtio to make it work in a POWER > > secure guest under the ultravisor. > > > > The other change we need (making sure the device's dma_map_ops is NULL > > so that the dma-direct/swiotlb code is used) can be made in > > powerpc-specific code. > > > > Of course, I also have patches (soon to be posted as RFC) which hook up > > <linux/mem_encrypt.h> to the powerpc secure guest support code. > > > > What do you think? > > > > From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 > > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > Date: Thu, 24 Jan 2019 22:08:02 -0200 > > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted > > > > The host can't access the guest memory when it's encrypted, so using > > regular memory pages for the ring isn't an option. Go through the DMA API. > > > > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Well I think this will come back to bite us (witness xen which is now reworking precisely this path - but at least they aren't to blame, xen came before ACCESS_PLATFORM). I also still think the right thing would have been to set ACCESS_PLATFORM for all systems where device can't access all memory. 
But I also think I don't have the energy to argue about power secure
guest anymore. So be it for power secure guest since the involved
engineers disagree with me. Hey I've been wrong in the past ;).

But the name "sev_active" makes me scared because at least AMD guys who
were doing the sensible thing and setting ACCESS_PLATFORM (unless I'm
wrong? I remember distinctly that's so) will likely be affected too.
We don't want that.

So let's find a way to make sure it's just power secure guest for now
pls.

I also think we should add a dma_api near features under virtio_device
such that these hacks can move off the data path.

By the way could you please respond about virtio-iommu and
why there's no support for ACCESS_PLATFORM on power?

I have Cc'd you on these discussions.

Thanks!

> > ---
> > drivers/virtio/virtio_ring.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index cd7e755484e3..321a27075380 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > * not work without an even larger kludge. Instead, enable
> > * the DMA API if we're a Xen guest, which at least allows
> > * all of the sensible Xen configurations to work correctly.
> > + *
> > + * Also, if guest memory is encrypted the host can't access
> > + * it directly. In this case, we'll need to use the DMA API.
> > */
> > - if (xen_domain())
> > + if (xen_domain() || sev_active())
> > return true;
> >
> > return false;

> --
> Thiago Jung Bauermann
> IBM Linux Technology Center

^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-01-29 17:08 ` Thiago Jung Bauermann (?) (?) @ 2019-01-29 17:42 ` Thiago Jung Bauermann -1 siblings, 0 replies; 198+ messages in thread From: Thiago Jung Bauermann @ 2019-01-29 17:42 UTC (permalink / raw) To: virtualization Cc: Michael S . Tsirkin, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, Paul Mackerras, iommu, linuxppc-dev, Christoph Hellwig, David Gibson Fixing address of powerpc mailing list. Thiago Jung Bauermann <bauerman@linux.ibm.com> writes: > Hello, > > With Christoph's rework of the DMA API that recently landed, the patch > below is the only change needed in virtio to make it work in a POWER > secure guest under the ultravisor. > > The other change we need (making sure the device's dma_map_ops is NULL > so that the dma-direct/swiotlb code is used) can be made in > powerpc-specific code. > > Of course, I also have patches (soon to be posted as RFC) which hook up > <linux/mem_encrypt.h> to the powerpc secure guest support code. > > What do you think? > > From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> > Date: Thu, 24 Jan 2019 22:08:02 -0200 > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted > > The host can't access the guest memory when it's encrypted, so using > regular memory pages for the ring isn't an option. Go through the DMA API. > > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > --- > drivers/virtio/virtio_ring.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index cd7e755484e3..321a27075380 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > * not work without an even larger kludge. 
Instead, enable > * the DMA API if we're a Xen guest, which at least allows > * all of the sensible Xen configurations to work correctly. > + * > + * Also, if guest memory is encrypted the host can't access > + * it directly. In this case, we'll need to use the DMA API. > */ > - if (xen_domain()) > + if (xen_domain() || sev_active()) > return true; > > return false; -- Thiago Jung Bauermann IBM Linux Technology Center ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-01-29 17:08 ` Thiago Jung Bauermann (?) @ 2019-08-10 18:57 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-08-10 18:57 UTC (permalink / raw) To: Thiago Jung Bauermann Cc: virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, Ram Pai On Tue, Jan 29, 2019 at 03:08:12PM -0200, Thiago Jung Bauermann wrote: > > Hello, > > With Christoph's rework of the DMA API that recently landed, the patch > below is the only change needed in virtio to make it work in a POWER > secure guest under the ultravisor. > > The other change we need (making sure the device's dma_map_ops is NULL > so that the dma-direct/swiotlb code is used) can be made in > powerpc-specific code. > > Of course, I also have patches (soon to be posted as RFC) which hook up > <linux/mem_encrypt.h> to the powerpc secure guest support code. > > What do you think? > > >From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> > Date: Thu, 24 Jan 2019 22:08:02 -0200 > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted > > The host can't access the guest memory when it's encrypted, so using > regular memory pages for the ring isn't an option. Go through the DMA API. > > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > --- > drivers/virtio/virtio_ring.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index cd7e755484e3..321a27075380 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > * not work without an even larger kludge. 
Instead, enable
> * the DMA API if we're a Xen guest, which at least allows
> * all of the sensible Xen configurations to work correctly.
> + *
> + * Also, if guest memory is encrypted the host can't access
> + * it directly. In this case, we'll need to use the DMA API.
> */
> - if (xen_domain())
> + if (xen_domain() || sev_active())
> return true;
>
> return false;

So I gave this lots of thought, and I'm coming round to basically
accepting something very similar to this patch.

But not exactly like this :).

Let's see what the requirements are. If

1. We do not trust the device (so we want to use a bounce buffer with it)
2. DMA address is also a physical address of a buffer

then we should use the DMA API with virtio.

sev_active() above is one way to put (1). I can't say I love it but
it's tolerable.

But we also want a promise from the DMA API about (2).

Without promise (2) we simply can't use the DMA API with a legacy
device. Otherwise, on a SEV system with an IOMMU which isn't 1:1 and
with a virtio device without ACCESS_PLATFORM, we are trying to pass a
virtual address, and devices without ACCESS_PLATFORM can only access CPU
physical addresses.

So something like:

	dma_addr_is_phys_addr?

--
MST

^ permalink raw reply [flat|nested] 198+ messages in thread
* RE: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-10 18:57 ` Michael S. Tsirkin (?) (?) @ 2019-08-10 22:07 ` Ram Pai -1 siblings, 0 replies; 198+ messages in thread From: Ram Pai @ 2019-08-10 22:07 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Benjamin Herrenschmidt, Alexey Kardashevskiy, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-devel, Christoph Hellwig, David Gibson On Sat, Aug 10, 2019 at 02:57:17PM -0400, Michael S. Tsirkin wrote: > On Tue, Jan 29, 2019 at 03:08:12PM -0200, Thiago Jung Bauermann wrote: > > > > Hello, > > > > With Christoph's rework of the DMA API that recently landed, the patch > > below is the only change needed in virtio to make it work in a POWER > > secure guest under the ultravisor. > > > > The other change we need (making sure the device's dma_map_ops is NULL > > so that the dma-direct/swiotlb code is used) can be made in > > powerpc-specific code. > > > > Of course, I also have patches (soon to be posted as RFC) which hook up > > <linux/mem_encrypt.h> to the powerpc secure guest support code. > > > > What do you think? > > > > >From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 > > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > Date: Thu, 24 Jan 2019 22:08:02 -0200 > > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted > > > > The host can't access the guest memory when it's encrypted, so using > > regular memory pages for the ring isn't an option. Go through the DMA API. 
> > > > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > --- > > drivers/virtio/virtio_ring.c | 5 ++++- > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > index cd7e755484e3..321a27075380 100644 > > --- a/drivers/virtio/virtio_ring.c > > +++ b/drivers/virtio/virtio_ring.c > > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > > * not work without an even larger kludge. Instead, enable > > * the DMA API if we're a Xen guest, which at least allows > > * all of the sensible Xen configurations to work correctly. > > + * > > + * Also, if guest memory is encrypted the host can't access > > + * it directly. In this case, we'll need to use the DMA API. > > */ > > - if (xen_domain()) > > + if (xen_domain() || sev_active()) > > return true; > > > > return false; > > So I gave this lots of thought, and I'm coming round to > basically accepting something very similar to this patch. > > But not exactly like this :). > > Let's see what are the requirements. > > If > > 1. We do not trust the device (so we want to use a bounce buffer with it) > 2. DMA address is also a physical address of a buffer > > then we should use DMA API with virtio. > > > sev_active() above is one way to put (1). I can't say I love it but > it's tolerable. > > > But we also want promise from DMA API about 2. > > > Without promise 2 we simply can't use DMA API with a legacy device. > > > Otherwise, on a SEV system with an IOMMU which isn't 1:1 > and with a virtio device without ACCESS_PLATFORM, we are trying > to pass a virtual address, and devices without ACCESS_PLATFORM > can only access CPU physical addresses. > > So something like: > > dma_addr_is_phys_addr? On our Secure pseries platform, dma address is physical address and this proposal will help us, use DMA API. On our normal pseries platform, dma address is physical address too. 
But we do not necessarily need to use the DMA API. We can use the DMA
API, but our handlers will do the same thing the generic virtio handlers
would do.

If there is an opt-out option, then even when the dma addr is the same
as the physical addr, there will be less code duplication. Would
something like this be better:

	(dma_addr_is_phys_addr && arch_want_to_use_dma_api()) ?

RP

> --
> MST

--
Ram Pai

^ permalink raw reply [flat|nested] 198+ messages in thread
* RE: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-10 18:57 ` Michael S. Tsirkin @ 2019-08-10 22:07 ` Ram Pai -1 siblings, 0 replies; 198+ messages in thread From: Ram Pai @ 2019-08-10 22:07 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Thiago Jung Bauermann, virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt On Sat, Aug 10, 2019 at 02:57:17PM -0400, Michael S. Tsirkin wrote: > On Tue, Jan 29, 2019 at 03:08:12PM -0200, Thiago Jung Bauermann wrote: > > > > Hello, > > > > With Christoph's rework of the DMA API that recently landed, the patch > > below is the only change needed in virtio to make it work in a POWER > > secure guest under the ultravisor. > > > > The other change we need (making sure the device's dma_map_ops is NULL > > so that the dma-direct/swiotlb code is used) can be made in > > powerpc-specific code. > > > > Of course, I also have patches (soon to be posted as RFC) which hook up > > <linux/mem_encrypt.h> to the powerpc secure guest support code. > > > > What do you think? > > > > >From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 > > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > Date: Thu, 24 Jan 2019 22:08:02 -0200 > > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted > > > > The host can't access the guest memory when it's encrypted, so using > > regular memory pages for the ring isn't an option. Go through the DMA API. 
> > > > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > --- > > drivers/virtio/virtio_ring.c | 5 ++++- > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > index cd7e755484e3..321a27075380 100644 > > --- a/drivers/virtio/virtio_ring.c > > +++ b/drivers/virtio/virtio_ring.c > > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > > * not work without an even larger kludge. Instead, enable > > * the DMA API if we're a Xen guest, which at least allows > > * all of the sensible Xen configurations to work correctly. > > + * > > + * Also, if guest memory is encrypted the host can't access > > + * it directly. In this case, we'll need to use the DMA API. > > */ > > - if (xen_domain()) > > + if (xen_domain() || sev_active()) > > return true; > > > > return false; > > So I gave this lots of thought, and I'm coming round to > basically accepting something very similar to this patch. > > But not exactly like this :). > > Let's see what are the requirements. > > If > > 1. We do not trust the device (so we want to use a bounce buffer with it) > 2. DMA address is also a physical address of a buffer > > then we should use DMA API with virtio. > > > sev_active() above is one way to put (1). I can't say I love it but > it's tolerable. > > > But we also want promise from DMA API about 2. > > > Without promise 2 we simply can't use DMA API with a legacy device. > > > Otherwise, on a SEV system with an IOMMU which isn't 1:1 > and with a virtio device without ACCESS_PLATFORM, we are trying > to pass a virtual address, and devices without ACCESS_PLATFORM > can only access CPU physical addresses. > > So something like: > > dma_addr_is_phys_addr? On our Secure pseries platform, dma address is physical address and this proposal will help us, use DMA API. On our normal pseries platform, dma address is physical address too. 
But we do not necessarily need to use the DMA API. We can use the DMA API, but our handlers will do the same thing the generic virtio handlers would do. If there is an opt-out option, even when the dma addr is the same as the physical addr, then there will be less code duplication. Would something like this be better? (dma_addr_is_phys_addr && arch_want_to_use_dma_api()) RP > -- > MST -- Ram Pai ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-10 22:07 ` Ram Pai (?) @ 2019-08-11 5:56 ` Christoph Hellwig -1 siblings, 0 replies; 198+ messages in thread From: Christoph Hellwig @ 2019-08-11 5:56 UTC (permalink / raw) To: Ram Pai Cc: Michael S. Tsirkin, Thiago Jung Bauermann, virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt sev_active() is gone now in linux-next, at least as a global API. And once again this is entirely going in the wrong direction. The only way using the DMA API is going to work at all is if the device is ready for it. So we need a flag on the virtio device, exposed by the hypervisor (or hardware for hw virtio devices) that says: hey, I'm real, don't take a shortcut. And that means on power and s390 qemu will always have to set those if you want to be ready for the ultravisor and co games. It's not like we haven't been through this a few times before, have we? ^ permalink raw reply [flat|nested] 198+ messages in thread
* RE: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-11 5:56 ` Christoph Hellwig (?) @ 2019-08-11 6:46 ` Ram Pai -1 siblings, 0 replies; 198+ messages in thread From: Ram Pai @ 2019-08-11 6:46 UTC (permalink / raw) To: Christoph Hellwig Cc: Michael S. Tsirkin, Thiago Jung Bauermann, virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote: > sev_active() is gone now in linux-next, at least as a global API. > > And once again this is entirely going in the wrong direction. The only > way using the DMA API is going to work at all is if the device is ready > for it. So we need a flag on the virtio device, exposed by the > hypervisor (or hardware for hw virtio devices) that says: hey, I'm real, > don't take a shortcut. > > And that means on power and s390 qemu will always have to set those if > you want to be ready for the ultravisor and co games. It's not like we > haven't been through this a few times before, have we? We have been through this so many times, but I don't think we ever understood each other. I have a fundamental question, the answer to which was never clear. Here it is... If the hypervisor (hardware for hw virtio devices) does not mandate a DMA API, why is it illegal for the driver to request special handling of its i/o buffers? Why are we associating this special handling with always meaning some DMA address translation? Can't there be any other kind of special handling need, one that has nothing to do with DMA address translation? -- Ram Pai ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-11 6:46 ` Ram Pai (?) @ 2019-08-11 8:44 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-08-11 8:44 UTC (permalink / raw) To: Ram Pai Cc: Christoph Hellwig, Thiago Jung Bauermann, virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt On Sat, Aug 10, 2019 at 11:46:21PM -0700, Ram Pai wrote: > On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote: > > sev_active() is gone now in linux-next, at least as a global API. > > > > And once again this is entirely going in the wrong direction. The only > > way using the DMA API is going to work at all is if the device is ready > > for it. So we need a flag on the virtio device, exposed by the > > hypervisor (or hardware for hw virtio devices) that says: hey, I'm real, > > don't take a shortcut. > > > > And that means on power and s390 qemu will always have to set those if > > you want to be ready for the ultravisor and co games. It's not like we > > haven't been through this a few times before, have we? > > > We have been through this so many times, but I don't think we ever > understood each other. I have a fundamental question, the answer to > which was never clear. Here it is... > > If the hypervisor (hardware for hw virtio devices) does not mandate a > DMA API, why is it illegal for the driver to request special handling > of its i/o buffers? Why are we associating this special handling > with always meaning some DMA address translation? Can't there be > any other kind of special handling need, one that has nothing to do with > DMA address translation? I think the answer to that is: extend the DMA API to cover that special need. And that's exactly what dma_addr_is_phys_addr is trying to do. > > -- > Ram Pai ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-11 6:46 ` Ram Pai ` (2 preceding siblings ...) (?) @ 2019-08-12 12:13 ` Christoph Hellwig -1 siblings, 0 replies; 198+ messages in thread From: Christoph Hellwig @ 2019-08-12 12:13 UTC (permalink / raw) To: Ram Pai Cc: Michael S. Tsirkin, Benjamin Herrenschmidt, Alexey Kardashevskiy, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-devel, Christoph Hellwig, David Gibson On Sat, Aug 10, 2019 at 11:46:21PM -0700, Ram Pai wrote: > If the hypervisor (hardware for hw virtio devices) does not mandate a > DMA API, why is it illegal for the driver to request special handling > of its i/o buffers? Why are we associating this special handling > with always meaning some DMA address translation? Can't there be > any other kind of special handling need, one that has nothing to do with > DMA address translation? I don't think it is illegal per se. It is however completely broken if we make that decision on a system-wide scale rather than properly requesting it through a per-device flag in the normal virtio framework. ^ permalink raw reply [flat|nested] 198+ messages in thread
* RE: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-12 12:13 ` Christoph Hellwig @ 2019-08-12 20:29 ` Ram Pai -1 siblings, 0 replies; 198+ messages in thread From: Ram Pai @ 2019-08-12 20:29 UTC (permalink / raw) To: Christoph Hellwig Cc: Michael S. Tsirkin, Thiago Jung Bauermann, virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt On Mon, Aug 12, 2019 at 02:13:24PM +0200, Christoph Hellwig wrote: > On Sat, Aug 10, 2019 at 11:46:21PM -0700, Ram Pai wrote: > > If the hypervisor (hardware for hw virtio devices) does not mandate a > > DMA API, why is it illegal for the driver to request special handling > > of its i/o buffers? Why are we associating this special handling > > with always meaning some DMA address translation? Can't there be > > any other kind of special handling need, one that has nothing to do with > > DMA address translation? > > I don't think it is illegal per se. It is however completely broken > if we make that decision on a system-wide scale rather than properly > requesting it through a per-device flag in the normal virtio framework. If the decision has to be system-wide, for reasons known only to the kernel/driver and independent of any device flag, what would be the mechanism? RP ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-11 5:56 ` Christoph Hellwig @ 2019-08-11 8:42 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-08-11 8:42 UTC (permalink / raw) To: Christoph Hellwig Cc: Ram Pai, Thiago Jung Bauermann, virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote: > And once again this is entirely going in the wrong direction. The only > way using the DMA API is going to work at all is if the device is ready > for it. So the point made is that if DMA addresses are also physical addresses (not necessarily the same physical addresses that the driver supplied), then the DMA API actually works even though the device itself uses CPU page tables. To put it in other terms: it would be possible to make all or part of memory unencrypted and then have virtio access all of it. SEV guests at the moment make a decision to instead use a bounce buffer, forcing an extra copy but gaining security. -- MST ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-11 5:56 ` Christoph Hellwig ` (4 preceding siblings ...) (?) @ 2019-08-11 8:55 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-08-11 8:55 UTC (permalink / raw) To: Christoph Hellwig Cc: Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, David Gibson On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote: > So we need a flag on the virtio device, exposed by the > hypervisor (or hardware for hw virtio devices) that says: hey, I'm real, > don't take a shortcut. The point here is that it's actually still not real. So we would still use a physical address. However Linux decides that it wants extra security by moving all data through the bounce buffer. The distinction made is that one can actually give device a physical address of the bounce buffer. -- MST ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-11 8:55 ` Michael S. Tsirkin (?) @ 2019-08-12 12:15 ` Christoph Hellwig -1 siblings, 0 replies; 198+ messages in thread From: Christoph Hellwig @ 2019-08-12 12:15 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, Christoph Hellwig, David Gibson On Sun, Aug 11, 2019 at 04:55:27AM -0400, Michael S. Tsirkin wrote: > On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote: > > So we need a flag on the virtio device, exposed by the > > hypervisor (or hardware for hw virtio devices) that says: hey, I'm real, > > don't take a shortcut. > > The point here is that it's actually still not real. So we would still > use a physical address. However Linux decides that it wants extra > security by moving all data through the bounce buffer. The distinction > made is that one can actually give device a physical address of the > bounce buffer. Sure. The problem is just that you keep piling hacks on top of hacks. We need the per-device flag anyway to properly support hardware virtio device in all circumstances. Instead of coming up with another ad-hoc hack to force DMA uses implement that one proper bit and reuse it here. ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-12 12:15 ` Christoph Hellwig (?) @ 2019-09-06 5:07 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-09-06 5:07 UTC (permalink / raw) To: Christoph Hellwig Cc: Ram Pai, Thiago Jung Bauermann, virtualization, iommu, linux-kernel, Jason Wang, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt On Mon, Aug 12, 2019 at 02:15:32PM +0200, Christoph Hellwig wrote: > On Sun, Aug 11, 2019 at 04:55:27AM -0400, Michael S. Tsirkin wrote: > > On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote: > > > So we need a flag on the virtio device, exposed by the > > > hypervisor (or hardware for hw virtio devices) that says: hey, I'm real, > > > don't take a shortcut. > > > > The point here is that it's actually still not real. So we would still > > use a physical address. However Linux decides that it wants extra > > security by moving all data through the bounce buffer. The distinction > > made is that one can actually give device a physical address of the > > bounce buffer. > > Sure. The problem is just that you keep piling hacks on top of hacks. > We need the per-device flag anyway to properly support hardware virtio > device in all circumstances. Instead of coming up with another ad-hoc > hack to force DMA uses implement that one proper bit and reuse it here. The flag that you mention literally means "I am a real device" so for example, you can use VFIO with it. And this device isn't a real one, and you can't use VFIO with it, even though it's part of a power system which always has an IOMMU. Or here's another way to put it: we have a broken device that can only access physical addresses, not DMA addresses. But to enable SEV Linux requires DMA API. So we can still make it work if DMA address happens to be a physical address (not necessarily of the same page). 
This is where dma_addr_is_a_phys_addr() is coming from: it tells us this weird configuration can still work. What are we going to do for SEV if dma_addr_is_a_phys_addr does not apply? Fail probe I guess. So the proposal is really to make things safe and to this end, to add this in probe: if (sev_active() && !dma_addr_is_a_phys_addr(dev) && !virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) return -EINVAL; the point being to prevent loading driver where it would corrupt guest memory. Put this way, any objections to adding dma_addr_is_a_phys_addr to the DMA API? -- MST ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-11 5:56 ` Christoph Hellwig ` (6 preceding siblings ...) (?) @ 2019-08-12 9:51 ` David Gibson -1 siblings, 0 replies; 198+ messages in thread From: David Gibson @ 2019-08-12 9:51 UTC (permalink / raw) To: Christoph Hellwig Cc: Michael S. Tsirkin, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-devel [-- Attachment #1.1: Type: text/plain, Size: 2064 bytes --] On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote: > sev_active() is gone now in linux-next, at least as a global API. > > And once again this is entirely going in the wrong direction. The only > way using the DMA API is going to work at all is if the device is ready > for it. So we need a flag on the virtio device, exposed by the > hypervisor (or hardware for hw virtio devices) that says: hey, I'm real, > don't take a shortcut. There still seems to be a failure to understand each other here. The limitation here simply *is not* a property of the device. In fact, it's effectively a property of the memory the virtio device would be trying to access (because it's in secure mode it can't be directly accessed via the hypervisor). There absolutely are cases where this is a device property (a physical virtio device being the obvious one), but this isn't one of them. Unfortunately, we're kind of stymied by the feature negotiation model of virtio. AIUI the hypervisor / device presents a bunch of feature bits of which the guest / driver selects a subset. AFAICT we already kind of abuse this for the VIRTIO_F_IOMMU_PLATFORM, because to handle for cases where it *is* a device limitation, we assume that if the hypervisor presents VIRTIO_F_IOMMU_PLATFORM then the guest *must* select it. What we actually need here is for the hypervisor to present VIRTIO_F_IOMMU_PLATFORM as available, but not required. 
Then we need a way for the platform core code to communicate to the virtio driver that *it* requires the IOMMU to be used, so that the driver can select or not the feature bit on that basis. > And that means on power and s390 qemu will always have to set thos if > you want to be ready for the ultravisor and co games. It's not like we > haven't been through this a few times before, have we? -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-12 9:51 ` David Gibson @ 2019-08-13 13:26 ` Christoph Hellwig -1 siblings, 0 replies; 198+ messages in thread From: Christoph Hellwig @ 2019-08-13 13:26 UTC (permalink / raw) To: David Gibson Cc: Christoph Hellwig, Ram Pai, Michael S. Tsirkin, Thiago Jung Bauermann, virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt On Mon, Aug 12, 2019 at 07:51:56PM +1000, David Gibson wrote: > AFAICT we already kind of abuse this for the VIRTIO_F_IOMMU_PLATFORM, > because to handle for cases where it *is* a device limitation, we > assume that if the hypervisor presents VIRTIO_F_IOMMU_PLATFORM then > the guest *must* select it. > > What we actually need here is for the hypervisor to present > VIRTIO_F_IOMMU_PLATFORM as available, but not required. Then we need > a way for the platform core code to communicate to the virtio driver > that *it* requires the IOMMU to be used, so that the driver can select > or not the feature bit on that basis. I agree with the above, but that just brings us back to the original issue - the whole bypass of the DMA OPS should be an option that the device can offer, not the other way around. And we really need to fix that root cause instead of doctoring around it. ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-13 13:26 ` Christoph Hellwig @ 2019-08-13 14:24 ` David Gibson -1 siblings, 0 replies; 198+ messages in thread From: David Gibson @ 2019-08-13 14:24 UTC (permalink / raw) To: Christoph Hellwig Cc: Ram Pai, Michael S. Tsirkin, Thiago Jung Bauermann, virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt [-- Attachment #1: Type: text/plain, Size: 1852 bytes --] On Tue, Aug 13, 2019 at 03:26:17PM +0200, Christoph Hellwig wrote: > On Mon, Aug 12, 2019 at 07:51:56PM +1000, David Gibson wrote: > > AFAICT we already kind of abuse this for the VIRTIO_F_IOMMU_PLATFORM, > > because to handle for cases where it *is* a device limitation, we > > assume that if the hypervisor presents VIRTIO_F_IOMMU_PLATFORM then > > the guest *must* select it. > > > > What we actually need here is for the hypervisor to present > > VIRTIO_F_IOMMU_PLATFORM as available, but not required. Then we need > > a way for the platform core code to communicate to the virtio driver > > that *it* requires the IOMMU to be used, so that the driver can select > > or not the feature bit on that basis. > > I agree with the above, but that just brings us back to the original > issue - the whole bypass of the DMA OPS should be an option that the > device can offer, not the other way around. And we really need to > fix that root cause instead of doctoring around it. I'm not exactly sure what you mean by "device" in this context. Do you mean the hypervisor (qemu) side implementation? You're right that this was the wrong way around to begin with, but as well as being hard to change now, I don't see how it really addresses the current problem. The device could default to IOMMU and allow bypass, but the driver would still need to get information from the platform to know that it *can't* accept that option in the case of a secure VM. 
Reversed sense, but the same basic problem. The hypervisor does not, and can not be aware of the secure VM restrictions - only the guest side platform code knows that. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 198+ messages in thread
* RE: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-13 14:24 ` David Gibson (?) @ 2019-08-13 15:45 ` Ram Pai -1 siblings, 0 replies; 198+ messages in thread From: Ram Pai @ 2019-08-13 15:45 UTC (permalink / raw) To: David Gibson Cc: Christoph Hellwig, Michael S. Tsirkin, Thiago Jung Bauermann, virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt On Wed, Aug 14, 2019 at 12:24:39AM +1000, David Gibson wrote: > On Tue, Aug 13, 2019 at 03:26:17PM +0200, Christoph Hellwig wrote: > > On Mon, Aug 12, 2019 at 07:51:56PM +1000, David Gibson wrote: > > > AFAICT we already kind of abuse this for the VIRTIO_F_IOMMU_PLATFORM, > > > because to handle for cases where it *is* a device limitation, we > > > assume that if the hypervisor presents VIRTIO_F_IOMMU_PLATFORM then > > > the guest *must* select it. > > > > > > What we actually need here is for the hypervisor to present > > > VIRTIO_F_IOMMU_PLATFORM as available, but not required. Then we need > > > a way for the platform core code to communicate to the virtio driver > > > that *it* requires the IOMMU to be used, so that the driver can select > > > or not the feature bit on that basis. > > > > I agree with the above, but that just brings us back to the original > > issue - the whole bypass of the DMA OPS should be an option that the > > device can offer, not the other way around. And we really need to > > fix that root cause instead of doctoring around it. > > I'm not exactly sure what you mean by "device" in this context. Do > you mean the hypervisor (qemu) side implementation? > > You're right that this was the wrong way around to begin with, but as > well as being hard to change now, I don't see how it really addresses > the current problem. 
The device could default to IOMMU and allow > bypass, but the driver would still need to get information from the > platform to know that it *can't* accept that option in the case of a > secure VM. Reversed sense, but the same basic problem. > > The hypervisor does not, and can not be aware of the secure VM > restrictions - only the guest side platform code knows that. This statement is almost entirely right. I will rephrase it to make it entirely right. The hypervisor does not, and can not be aware of the secure VM requirement that it needs to do some special processing that has nothing to do with DMA address translation - only the guest side platform code know that. BTW: I do not consider 'bounce buffering' as 'DMA address translation'. DMA address translation, translates CPU address to DMA address. Bounce buffering moves the data from one buffer at a given CPU address to another buffer at a different CPU address. Unfortunately the current DMA ops conflates the two. The need to do 'DMA address translation' is something the device can enforce. But the need to do bounce buffering, is something that the device should not be aware and should be entirely a decision made locally by the kernel/driver in the secure VM. RP > > -- > David Gibson | I'll have my music baroque, and my code > david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ > | _way_ _around_! > http://www.ozlabs.org/~dgibson -- Ram Pai ^ permalink raw reply [flat|nested] 198+ messages in thread
* RE: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-13 15:45 ` Ram Pai @ 2019-08-26 17:48 ` Ram Pai -1 siblings, 0 replies; 198+ messages in thread From: Ram Pai @ 2019-08-26 17:48 UTC (permalink / raw) To: David Gibson, Christoph Hellwig Cc: Michael S. Tsirkin, Thiago Jung Bauermann, virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt, ram.n.pai On Tue, Aug 13, 2019 at 08:45:37AM -0700, Ram Pai wrote: > On Wed, Aug 14, 2019 at 12:24:39AM +1000, David Gibson wrote: > > On Tue, Aug 13, 2019 at 03:26:17PM +0200, Christoph Hellwig wrote: > > > On Mon, Aug 12, 2019 at 07:51:56PM +1000, David Gibson wrote: > > > > AFAICT we already kind of abuse this for the VIRTIO_F_IOMMU_PLATFORM, > > > > because to handle for cases where it *is* a device limitation, we > > > > assume that if the hypervisor presents VIRTIO_F_IOMMU_PLATFORM then > > > > the guest *must* select it. > > > > > > > > What we actually need here is for the hypervisor to present > > > > VIRTIO_F_IOMMU_PLATFORM as available, but not required. Then we need > > > > a way for the platform core code to communicate to the virtio driver > > > > that *it* requires the IOMMU to be used, so that the driver can select > > > > or not the feature bit on that basis. > > > > > > I agree with the above, but that just brings us back to the original > > > issue - the whole bypass of the DMA OPS should be an option that the > > > device can offer, not the other way around. And we really need to > > > fix that root cause instead of doctoring around it. > > > > I'm not exactly sure what you mean by "device" in this context. Do > > you mean the hypervisor (qemu) side implementation? > > > > You're right that this was the wrong way around to begin with, but as > > well as being hard to change now, I don't see how it really addresses > > the current problem. 
The device could default to IOMMU and allow > > bypass, but the driver would still need to get information from the > > platform to know that it *can't* accept that option in the case of a > > secure VM. Reversed sense, but the same basic problem. > > > > The hypervisor does not, and can not be aware of the secure VM > > restrictions - only the guest side platform code knows that. > > This statement is almost entirely right. I will rephrase it to make it > entirely right. > > The hypervisor does not, and can not be aware of the secure VM > requirement that it needs to do some special processing that has nothing > to do with DMA address translation - only the guest side platform code > know that. > > BTW: I do not consider 'bounce buffering' as 'DMA address translation'. > DMA address translation, translates CPU address to DMA address. Bounce > buffering moves the data from one buffer at a given CPU address to > another buffer at a different CPU address. Unfortunately the current > DMA ops conflates the two. The need to do 'DMA address translation' > is something the device can enforce. But the need to do bounce > buffering, is something that the device should not be aware and should be > entirely a decision made locally by the kernel/driver in the secure VM. Christoph, Since we have not heard back from you, I am not sure where you stand on this issue now. One of three things is possible: (a) our explanation above did not make sense, and hence you decided to ignore it; (b) our explanation made some sense, and you need more time to think and respond; or (c) you totally forgot about this. I hope it is (b). We want a solution that works for everyone, and your input is important to us. Thanks, RP ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-13 13:26 ` Christoph Hellwig @ 2019-08-13 14:24 ` David Gibson -1 siblings, 0 replies; 198+ messages in thread From: David Gibson @ 2019-08-13 14:24 UTC (permalink / raw) To: Christoph Hellwig Cc: Michael S. Tsirkin, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-devel On Tue, Aug 13, 2019 at 03:26:17PM +0200, Christoph Hellwig wrote: > On Mon, Aug 12, 2019 at 07:51:56PM +1000, David Gibson wrote: > > AFAICT we already kind of abuse this for the VIRTIO_F_IOMMU_PLATFORM, > > because to handle for cases where it *is* a device limitation, we > > assume that if the hypervisor presents VIRTIO_F_IOMMU_PLATFORM then > > the guest *must* select it. > > > > What we actually need here is for the hypervisor to present > > VIRTIO_F_IOMMU_PLATFORM as available, but not required. Then we need > > a way for the platform core code to communicate to the virtio driver > > that *it* requires the IOMMU to be used, so that the driver can select > > or not the feature bit on that basis. > > I agree with the above, but that just brings us back to the original > issue - the whole bypass of the DMA OPS should be an option that the > device can offer, not the other way around. And we really need to > fix that root cause instead of doctoring around it. I'm not exactly sure what you mean by "device" in this context. Do you mean the hypervisor (qemu) side implementation? You're right that this was the wrong way around to begin with, but as well as being hard to change now, I don't see how it really addresses the current problem.
The hypervisor does not, and can not be aware of the secure VM restrictions - only the guest side platform code knows that. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-12 9:51 ` David Gibson (?) (?) @ 2019-08-13 13:26 ` Christoph Hellwig -1 siblings, 0 replies; 198+ messages in thread From: Christoph Hellwig @ 2019-08-13 13:26 UTC (permalink / raw) To: David Gibson Cc: Michael S. Tsirkin, Benjamin Herrenschmidt, Alexey Kardashevskiy, Ram Pai, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-devel, Christoph Hellwig On Mon, Aug 12, 2019 at 07:51:56PM +1000, David Gibson wrote: > AFAICT we already kind of abuse this for the VIRTIO_F_IOMMU_PLATFORM, > because to handle for cases where it *is* a device limitation, we > assume that if the hypervisor presents VIRTIO_F_IOMMU_PLATFORM then > the guest *must* select it. > > What we actually need here is for the hypervisor to present > VIRTIO_F_IOMMU_PLATFORM as available, but not required. Then we need > a way for the platform core code to communicate to the virtio driver > that *it* requires the IOMMU to be used, so that the driver can select > or not the feature bit on that basis. I agree with the above, but that just brings us back to the original issue - the whole bypass of the DMA OPS should be an option that the device can offer, not the other way around. And we really need to fix that root cause instead of doctoring around it. ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted 2019-08-10 22:07 ` Ram Pai (?) @ 2019-08-11 8:12 ` Michael S. Tsirkin -1 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-08-11 8:12 UTC (permalink / raw) To: Ram Pai Cc: Thiago Jung Bauermann, virtualization, linuxppc-devel, iommu, linux-kernel, Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy, Paul Mackerras, Benjamin Herrenschmidt On Sat, Aug 10, 2019 at 03:07:02PM -0700, Ram Pai wrote: > On Sat, Aug 10, 2019 at 02:57:17PM -0400, Michael S. Tsirkin wrote: > > On Tue, Jan 29, 2019 at 03:08:12PM -0200, Thiago Jung Bauermann wrote: > > > > > > Hello, > > > > > > With Christoph's rework of the DMA API that recently landed, the patch > > > below is the only change needed in virtio to make it work in a POWER > > > secure guest under the ultravisor. > > > > > > The other change we need (making sure the device's dma_map_ops is NULL > > > so that the dma-direct/swiotlb code is used) can be made in > > > powerpc-specific code. > > > > > > Of course, I also have patches (soon to be posted as RFC) which hook up > > > <linux/mem_encrypt.h> to the powerpc secure guest support code. > > > > > > What do you think? > > > > > > >From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 > > > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > > Date: Thu, 24 Jan 2019 22:08:02 -0200 > > > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted > > > > > > The host can't access the guest memory when it's encrypted, so using > > > regular memory pages for the ring isn't an option. Go through the DMA API. 
> > > > > > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > > --- > > > drivers/virtio/virtio_ring.c | 5 ++++- > > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > > index cd7e755484e3..321a27075380 100644 > > > --- a/drivers/virtio/virtio_ring.c > > > +++ b/drivers/virtio/virtio_ring.c > > > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > > > * not work without an even larger kludge. Instead, enable > > > * the DMA API if we're a Xen guest, which at least allows > > > * all of the sensible Xen configurations to work correctly. > > > + * > > > + * Also, if guest memory is encrypted the host can't access > > > + * it directly. In this case, we'll need to use the DMA API. > > > */ > > > - if (xen_domain()) > > > + if (xen_domain() || sev_active()) > > > return true; > > > > > > return false; > > > > So I gave this lots of thought, and I'm coming round to > > basically accepting something very similar to this patch. > > > > But not exactly like this :). > > > > Let's see what are the requirements. > > > > If > > > > 1. We do not trust the device (so we want to use a bounce buffer with it) > > 2. DMA address is also a physical address of a buffer > > > > then we should use DMA API with virtio. > > > > > > sev_active() above is one way to put (1). I can't say I love it but > > it's tolerable. > > > > > > But we also want promise from DMA API about 2. > > > > > > Without promise 2 we simply can't use DMA API with a legacy device. > > > > > > Otherwise, on a SEV system with an IOMMU which isn't 1:1 > > and with a virtio device without ACCESS_PLATFORM, we are trying > > to pass a virtual address, and devices without ACCESS_PLATFORM > > can only access CPU physical addresses. > > > > So something like: > > > > dma_addr_is_phys_addr? 
> > > On our Secure pseries platform, dma address is physical address and this > proposal will help us, use DMA API. > > On our normal pseries platform, dma address is physical address too. > But we do not necessarily need to use the DMA API. We can use the DMA > API, but our handlers will do the same thing, the generic virtio handlers > would do. If there is an opt-out option; even when dma addr is same as > physical addr, than there will be less code duplication. > > Would something like this be better. > > (dma_addr_is_phys_addr && arch_want_to_use_dma_api()) ? > > > RP I think sev_active() is an OK replacement for arch_want_to_use_dma_api. So just the addition of dma_addr_is_phys_addr would be enough. > > > -- > > MST > > -- > Ram Pai ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-08-11 8:12 ` Michael S. Tsirkin 0 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-08-11 8:12 UTC (permalink / raw) To: Ram Pai Cc: Benjamin Herrenschmidt, Alexey Kardashevskiy, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-devel, Christoph Hellwig, David Gibson On Sat, Aug 10, 2019 at 03:07:02PM -0700, Ram Pai wrote: > On Sat, Aug 10, 2019 at 02:57:17PM -0400, Michael S. Tsirkin wrote: > > On Tue, Jan 29, 2019 at 03:08:12PM -0200, Thiago Jung Bauermann wrote: > > > > > > Hello, > > > > > > With Christoph's rework of the DMA API that recently landed, the patch > > > below is the only change needed in virtio to make it work in a POWER > > > secure guest under the ultravisor. > > > > > > The other change we need (making sure the device's dma_map_ops is NULL > > > so that the dma-direct/swiotlb code is used) can be made in > > > powerpc-specific code. > > > > > > Of course, I also have patches (soon to be posted as RFC) which hook up > > > <linux/mem_encrypt.h> to the powerpc secure guest support code. > > > > > > What do you think? > > > > > > >From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 > > > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > > Date: Thu, 24 Jan 2019 22:08:02 -0200 > > > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted > > > > > > The host can't access the guest memory when it's encrypted, so using > > > regular memory pages for the ring isn't an option. Go through the DMA API. 
> > > > > > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > > --- > > > drivers/virtio/virtio_ring.c | 5 ++++- > > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > > index cd7e755484e3..321a27075380 100644 > > > --- a/drivers/virtio/virtio_ring.c > > > +++ b/drivers/virtio/virtio_ring.c > > > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev) > > > * not work without an even larger kludge. Instead, enable > > > * the DMA API if we're a Xen guest, which at least allows > > > * all of the sensible Xen configurations to work correctly. > > > + * > > > + * Also, if guest memory is encrypted the host can't access > > > + * it directly. In this case, we'll need to use the DMA API. > > > */ > > > - if (xen_domain()) > > > + if (xen_domain() || sev_active()) > > > return true; > > > > > > return false; > > > > So I gave this lots of thought, and I'm coming round to > > basically accepting something very similar to this patch. > > > > But not exactly like this :). > > > > Let's see what are the requirements. > > > > If > > > > 1. We do not trust the device (so we want to use a bounce buffer with it) > > 2. DMA address is also a physical address of a buffer > > > > then we should use DMA API with virtio. > > > > > > sev_active() above is one way to put (1). I can't say I love it but > > it's tolerable. > > > > > > But we also want promise from DMA API about 2. > > > > > > Without promise 2 we simply can't use DMA API with a legacy device. > > > > > > Otherwise, on a SEV system with an IOMMU which isn't 1:1 > > and with a virtio device without ACCESS_PLATFORM, we are trying > > to pass a virtual address, and devices without ACCESS_PLATFORM > > can only access CPU physical addresses. > > > > So something like: > > > > dma_addr_is_phys_addr? 
> > > On our Secure pseries platform, dma address is physical address and this > proposal will help us, use DMA API. > > On our normal pseries platform, dma address is physical address too. > But we do not necessarily need to use the DMA API. We can use the DMA > API, but our handlers will do the same thing, the generic virtio handlers > would do. If there is an opt-out option; even when dma addr is same as > physical addr, than there will be less code duplication. > > Would something like this be better. > > (dma_addr_is_phys_addr && arch_want_to_use_dma_api()) ? > > > RP I think sev_active() is an OK replacement for arch_want_to_use_dma_api. So just the addition of dma_addr_is_phys_addr would be enough. > > > -- > > MST > > -- > Ram Pai ^ permalink raw reply [flat|nested] 198+ messages in thread
* Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted @ 2019-08-11 8:12 ` Michael S. Tsirkin 0 siblings, 0 replies; 198+ messages in thread From: Michael S. Tsirkin @ 2019-08-11 8:12 UTC (permalink / raw) To: Ram Pai Cc: Benjamin Herrenschmidt, Jason Wang, Alexey Kardashevskiy, linux-kernel, virtualization, Paul Mackerras, iommu, linuxppc-devel, Christoph Hellwig, David Gibson On Sat, Aug 10, 2019 at 03:07:02PM -0700, Ram Pai wrote: > On Sat, Aug 10, 2019 at 02:57:17PM -0400, Michael S. Tsirkin wrote: > > On Tue, Jan 29, 2019 at 03:08:12PM -0200, Thiago Jung Bauermann wrote: > > > > > > Hello, > > > > > > With Christoph's rework of the DMA API that recently landed, the patch > > > below is the only change needed in virtio to make it work in a POWER > > > secure guest under the ultravisor. > > > > > > The other change we need (making sure the device's dma_map_ops is NULL > > > so that the dma-direct/swiotlb code is used) can be made in > > > powerpc-specific code. > > > > > > Of course, I also have patches (soon to be posted as RFC) which hook up > > > <linux/mem_encrypt.h> to the powerpc secure guest support code. > > > > > > What do you think? > > > > > > >From d0629a36a75c678b4a72b853f8f7f8c17eedd6b3 Mon Sep 17 00:00:00 2001 > > > From: Thiago Jung Bauermann <bauerman@linux.ibm.com> > > > Date: Thu, 24 Jan 2019 22:08:02 -0200 > > > Subject: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted > > > > > > The host can't access the guest memory when it's encrypted, so using > > > regular memory pages for the ring isn't an option. Go through the DMA API. 
> > >
> > > Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> > > ---
> > >  drivers/virtio/virtio_ring.c | 5 ++++-
> > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index cd7e755484e3..321a27075380 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -259,8 +259,11 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  	 * not work without an even larger kludge. Instead, enable
> > >  	 * the DMA API if we're a Xen guest, which at least allows
> > >  	 * all of the sensible Xen configurations to work correctly.
> > > +	 *
> > > +	 * Also, if guest memory is encrypted the host can't access
> > > +	 * it directly. In this case, we'll need to use the DMA API.
> > >  	 */
> > > -	if (xen_domain())
> > > +	if (xen_domain() || sev_active())
> > >  		return true;
> > >
> > >  	return false;
> >
> > So I gave this lots of thought, and I'm coming round to
> > basically accepting something very similar to this patch.
> >
> > But not exactly like this :).
> >
> > Let's see what the requirements are.
> >
> > If
> >
> > 1. We do not trust the device (so we want to use a bounce buffer with it)
> > 2. DMA address is also a physical address of a buffer
> >
> > then we should use DMA API with virtio.
> >
> > sev_active() above is one way to put (1). I can't say I love it but
> > it's tolerable.
> >
> > But we also want a promise from the DMA API about 2.
> >
> > Without promise 2 we simply can't use DMA API with a legacy device.
> >
> > Otherwise, on a SEV system with an IOMMU which isn't 1:1
> > and with a virtio device without ACCESS_PLATFORM, we are trying
> > to pass a virtual address, and devices without ACCESS_PLATFORM
> > can only access CPU physical addresses.
> >
> > So something like:
> >
> > dma_addr_is_phys_addr?
>
> On our Secure pseries platform, dma address is physical address and this
> proposal will help us use the DMA API.
>
> On our normal pseries platform, dma address is physical address too.
> But we do not necessarily need to use the DMA API. We can use the DMA
> API, but our handlers will do the same thing the generic virtio handlers
> would do. If there is an opt-out option, even when the dma addr is the same
> as the physical addr, then there will be less code duplication.
>
> Would something like this be better?
>
> (dma_addr_is_phys_addr && arch_want_to_use_dma_api()) ?
>
> RP

I think sev_active() is an OK replacement for arch_want_to_use_dma_api.
So just the addition of dma_addr_is_phys_addr would be enough.

> > --
> > MST
>
> --
> Ram Pai

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply [flat|nested] 198+ messages in thread
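The policy the subthread above converges on — use the DMA API when the device needs a bounce buffer (sev_active()) *and* a DMA address is also a CPU physical address — can be modeled as plain C outside the kernel. The sketch below is a hedged illustration only: `xen_domain()` and `sev_active()` are real kernel helpers, but `dma_addr_is_phys_addr` is merely the proposal made in this thread, and all three predicates are stubbed here as struct fields rather than real platform queries.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the vring_use_dma_api() decision discussed in this
 * thread.  None of these fields correspond to real kernel interfaces;
 * they stand in for the platform queries named in the discussion.
 */
struct platform {
	bool xen_domain;            /* running as a Xen guest */
	bool sev_active;            /* guest memory is encrypted */
	bool dma_addr_is_phys_addr; /* bus address == CPU physical address */
};

/* Policy as in the RFC patch: any encrypted guest uses the DMA API. */
static bool use_dma_api_rfc(const struct platform *p)
{
	return p->xen_domain || p->sev_active;
}

/*
 * Refinement from this subthread: an encrypted guest may only take the
 * DMA API path for a legacy (non-ACCESS_PLATFORM) device if a DMA
 * address is also a physical address; otherwise the device would be
 * handed addresses it cannot access.
 */
static bool use_dma_api_refined(const struct platform *p)
{
	if (p->xen_domain)
		return true;
	return p->sev_active && p->dma_addr_is_phys_addr;
}
```

On a secure pseries guest (encrypted memory, DMA address equal to physical address) both versions choose the DMA API; they diverge only for an encrypted guest behind a non-1:1 IOMMU, where the refined check declines the legacy path.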
end of thread, other threads:[~2019-09-06  5:08 UTC | newest]

Thread overview: 198+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-29 17:08 [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted Thiago Jung Bauermann
2019-01-29 17:42 ` Thiago Jung Bauermann
2019-01-29 19:02 ` Michael S. Tsirkin
2019-01-30  2:24 ` Jason Wang
2019-01-30  2:36 ` Michael S. Tsirkin
2019-01-30  3:05 ` Jason Wang
2019-01-30  3:26 ` Michael S. Tsirkin
2019-01-30  7:44 ` Christoph Hellwig
2019-02-04 18:15 ` Thiago Jung Bauermann
2019-02-04 21:38 ` Michael S. Tsirkin
2019-02-05  7:24 ` Christoph Hellwig
2019-02-05 16:13 ` Michael S. Tsirkin
2019-03-26 16:53 ` Michael S. Tsirkin
2019-02-04 18:14 ` Thiago Jung Bauermann
2019-02-04 20:23 ` Michael S. Tsirkin
2019-03-20 16:13 ` Thiago Jung Bauermann
2019-03-20 21:17 ` Michael S. Tsirkin
2019-03-22  0:05 ` Thiago Jung Bauermann
2019-03-23 21:01 ` Michael S. Tsirkin
2019-03-25  0:57 ` David Gibson
2019-04-17 21:42 ` Thiago Jung Bauermann
2019-04-19 23:09 ` Michael S. Tsirkin
2019-04-25  1:01 ` Thiago Jung Bauermann
2019-04-25  1:18 ` Michael S. Tsirkin
2019-04-26 23:56 ` Thiago Jung Bauermann
2019-05-20 13:08 ` Michael S. Tsirkin
2019-05-20 13:16 ` Michael S. Tsirkin
2019-06-04  1:13 ` Thiago Jung Bauermann
2019-06-04  1:42 ` Michael S. Tsirkin
2019-06-28  1:58 ` Thiago Jung Bauermann
2019-07-01 14:17 ` Michael S. Tsirkin
2019-07-14  5:51 ` Thiago Jung Bauermann
2019-07-15 14:35 ` Michael S. Tsirkin
2019-07-15 20:29 ` Thiago Jung Bauermann
2019-07-15 20:36 ` Michael S. Tsirkin
2019-07-15 22:03 ` Thiago Jung Bauermann
2019-07-15 22:16 ` Michael S. Tsirkin
2019-07-15 23:05 ` Thiago Jung Bauermann
2019-07-15 23:24 ` Benjamin Herrenschmidt
2019-07-18  3:39 ` Thiago Jung Bauermann
2019-08-10 18:57 ` Michael S. Tsirkin
2019-08-10 22:07 ` Ram Pai
2019-08-11  5:56 ` Christoph Hellwig
2019-08-11  6:46 ` Ram Pai
2019-08-11  8:44 ` Michael S. Tsirkin
2019-08-12 12:13 ` Christoph Hellwig
2019-08-12 20:29 ` Ram Pai
2019-08-11  8:42 ` Michael S. Tsirkin
2019-08-11  8:55 ` Michael S. Tsirkin
2019-08-12 12:15 ` Christoph Hellwig
2019-09-06  5:07 ` Michael S. Tsirkin
2019-08-12  9:51 ` David Gibson
2019-08-13 13:26 ` Christoph Hellwig
2019-08-13 14:24 ` David Gibson
2019-08-13 15:45 ` Ram Pai
2019-08-26 17:48 ` Ram Pai
2019-08-11  8:12 ` Michael S. Tsirkin

-- strict thread matches above, loose matches on Subject: below --
2019-01-29 17:08 Thiago Jung Bauermann