linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] virtio: Force DMA restricted devices through DMA API
@ 2022-07-19 10:02 Keir Fraser
  2022-07-19 11:56 ` Michael S. Tsirkin
  2022-07-19 15:23 ` Christoph Hellwig
  0 siblings, 2 replies; 13+ messages in thread
From: Keir Fraser @ 2022-07-19 10:02 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang
  Cc: kernel-team, Keir Fraser, virtualization, linux-kernel

If virtio devices are tagged for "restricted-dma-pool", then that
pool should be used for virtio ring setup, via the DMA API.

In particular, this fixes virtio_balloon for ARM PKVM, where the usual
workaround of setting VIRTIO_F_ACCESS_PLATFORM in the virtio device
doesn't work because the virtio_balloon driver clears the flag. This
seems a more robust fix than fiddling the flag again.

Signed-off-by: Keir Fraser <keirf@google.com>
---
 drivers/virtio/virtio_ring.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index a5ec724c01d8..12be2607c648 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -12,6 +12,7 @@
 #include <linux/hrtimer.h>
 #include <linux/dma-mapping.h>
 #include <linux/spinlock.h>
+#include <linux/swiotlb.h>
 #include <xen/xen.h>
 
 #ifdef DEBUG
@@ -248,6 +249,13 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
 	if (!virtio_has_dma_quirk(vdev))
 		return true;
 
+	/* If the device is configured to use a DMA restricted pool,
+	 * we had better use it.
+	 */
+	if (IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL) &&
+	    is_swiotlb_for_alloc(vdev->dev.parent))
+		return true;
+
 	/* Otherwise, we are left to guess. */
 	/*
 	 * In theory, it's possible to have a buggy QEMU-supposed
-- 
2.37.0.170.g444d1eabd0-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-19 10:02 [PATCH] virtio: Force DMA restricted devices through DMA API Keir Fraser
@ 2022-07-19 11:56 ` Michael S. Tsirkin
  2022-07-19 14:05   ` Keir Fraser
  2022-07-19 15:23 ` Christoph Hellwig
  1 sibling, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2022-07-19 11:56 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Jason Wang, kernel-team, virtualization, linux-kernel

On Tue, Jul 19, 2022 at 10:02:56AM +0000, Keir Fraser wrote:
> If virtio devices are tagged for "restricted-dma-pool", then that
> pool should be used for virtio ring setup, via the DMA API.
> 
> In particular, this fixes virtio_balloon for ARM PKVM, where the usual
> workaround of setting VIRTIO_F_ACCESS_PLATFORM in the virtio device
> doesn't work because the virtio_balloon driver clears the flag. This
> seems a more robust fix than fiddling the flag again.
> 
> Signed-off-by: Keir Fraser <keirf@google.com>


So the reason balloon disables ACCESS_PLATFORM is simply
because it passes physical addresses to device and
expects device to be able to poke at them.

I worry about modifying DMA semantics yet again - it has as much of a
chance to break some legacy configs as it has to fix some.


And I don't really know much about restricted-dma-pool but
I'd like to understand why does it make sense to set it for
the balloon since it pokes at all and any system memory.


> ---
>  drivers/virtio/virtio_ring.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index a5ec724c01d8..12be2607c648 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -12,6 +12,7 @@
>  #include <linux/hrtimer.h>
>  #include <linux/dma-mapping.h>
>  #include <linux/spinlock.h>
> +#include <linux/swiotlb.h>
>  #include <xen/xen.h>
>  
>  #ifdef DEBUG
> @@ -248,6 +249,13 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>  	if (!virtio_has_dma_quirk(vdev))
>  		return true;
>  
> +	/* If the device is configured to use a DMA restricted pool,
> +	 * we had better use it.
> +	 */
> +	if (IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL) &&
> +	    is_swiotlb_for_alloc(vdev->dev.parent))
> +		return true;
> +
>  	/* Otherwise, we are left to guess. */
>  	/*
>  	 * In theory, it's possible to have a buggy QEMU-supposed
> -- 
> 2.37.0.170.g444d1eabd0-goog


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-19 11:56 ` Michael S. Tsirkin
@ 2022-07-19 14:05   ` Keir Fraser
  2022-07-19 21:31     ` Michael S. Tsirkin
  0 siblings, 1 reply; 13+ messages in thread
From: Keir Fraser @ 2022-07-19 14:05 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Jason Wang, kernel-team, virtualization, linux-kernel

On Tue, Jul 19, 2022 at 07:56:09AM -0400, Michael S. Tsirkin wrote:
> On Tue, Jul 19, 2022 at 10:02:56AM +0000, Keir Fraser wrote:
> > If virtio devices are tagged for "restricted-dma-pool", then that
> > pool should be used for virtio ring setup, via the DMA API.
> > 
> > In particular, this fixes virtio_balloon for ARM PKVM, where the usual
> > workaround of setting VIRTIO_F_ACCESS_PLATFORM in the virtio device
> > doesn't work because the virtio_balloon driver clears the flag. This
> > seems a more robust fix than fiddling the flag again.
> > 
> > Signed-off-by: Keir Fraser <keirf@google.com>
> 
> 
> So the reason balloon disables ACCESS_PLATFORM is simply
> because it passes physical addresses to device and
> expects device to be able to poke at them.
> 
> I worry about modifying DMA semantics yet again - it has as much of a
> chance to break some legacy configs as it has to fix some.
> 
> 
> And I don't really know much about restricted-dma-pool but
> I'd like to understand why does it make sense to set it for
> the balloon since it pokes at all and any system memory.

So this is set in the device tree by the host, telling it to bounce all DMA
through a restricted memory window (basically swiotlb). The original reason
is simply to isolate DMA, to the extent possible, on IOMMU-less systems.

However it is also useful for PKVM because the host is not trusted to access
ordinary protected VM memory. To allow I/O via the host, restricted-dma-pool
is used to cause a bounce aperture to be allocated during VM boot, which is
then explicitly shared with the host. For correct PKVM virtio operation, all
data *and metadata* (virtio rings and descriptors) must be allocated in or
bounced through this aperture.

Insofar as virtio device accesses to virtio rings in guest memory essentially
*are* DMA (from the pov of the guest), I think it makes sense to respect the
bounce buffer for those rings, if so configured by the device tree.

> > ---
> >  drivers/virtio/virtio_ring.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index a5ec724c01d8..12be2607c648 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -12,6 +12,7 @@
> >  #include <linux/hrtimer.h>
> >  #include <linux/dma-mapping.h>
> >  #include <linux/spinlock.h>
> > +#include <linux/swiotlb.h>
> >  #include <xen/xen.h>
> >  
> >  #ifdef DEBUG
> > @@ -248,6 +249,13 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  	if (!virtio_has_dma_quirk(vdev))
> >  		return true;
> >  
> > +	/* If the device is configured to use a DMA restricted pool,
> > +	 * we had better use it.
> > +	 */
> > +	if (IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL) &&
> > +	    is_swiotlb_for_alloc(vdev->dev.parent))
> > +		return true;
> > +
> >  	/* Otherwise, we are left to guess. */
> >  	/*
> >  	 * In theory, it's possible to have a buggy QEMU-supposed
> > -- 
> > 2.37.0.170.g444d1eabd0-goog
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-19 10:02 [PATCH] virtio: Force DMA restricted devices through DMA API Keir Fraser
  2022-07-19 11:56 ` Michael S. Tsirkin
@ 2022-07-19 15:23 ` Christoph Hellwig
  2022-07-19 15:46   ` Keir Fraser
  1 sibling, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2022-07-19 15:23 UTC (permalink / raw)
  To: Keir Fraser
  Cc: Michael S. Tsirkin, Jason Wang, kernel-team, virtualization,
	linux-kernel

On Tue, Jul 19, 2022 at 10:02:56AM +0000, Keir Fraser wrote:
> +#include <linux/swiotlb.h>

Drivers must never use this header.  We have a few pre-existing abuses
in the drm code, but they will go away.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-19 15:23 ` Christoph Hellwig
@ 2022-07-19 15:46   ` Keir Fraser
  2022-07-19 15:51     ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Keir Fraser @ 2022-07-19 15:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Michael S. Tsirkin, Jason Wang, kernel-team, virtualization,
	linux-kernel

On Tue, Jul 19, 2022 at 08:23:28AM -0700, Christoph Hellwig wrote:
> On Tue, Jul 19, 2022 at 10:02:56AM +0000, Keir Fraser wrote:
> > +#include <linux/swiotlb.h>
> 
> Drivers must never use this header.  We have a few pre-existing abuses
> in the drm code, but they will go away.
> 

Ok fair enough, and I'll admit I don't like my use of
swiotlb_for_alloc() a lot either.

However, if the general idea at least is acceptable, would the
implementation be acceptable if I add an explicit API for this to the
DMA subsystem, and hide the detail there?

Or a completely different approach would be to revert the patch
e41b1355508d which clears VIRTIO_F_ACCESS_PLATFORM in the balloon
driver. MST: That's back in your court, as it's your patch!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-19 15:46   ` Keir Fraser
@ 2022-07-19 15:51     ` Christoph Hellwig
  2022-07-19 16:11       ` Keir Fraser
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2022-07-19 15:51 UTC (permalink / raw)
  To: Keir Fraser
  Cc: Christoph Hellwig, Michael S. Tsirkin, Jason Wang, kernel-team,
	virtualization, linux-kernel

On Tue, Jul 19, 2022 at 03:46:08PM +0000, Keir Fraser wrote:
> However, if the general idea at least is acceptable, would the
> implementation be acceptable if I add an explicit API for this to the
> DMA subsystem, and hide the detail there?

I don't think so.  The right thing to key off is
VIRTIO_F_ACCESS_PLATFORM, which really should be set in any modern
virtio device after all the problems we had with the lack of it.

> Or a completely different approach would be to revert the patch
> e41b1355508d which clears VIRTIO_F_ACCESS_PLATFORM in the balloon
> driver. MST: That's back in your court, as it's your patch!

Which also means this needs to be addresses, but I don't think a
simple revert is enough.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-19 15:51     ` Christoph Hellwig
@ 2022-07-19 16:11       ` Keir Fraser
  2022-07-20  5:16         ` Christoph Hellwig
  2022-07-20  6:59         ` Michael S. Tsirkin
  0 siblings, 2 replies; 13+ messages in thread
From: Keir Fraser @ 2022-07-19 16:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Michael S. Tsirkin, Jason Wang, kernel-team, virtualization,
	linux-kernel

On Tue, Jul 19, 2022 at 08:51:54AM -0700, Christoph Hellwig wrote:
> On Tue, Jul 19, 2022 at 03:46:08PM +0000, Keir Fraser wrote:
> > However, if the general idea at least is acceptable, would the
> > implementation be acceptable if I add an explicit API for this to the
> > DMA subsystem, and hide the detail there?
> 
> I don't think so.  The right thing to key off is
> VIRTIO_F_ACCESS_PLATFORM, which really should be set in any modern
> virtio device after all the problems we had with the lack of it.

Ok. Certainly the flag description in virtio spec fits the bill.

> > Or a completely different approach would be to revert the patch
> > e41b1355508d which clears VIRTIO_F_ACCESS_PLATFORM in the balloon
> > driver. MST: That's back in your court, as it's your patch!
> 
> Which also means this needs to be addresses, but I don't think a
> simple revert is enough.

Well here are two possible approaches:

1. Revert e41b1355508d outright. I'm not even sure what it would mean
for reported pages to go through IOMMU. And VIRTIO_F_ACCESS_PLATFORM
is no longer IOMMU-specific anyway.

2. Continue to clear the flag during virtio_balloon negotiation, but
remember that it was offered, and test for that in vring_use_dma_api()
as well as, or instead of, virtio_has_dma_quirk().

Do either of those appeal?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-19 14:05   ` Keir Fraser
@ 2022-07-19 21:31     ` Michael S. Tsirkin
  0 siblings, 0 replies; 13+ messages in thread
From: Michael S. Tsirkin @ 2022-07-19 21:31 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Jason Wang, kernel-team, virtualization, linux-kernel

On Tue, Jul 19, 2022 at 02:05:58PM +0000, Keir Fraser wrote:
> On Tue, Jul 19, 2022 at 07:56:09AM -0400, Michael S. Tsirkin wrote:
> > On Tue, Jul 19, 2022 at 10:02:56AM +0000, Keir Fraser wrote:
> > > If virtio devices are tagged for "restricted-dma-pool", then that
> > > pool should be used for virtio ring setup, via the DMA API.
> > > 
> > > In particular, this fixes virtio_balloon for ARM PKVM, where the usual
> > > workaround of setting VIRTIO_F_ACCESS_PLATFORM in the virtio device
> > > doesn't work because the virtio_balloon driver clears the flag. This
> > > seems a more robust fix than fiddling the flag again.
> > > 
> > > Signed-off-by: Keir Fraser <keirf@google.com>
> > 
> > 
> > So the reason balloon disables ACCESS_PLATFORM is simply
> > because it passes physical addresses to device and
> > expects device to be able to poke at them.
> > 
> > I worry about modifying DMA semantics yet again - it has as much of a
> > chance to break some legacy configs as it has to fix some.
> > 
> > 
> > And I don't really know much about restricted-dma-pool but
> > I'd like to understand why does it make sense to set it for
> > the balloon since it pokes at all and any system memory.
> 
> So this is set in the device tree by the host, telling it to bounce all DMA
> through a restricted memory window (basically swiotlb). The original reason
> is simply to isolate DMA, to the extent possible, on IOMMU-less systems.
> 
> However it is also useful for PKVM because the host is not trusted to access
> ordinary protected VM memory.

I'll have to read up on pKVM. Will get back to you.

> To allow I/O via the host, restricted-dma-pool
> is used to cause a bounce aperture to be allocated during VM boot, which is
> then explicitly shared with the host. For correct PKVM virtio operation, all
> data *and metadata* (virtio rings and descriptors) must be allocated in or
> bounced through this aperture.
>
> Insofar as virtio device accesses to virtio rings in guest memory essentially
> *are* DMA (from the pov of the guest), I think it makes sense to respect the
> bounce buffer for those rings, if so configured by the device tree.
>
> > > ---
> > >  drivers/virtio/virtio_ring.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index a5ec724c01d8..12be2607c648 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -12,6 +12,7 @@
> > >  #include <linux/hrtimer.h>
> > >  #include <linux/dma-mapping.h>
> > >  #include <linux/spinlock.h>
> > > +#include <linux/swiotlb.h>
> > >  #include <xen/xen.h>
> > >  
> > >  #ifdef DEBUG
> > > @@ -248,6 +249,13 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  	if (!virtio_has_dma_quirk(vdev))
> > >  		return true;
> > >  
> > > +	/* If the device is configured to use a DMA restricted pool,
> > > +	 * we had better use it.
> > > +	 */
> > > +	if (IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL) &&
> > > +	    is_swiotlb_for_alloc(vdev->dev.parent))
> > > +		return true;
> > > +
> > >  	/* Otherwise, we are left to guess. */
> > >  	/*
> > >  	 * In theory, it's possible to have a buggy QEMU-supposed
> > > -- 
> > > 2.37.0.170.g444d1eabd0-goog
> > 
> > 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-19 16:11       ` Keir Fraser
@ 2022-07-20  5:16         ` Christoph Hellwig
  2022-07-20  6:59         ` Michael S. Tsirkin
  1 sibling, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2022-07-20  5:16 UTC (permalink / raw)
  To: Keir Fraser
  Cc: Christoph Hellwig, Michael S. Tsirkin, Jason Wang, kernel-team,
	virtualization, linux-kernel

On Tue, Jul 19, 2022 at 04:11:50PM +0000, Keir Fraser wrote:
> Well here are two possible approaches:
> 
> 1. Revert e41b1355508d outright. I'm not even sure what it would mean
> for reported pages to go through IOMMU. And VIRTIO_F_ACCESS_PLATFORM
> is no longer IOMMU-specific anyway.
> 
> 2. Continue to clear the flag during virtio_balloon negotiation, but
> remember that it was offered, and test for that in vring_use_dma_api()
> as well as, or instead of, virtio_has_dma_quirk().
> 
> Do either of those appeal?

I'll have to defer to people that actually understand the virtio_balloon
code, because I haven't evey seriously looked at it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-19 16:11       ` Keir Fraser
  2022-07-20  5:16         ` Christoph Hellwig
@ 2022-07-20  6:59         ` Michael S. Tsirkin
  2022-07-20  8:27           ` Keir Fraser
  1 sibling, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2022-07-20  6:59 UTC (permalink / raw)
  To: Keir Fraser
  Cc: Christoph Hellwig, Jason Wang, kernel-team, virtualization, linux-kernel

On Tue, Jul 19, 2022 at 04:11:50PM +0000, Keir Fraser wrote:
> On Tue, Jul 19, 2022 at 08:51:54AM -0700, Christoph Hellwig wrote:
> > On Tue, Jul 19, 2022 at 03:46:08PM +0000, Keir Fraser wrote:
> > > However, if the general idea at least is acceptable, would the
> > > implementation be acceptable if I add an explicit API for this to the
> > > DMA subsystem, and hide the detail there?
> > 
> > I don't think so.  The right thing to key off is
> > VIRTIO_F_ACCESS_PLATFORM, which really should be set in any modern
> > virtio device after all the problems we had with the lack of it.
> 
> Ok. Certainly the flag description in virtio spec fits the bill.

Maybe. But balloon really isn't a normal device. Yes the rings kind of
operate more or less normally. However consider for example free page
hinting. We stick a page address in ring for purposes of telling host it
can blow it away at any later time unless we write something into the
page.  Free page reporting is similar.
Sending gigabytes of memory through swiotlb is at minimum not
a good idea.

Conversely, is it even okay security wise that host can blow away any
guest page?  Because with balloon GFNs do not go through bounce
buffering.



> > > Or a completely different approach would be to revert the patch
> > > e41b1355508d which clears VIRTIO_F_ACCESS_PLATFORM in the balloon
> > > driver. MST: That's back in your court, as it's your patch!
> > 
> > Which also means this needs to be addresses, but I don't think a
> > simple revert is enough.
> 
> Well here are two possible approaches:
> 
> 1. Revert e41b1355508d outright. I'm not even sure what it would mean
> for reported pages to go through IOMMU. And VIRTIO_F_ACCESS_PLATFORM
> is no longer IOMMU-specific anyway.
> 
> 2. Continue to clear the flag during virtio_balloon negotiation, but
> remember that it was offered, and test for that in vring_use_dma_api()
> as well as, or instead of, virtio_has_dma_quirk().
> 
> Do either of those appeal?

I think the use case needs to be documented better.


-- 
MST


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-20  6:59         ` Michael S. Tsirkin
@ 2022-07-20  8:27           ` Keir Fraser
  2022-07-20  9:58             ` Michael S. Tsirkin
  0 siblings, 1 reply; 13+ messages in thread
From: Keir Fraser @ 2022-07-20  8:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christoph Hellwig, Jason Wang, kernel-team, virtualization, linux-kernel

On Wed, Jul 20, 2022 at 02:59:53AM -0400, Michael S. Tsirkin wrote:
> On Tue, Jul 19, 2022 at 04:11:50PM +0000, Keir Fraser wrote:
> > On Tue, Jul 19, 2022 at 08:51:54AM -0700, Christoph Hellwig wrote:
> > > On Tue, Jul 19, 2022 at 03:46:08PM +0000, Keir Fraser wrote:
> > > > However, if the general idea at least is acceptable, would the
> > > > implementation be acceptable if I add an explicit API for this to the
> > > > DMA subsystem, and hide the detail there?
> > > 
> > > I don't think so.  The right thing to key off is
> > > VIRTIO_F_ACCESS_PLATFORM, which really should be set in any modern
> > > virtio device after all the problems we had with the lack of it.
> > 
> > Ok. Certainly the flag description in virtio spec fits the bill.
> 
> Maybe. But balloon really isn't a normal device. Yes the rings kind of
> operate more or less normally. However consider for example free page
> hinting. We stick a page address in ring for purposes of telling host it
> can blow it away at any later time unless we write something into the
> page.  Free page reporting is similar.
> Sending gigabytes of memory through swiotlb is at minimum not
> a good idea.

I don't think any balloon use case needs the page's guest data to be
made available to the host device. What the device side does with
reported guest-physical page addresses is somewhat VMM/Hyp specific,
but I think it's fair to say it will know how to reclaim or track
pages by guest PA, and bouncing reported/hinted page addresses through
a SWIOTLB or IOMMU would not be of any use to any use case I can think
of.

As far as I can see, the only translation that makes sense at all for
virtio balloon is in ring management.

> Conversely, is it even okay security wise that host can blow away any
> guest page?  Because with balloon GFNs do not go through bounce
> buffering.

Ok, I think this is fair and I can address that by describing the use
case more broadly.

The short answer is that there will be more patches forthcoming,
because the balloon driver will need to tell the hypervisor (EL2 Hyp
in the ARM PKVM case) that is is willingly relinquishing memory
pages. So, there will be a patch to add a new HVC to PKVM Hyp, and a
patch to detect and use the new HVC via a new API that hooks into
Linux's balloon infrastructure.

So the use case is that virtio_balloon needs to set up its rings and
descriptors in a shared memory region, as requested via
dma-restricted-pool and the VIRTIO_F_PALTFORM_ACCESS flag. This is
required because the host device has no access to regular guest
memory.

However, in PKVM, page notifications will notify both the (trusted)
Hypervisor, via hypercall, and the (untrusted) VMM, via virtio. Guest
physical addresses are fine here. The VMM understands guest PAs
perfectly well, it's just not normally allowed to access their
contents or otherwise map or use those pages itself.

> 
> > > > Or a completely different approach would be to revert the patch
> > > > e41b1355508d which clears VIRTIO_F_ACCESS_PLATFORM in the balloon
> > > > driver. MST: That's back in your court, as it's your patch!
> > > 
> > > Which also means this needs to be addresses, but I don't think a
> > > simple revert is enough.
> > 
> > Well here are two possible approaches:
> > 
> > 1. Revert e41b1355508d outright. I'm not even sure what it would mean
> > for reported pages to go through IOMMU. And VIRTIO_F_ACCESS_PLATFORM
> > is no longer IOMMU-specific anyway.
> > 
> > 2. Continue to clear the flag during virtio_balloon negotiation, but
> > remember that it was offered, and test for that in vring_use_dma_api()
> > as well as, or instead of, virtio_has_dma_quirk().
> > 
> > Do either of those appeal?
> 
> I think the use case needs to be documented better.

I hope the above is helpful in giving some more context.

Perhaps it would make more sense to re-submit this patch as part of
a larger series that includes the rest of the mechanism needed to
support virtio_balloon on PKVM?

Thanks,
Keir

> 
> -- 
> MST
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-20  8:27           ` Keir Fraser
@ 2022-07-20  9:58             ` Michael S. Tsirkin
  2022-07-21  7:37               ` Keir Fraser
  0 siblings, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2022-07-20  9:58 UTC (permalink / raw)
  To: Keir Fraser
  Cc: Christoph Hellwig, Jason Wang, kernel-team, virtualization, linux-kernel

On Wed, Jul 20, 2022 at 08:27:51AM +0000, Keir Fraser wrote:
> On Wed, Jul 20, 2022 at 02:59:53AM -0400, Michael S. Tsirkin wrote:
> > On Tue, Jul 19, 2022 at 04:11:50PM +0000, Keir Fraser wrote:
> > > On Tue, Jul 19, 2022 at 08:51:54AM -0700, Christoph Hellwig wrote:
> > > > On Tue, Jul 19, 2022 at 03:46:08PM +0000, Keir Fraser wrote:
> > > > > However, if the general idea at least is acceptable, would the
> > > > > implementation be acceptable if I add an explicit API for this to the
> > > > > DMA subsystem, and hide the detail there?
> > > > 
> > > > I don't think so.  The right thing to key off is
> > > > VIRTIO_F_ACCESS_PLATFORM, which really should be set in any modern
> > > > virtio device after all the problems we had with the lack of it.
> > > 
> > > Ok. Certainly the flag description in virtio spec fits the bill.
> > 
> > Maybe. But balloon really isn't a normal device. Yes the rings kind of
> > operate more or less normally. However consider for example free page
> > hinting. We stick a page address in ring for purposes of telling host it
> > can blow it away at any later time unless we write something into the
> > page.  Free page reporting is similar.
> > Sending gigabytes of memory through swiotlb is at minimum not
> > a good idea.
> 
> I don't think any balloon use case needs the page's guest data to be
> made available to the host device. What the device side does with
> reported guest-physical page addresses is somewhat VMM/Hyp specific,
> but I think it's fair to say it will know how to reclaim or track
> pages by guest PA, and bouncing reported/hinted page addresses through
> a SWIOTLB or IOMMU would not be of any use to any use case I can think
> of.
> 
> As far as I can see, the only translation that makes sense at all for
> virtio balloon is in ring management.
> 
> > Conversely, is it even okay security wise that host can blow away any
> > guest page?  Because with balloon GFNs do not go through bounce
> > buffering.
> 
> Ok, I think this is fair and I can address that by describing the use
> case more broadly.
> 
> The short answer is that there will be more patches forthcoming,
> because the balloon driver will need to tell the hypervisor (EL2 Hyp
> in the ARM PKVM case) that is is willingly relinquishing memory
> pages. So, there will be a patch to add a new HVC to PKVM Hyp, and a
> patch to detect and use the new HVC via a new API that hooks into
> Linux's balloon infrastructure.
> 
> So the use case is that virtio_balloon needs to set up its rings and
> descriptors in a shared memory region, as requested via
> dma-restricted-pool and the VIRTIO_F_PALTFORM_ACCESS flag. This is
> required because the host device has no access to regular guest
> memory.
> 
> However, in PKVM, page notifications will notify both the (trusted)
> Hypervisor, via hypercall, and the (untrusted) VMM, via virtio. Guest
> physical addresses are fine here. The VMM understands guest PAs
> perfectly well, it's just not normally allowed to access their
> contents or otherwise map or use those pages itself.

OK ... and I wonder whether this extends the balloon device
in some way? Is there or can there be a new feature bit for this
functionality?


> > 
> > > > > Or a completely different approach would be to revert the patch
> > > > > e41b1355508d which clears VIRTIO_F_ACCESS_PLATFORM in the balloon
> > > > > driver. MST: That's back in your court, as it's your patch!
> > > > 
> > > > Which also means this needs to be addresses, but I don't think a
> > > > simple revert is enough.
> > > 
> > > Well here are two possible approaches:
> > > 
> > > 1. Revert e41b1355508d outright. I'm not even sure what it would mean
> > > for reported pages to go through IOMMU. And VIRTIO_F_ACCESS_PLATFORM
> > > is no longer IOMMU-specific anyway.
> > > 
> > > 2. Continue to clear the flag during virtio_balloon negotiation, but
> > > remember that it was offered, and test for that in vring_use_dma_api()
> > > as well as, or instead of, virtio_has_dma_quirk().
> > > 
> > > Do either of those appeal?
> > 
> > I think the use case needs to be documented better.
> 
> I hope the above is helpful in giving some more context.
> 
> Perhaps it would make more sense to re-submit this patch as part of
> a larger series that includes the rest of the mechanism needed to
> support virtio_balloon on PKVM?
> 
> Thanks,
> Keir

I suspect so, yes.


> > 
> > -- 
> > MST
> > 
> > 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] virtio: Force DMA restricted devices through DMA API
  2022-07-20  9:58             ` Michael S. Tsirkin
@ 2022-07-21  7:37               ` Keir Fraser
  0 siblings, 0 replies; 13+ messages in thread
From: Keir Fraser @ 2022-07-21  7:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christoph Hellwig, Jason Wang, kernel-team, virtualization, linux-kernel

On Wed, Jul 20, 2022 at 05:58:28AM -0400, Michael S. Tsirkin wrote:
> On Wed, Jul 20, 2022 at 08:27:51AM +0000, Keir Fraser wrote:
> > The short answer is that there will be more patches forthcoming,
> > because the balloon driver will need to tell the hypervisor (EL2 Hyp
> > in the ARM PKVM case) that is is willingly relinquishing memory
> > pages. So, there will be a patch to add a new HVC to PKVM Hyp, and a
> > patch to detect and use the new HVC via a new API that hooks into
> > Linux's balloon infrastructure.
> > 
> > So the use case is that virtio_balloon needs to set up its rings and
> > descriptors in a shared memory region, as requested via
> > dma-restricted-pool and the VIRTIO_F_PALTFORM_ACCESS flag. This is
> > required because the host device has no access to regular guest
> > memory.
> > 
> > However, in PKVM, page notifications will notify both the (trusted)
> > Hypervisor, via hypercall, and the (untrusted) VMM, via virtio. Guest
> > physical addresses are fine here. The VMM understands guest PAs
> > perfectly well, it's just not normally allowed to access their
> > contents or otherwise map or use those pages itself.
> 
> OK ... and I wonder whether this extends the balloon device
> in some way? Is there or can there be a new feature bit for this
> functionality?

To my mind it is implementation dependent whether the balloon device
needs to be extended. In my current implementation it is not, and
probably it will continue to be entirely handled host-side by the host
kernel and hypervisor. But we should consider the possibility of
requiring knowledge/extension in the device for sure.

Currently there is no feature flag for the new Hyp-notification path
on the driver side. The notification hypercall is hidden behind a new
API, and there is an init/probe call on that API by which the driver
unilaterally decides the extended path including Hyp notification is to
be used. One downside of this is that the device cannot detect a
legacy driver that lacks knowledge of the extended PKVM path. Any
pages returned by a legacy driver will simply crash the VMM, because
the hypervisor is still protecting those pages. A rather inelegant
failure mode!

I can envision a new feature flag that:

1. Is advertised by the device (makes sense: the VMM surely knows that
it is managing a protected VM).

2. Is Ack'ed by an aware driver, and which switches on the extended
notification path.

3. Is not negotiated by a legacy driver, causing the device to clear
FEATURES_OK, and the balloon is unavailable.

A balloon-specific flag called perhaps VIRTIO_F_BALLOON_UNTRUSTED_HOST,
or VIRTIO_F_BALLOON_NOTIFY_HYP, or somesuch? In some senses it's not a
balloon-specific piece of information, but it's only a pertinent feature
for balloon (at least as of now).

My understanding is that the first step in upstreaming such a new flag
would be to get it accepted into the virtio specification? If so and
this sounds agreeable, I'll rework my private patches, and cook up an
extension to the virtio spec. If an RFC posting of the patches here is
preferred before posting to the virtio-spec list, I can do that too.

> > Perhaps it would make more sense to re-submit this patch as part of
> > a larger series that includes the rest of the mechanism needed to
> > support virtio_balloon on PKVM?
> > 
> > Thanks,
> > Keir
> 
> I suspect so, yes.

Thanks for your review feedback. I will submit a full patch series in
due course.

Regards,
Keir

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-07-21  7:37 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-19 10:02 [PATCH] virtio: Force DMA restricted devices through DMA API Keir Fraser
2022-07-19 11:56 ` Michael S. Tsirkin
2022-07-19 14:05   ` Keir Fraser
2022-07-19 21:31     ` Michael S. Tsirkin
2022-07-19 15:23 ` Christoph Hellwig
2022-07-19 15:46   ` Keir Fraser
2022-07-19 15:51     ` Christoph Hellwig
2022-07-19 16:11       ` Keir Fraser
2022-07-20  5:16         ` Christoph Hellwig
2022-07-20  6:59         ` Michael S. Tsirkin
2022-07-20  8:27           ` Keir Fraser
2022-07-20  9:58             ` Michael S. Tsirkin
2022-07-21  7:37               ` Keir Fraser

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).