All of lore.kernel.org
 help / color / mirror / Atom feed
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	virtualization@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	aik@ozlabs.ru, robh@kernel.org, joe@perches.com,
	elfring@users.sourceforge.net, david@gibson.dropbear.id.au,
	jasowang@redhat.com, mpe@ellerman.id.au, hch@infradead.org
Subject: Re: [RFC V2] virtio: Add platform specific DMA API translation for virito devices
Date: Mon, 04 Jun 2018 23:11:52 +1000	[thread overview]
Message-ID: <e7ceddbec11711a89282e9b70b7fd3c8af10b030.camel@kernel.crashing.org> (raw)
In-Reply-To: <20180604153558-mutt-send-email-mst@kernel.org>

On Mon, 2018-06-04 at 15:43 +0300, Michael S. Tsirkin wrote:
> On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> > 
> > > I re-read that discussion and I'm still unclear on the
> > > original question, since I got several apparently
> > > conflicting answers.
> > > 
> > > I asked:
> > > 
> > > 	Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > > 	hypervisor side sufficient?
> > 
> > I thought I had replied to this...
> > 
> > There are a couple of reasons:
> > 
> > - First qemu doesn't know that the guest will switch to "secure mode"
> > in advance. There is no difference between a normal and a secure
> > partition until the partition does the magic UV call to "enter secure
> > mode" and qemu doesn't see any of it. So who can set the flag here ?
> 
> The user should set it. You just tell user "to be able to use with
> feature X, enable IOMMU".

That's completely backwards. The user has no idea what that stuff is.
And it would have to percolate all the way up the management stack,
libvirt, kimchi, whatever else ... that's just nonsense.

Especially since, as I explained in my other email, this is *not* a
qemu problem and thus the solution shouldn't be messing around with
qemu.

> 
> > - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> > vhost) go through the emulated MMIO for every access to the guest,
> > which adds additional overhead.
> > 
> > Cheers,
> > Ben.
> 
> There are several answers to this.  One is that we are working hard to
> make overhead small when the mappings are static (which they would be if
> there's no actual IOMMU). So maybe especially given you are using
> a bounce buffer on top it's not so bad - did you try to
> benchmark?
> 
> Another is that given the basic functionality is in there, optimizations
> can possibly wait until per-device quirks in DMA API are supported.

The point is that requiring specific qemu command line arguments isn't
going to fly. We have additional problems due to the fact that our
firmware (SLOF) inside qemu doesn't currently deal with iommu's etc...
though those can be fixed.

Overall, however, this seems to be the most convoluted way of achieving
things, require user interventions where none should be needed etc...

Again, what's wrong with a 2 lines hook instead that solves it all and
completely avoids involving qemu ?

Ben.

> 
> > > 
> > > 
> > > >  arch/powerpc/include/asm/dma-mapping.h |  6 ++++++
> > > >  arch/powerpc/platforms/pseries/iommu.c | 11 +++++++++++
> > > >  drivers/virtio/virtio_ring.c           | 10 ++++++++++
> > > >  3 files changed, 27 insertions(+)
> > > > 
> > > > diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
> > > > index 8fa3945..056e578 100644
> > > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > > @@ -115,4 +115,10 @@ extern u64 __dma_get_required_mask(struct device *dev);
> > > >  #define ARCH_HAS_DMA_MMAP_COHERENT
> > > >  
> > > >  #endif /* __KERNEL__ */
> > > > +
> > > > +#define platform_forces_virtio_dma platform_forces_virtio_dma
> > > > +
> > > > +struct virtio_device;
> > > > +
> > > > +extern bool platform_forces_virtio_dma(struct virtio_device *vdev);
> > > >  #endif	/* _ASM_DMA_MAPPING_H */
> > > > diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> > > > index 06f0296..a2ec15a 100644
> > > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > > @@ -38,6 +38,7 @@
> > > >  #include <linux/of.h>
> > > >  #include <linux/iommu.h>
> > > >  #include <linux/rculist.h>
> > > > +#include <linux/virtio.h>
> > > >  #include <asm/io.h>
> > > >  #include <asm/prom.h>
> > > >  #include <asm/rtas.h>
> > > > @@ -1396,3 +1397,13 @@ static int __init disable_multitce(char *str)
> > > >  __setup("multitce=", disable_multitce);
> > > >  
> > > >  machine_subsys_initcall_sync(pseries, tce_iommu_bus_notifier_init);
> > > > +
> > > > +bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > +{
> > > > +	/*
> > > > +	 * On protected guest platforms, force virtio core to use DMA
> > > > +	 * MAP API for all virtio devices. But there can also be some
> > > > +	 * exceptions for individual devices like virtio balloon.
> > > > +	 */
> > > > +	return (of_find_compatible_node(NULL, NULL, "ibm,ultravisor") != NULL);
> > > > +}
> > > 
> > > Isn't this kind of slow?  vring_use_dma_api is on
> > > data path and supposed to be very fast.
> > > 
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index 21d464a..47ea6c3 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -141,8 +141,18 @@ struct vring_virtqueue {
> > > >   * unconditionally on data path.
> > > >   */
> > > >  
> > > > +#ifndef platform_forces_virtio_dma
> > > > +static inline bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > +{
> > > > +	return false;
> > > > +}
> > > > +#endif
> > > > +
> > > >  static bool vring_use_dma_api(struct virtio_device *vdev)
> > > >  {
> > > > +	if (platform_forces_virtio_dma(vdev))
> > > > +		return true;
> > > > +
> > > >  	if (!virtio_has_iommu_quirk(vdev))
> > > >  		return true;
> > > >  
> > > > -- 
> > > > 2.9.3

WARNING: multiple messages have this Message-ID (diff)
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: robh@kernel.org, mpe@ellerman.id.au,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, hch@infradead.org,
	joe@perches.com, david@gibson.dropbear.id.au,
	linuxppc-dev@lists.ozlabs.org, elfring@users.sourceforge.net,
	Anshuman Khandual <khandual@linux.vnet.ibm.com>
Subject: Re: [RFC V2] virtio: Add platform specific DMA API translation for virito devices
Date: Mon, 04 Jun 2018 23:11:52 +1000	[thread overview]
Message-ID: <e7ceddbec11711a89282e9b70b7fd3c8af10b030.camel@kernel.crashing.org> (raw)
In-Reply-To: <20180604153558-mutt-send-email-mst@kernel.org>

On Mon, 2018-06-04 at 15:43 +0300, Michael S. Tsirkin wrote:
> On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> > 
> > > I re-read that discussion and I'm still unclear on the
> > > original question, since I got several apparently
> > > conflicting answers.
> > > 
> > > I asked:
> > > 
> > > 	Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > > 	hypervisor side sufficient?
> > 
> > I thought I had replied to this...
> > 
> > There are a couple of reasons:
> > 
> > - First qemu doesn't know that the guest will switch to "secure mode"
> > in advance. There is no difference between a normal and a secure
> > partition until the partition does the magic UV call to "enter secure
> > mode" and qemu doesn't see any of it. So who can set the flag here ?
> 
> The user should set it. You just tell user "to be able to use with
> feature X, enable IOMMU".

That's completely backwards. The user has no idea what that stuff is.
And it would have to percolate all the way up the management stack,
libvirt, kimchi, whatever else ... that's just nonsense.

Especially since, as I explained in my other email, this is *not* a
qemu problem and thus the solution shouldn't be messing around with
qemu.

> 
> > - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> > vhost) go through the emulated MMIO for every access to the guest,
> > which adds additional overhead.
> > 
> > Cheers,
> > Ben.
> 
> There are several answers to this.  One is that we are working hard to
> make overhead small when the mappings are static (which they would be if
> there's no actual IOMMU). So maybe especially given you are using
> a bounce buffer on top it's not so bad - did you try to
> benchmark?
> 
> Another is that given the basic functionality is in there, optimizations
> can possibly wait until per-device quirks in DMA API are supported.

The point is that requiring specific qemu command line arguments isn't
going to fly. We have additional problems due to the fact that our
firmware (SLOF) inside qemu doesn't currently deal with iommu's etc...
though those can be fixed.

Overall, however, this seems to be the most convoluted way of achieving
things, require user interventions where none should be needed etc...

Again, what's wrong with a 2 lines hook instead that solves it all and
completely avoids involving qemu ?

Ben.

> 
> > > 
> > > 
> > > >  arch/powerpc/include/asm/dma-mapping.h |  6 ++++++
> > > >  arch/powerpc/platforms/pseries/iommu.c | 11 +++++++++++
> > > >  drivers/virtio/virtio_ring.c           | 10 ++++++++++
> > > >  3 files changed, 27 insertions(+)
> > > > 
> > > > diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
> > > > index 8fa3945..056e578 100644
> > > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > > @@ -115,4 +115,10 @@ extern u64 __dma_get_required_mask(struct device *dev);
> > > >  #define ARCH_HAS_DMA_MMAP_COHERENT
> > > >  
> > > >  #endif /* __KERNEL__ */
> > > > +
> > > > +#define platform_forces_virtio_dma platform_forces_virtio_dma
> > > > +
> > > > +struct virtio_device;
> > > > +
> > > > +extern bool platform_forces_virtio_dma(struct virtio_device *vdev);
> > > >  #endif	/* _ASM_DMA_MAPPING_H */
> > > > diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> > > > index 06f0296..a2ec15a 100644
> > > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > > @@ -38,6 +38,7 @@
> > > >  #include <linux/of.h>
> > > >  #include <linux/iommu.h>
> > > >  #include <linux/rculist.h>
> > > > +#include <linux/virtio.h>
> > > >  #include <asm/io.h>
> > > >  #include <asm/prom.h>
> > > >  #include <asm/rtas.h>
> > > > @@ -1396,3 +1397,13 @@ static int __init disable_multitce(char *str)
> > > >  __setup("multitce=", disable_multitce);
> > > >  
> > > >  machine_subsys_initcall_sync(pseries, tce_iommu_bus_notifier_init);
> > > > +
> > > > +bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > +{
> > > > +	/*
> > > > +	 * On protected guest platforms, force virtio core to use DMA
> > > > +	 * MAP API for all virtio devices. But there can also be some
> > > > +	 * exceptions for individual devices like virtio balloon.
> > > > +	 */
> > > > +	return (of_find_compatible_node(NULL, NULL, "ibm,ultravisor") != NULL);
> > > > +}
> > > 
> > > Isn't this kind of slow?  vring_use_dma_api is on
> > > data path and supposed to be very fast.
> > > 
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index 21d464a..47ea6c3 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -141,8 +141,18 @@ struct vring_virtqueue {
> > > >   * unconditionally on data path.
> > > >   */
> > > >  
> > > > +#ifndef platform_forces_virtio_dma
> > > > +static inline bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > +{
> > > > +	return false;
> > > > +}
> > > > +#endif
> > > > +
> > > >  static bool vring_use_dma_api(struct virtio_device *vdev)
> > > >  {
> > > > +	if (platform_forces_virtio_dma(vdev))
> > > > +		return true;
> > > > +
> > > >  	if (!virtio_has_iommu_quirk(vdev))
> > > >  		return true;
> > > >  
> > > > -- 
> > > > 2.9.3

  parent reply	other threads:[~2018-06-04 13:14 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-22  6:33 [RFC V2] virtio: Add platform specific DMA API translation for virito devices Anshuman Khandual
2018-05-23 18:50 ` Michael S. Tsirkin
2018-05-23 18:50 ` Michael S. Tsirkin
2018-05-23 22:27   ` Benjamin Herrenschmidt
2018-05-23 22:27     ` Benjamin Herrenschmidt
2018-05-24  7:17     ` Christoph Hellwig
2018-05-24  7:17     ` Christoph Hellwig
2018-05-25 17:45     ` Michael S. Tsirkin
2018-05-28 23:48       ` Benjamin Herrenschmidt
2018-05-28 23:48         ` Benjamin Herrenschmidt
2018-05-28 23:56         ` Benjamin Herrenschmidt
2018-05-28 23:56           ` Benjamin Herrenschmidt
2018-05-29 14:03           ` Christoph Hellwig
2018-05-29 14:03           ` Christoph Hellwig
2018-05-29 22:13             ` Benjamin Herrenschmidt
2018-05-29 22:13               ` Benjamin Herrenschmidt
2018-05-25 17:45     ` Michael S. Tsirkin
2018-06-04  8:57     ` David Gibson
2018-06-04  8:57     ` David Gibson
2018-06-04  9:48       ` Benjamin Herrenschmidt
2018-06-04  9:48         ` Benjamin Herrenschmidt
2018-06-04 12:50         ` Michael S. Tsirkin
2018-06-04 12:50         ` Michael S. Tsirkin
2018-06-05  1:52         ` David Gibson
2018-06-05  1:52         ` David Gibson
2018-06-04 12:43     ` Michael S. Tsirkin
2018-06-04 12:55       ` Christoph Hellwig
2018-06-04 13:14         ` Benjamin Herrenschmidt
2018-06-04 13:14           ` Benjamin Herrenschmidt
2018-06-04 16:34           ` Michael S. Tsirkin
2018-06-04 16:34           ` Michael S. Tsirkin
2018-06-04 12:55       ` Christoph Hellwig
2018-06-04 13:11       ` Benjamin Herrenschmidt [this message]
2018-06-04 13:11         ` Benjamin Herrenschmidt
2018-06-04 16:21         ` Michael S. Tsirkin
2018-06-04 23:26           ` Benjamin Herrenschmidt
2018-06-04 23:26             ` Benjamin Herrenschmidt
2018-06-05  1:25             ` Michael S. Tsirkin
2018-06-05  1:25             ` Michael S. Tsirkin
2018-06-05  4:52             ` Christoph Hellwig
2018-06-05  4:52             ` Christoph Hellwig
2018-06-04 16:21         ` Michael S. Tsirkin
2018-06-04 12:43     ` Michael S. Tsirkin
2018-05-24  7:21   ` Ram Pai
2018-05-31  3:39     ` Anshuman Khandual
2018-05-31 17:43       ` Michael S. Tsirkin
2018-05-31 17:43       ` Michael S. Tsirkin
2018-06-07  5:23         ` Christoph Hellwig
2018-06-07 16:28           ` Michael S. Tsirkin
2018-06-08  6:36             ` Christoph Hellwig
2018-06-08  6:36             ` Christoph Hellwig
2018-06-13 13:49               ` Michael S. Tsirkin
2018-06-13 13:49                 ` Michael S. Tsirkin
2018-06-11  2:39             ` Ram Pai
2018-06-11  3:28               ` Michael S. Tsirkin
2018-06-11  3:28                 ` Michael S. Tsirkin
2018-06-11  3:34                 ` Benjamin Herrenschmidt
2018-06-11  3:34                   ` Benjamin Herrenschmidt
2018-06-13 14:23                   ` Michael S. Tsirkin
2018-06-13 14:23                   ` Michael S. Tsirkin
2018-06-11  3:29               ` Benjamin Herrenschmidt
2018-06-11  3:29                 ` Benjamin Herrenschmidt
2018-06-13  7:41                 ` Christoph Hellwig
2018-06-13  7:41                   ` Christoph Hellwig
2018-06-13 12:25                   ` Benjamin Herrenschmidt
2018-06-13 12:25                     ` Benjamin Herrenschmidt
2018-06-13 13:11                     ` Benjamin Herrenschmidt
2018-06-13 13:11                       ` Benjamin Herrenschmidt
2018-06-15  9:16                       ` Christoph Hellwig
2018-06-15  9:16                         ` Christoph Hellwig
2018-06-16  1:07                         ` Benjamin Herrenschmidt
2018-06-16  1:07                           ` Benjamin Herrenschmidt
2018-06-13 13:59                   ` Michael S. Tsirkin
2018-06-13 13:59                     ` Michael S. Tsirkin
2018-06-13 14:03                 ` Michael S. Tsirkin
2018-06-13 14:03                   ` Michael S. Tsirkin
2018-06-07 16:28           ` Michael S. Tsirkin
2018-06-07  5:23         ` Christoph Hellwig
2018-05-31  3:39     ` Anshuman Khandual
  -- strict thread matches above, loose matches on Subject: below --
2018-05-22  6:33 Anshuman Khandual

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e7ceddbec11711a89282e9b70b7fd3c8af10b030.camel@kernel.crashing.org \
    --to=benh@kernel.crashing.org \
    --cc=aik@ozlabs.ru \
    --cc=david@gibson.dropbear.id.au \
    --cc=elfring@users.sourceforge.net \
    --cc=hch@infradead.org \
    --cc=jasowang@redhat.com \
    --cc=joe@perches.com \
    --cc=khandual@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mpe@ellerman.id.au \
    --cc=mst@redhat.com \
    --cc=robh@kernel.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.