linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4] virtio: new feature to detect IOMMU device quirk
@ 2016-07-27 22:50 Michael S. Tsirkin
  2016-07-28  6:59 ` Christoph Hellwig
  0 siblings, 1 reply; 3+ messages in thread
From: Michael S. Tsirkin @ 2016-07-27 22:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: virtualization, Christoph Hellwig, Jason Wang

The interaction between virtio and IOMMUs is messy.

On most systems with virtio, physical addresses match bus addresses,
and it doesn't particularly matter which one we use to program
the device.

On some systems, including Xen and any system with a physical device
that speaks virtio behind a physical IOMMU, we must program the IOMMU
for virtio DMA to work at all.

On other systems, including SPARC and PPC64, virtio-pci devices are
enumerated as though they are behind an IOMMU, but the virtio host
ignores the IOMMU, so we must either pretend that the IOMMU isn't
there or somehow map everything as the identity.

Add a feature bit to detect that quirk: VIRTIO_F_IOMMU_PLATFORM.

Any device with this feature bit set to 0 needs a quirk and has to be
passed physical addresses (as opposed to bus addresses) even though
the device is behind an IOMMU.

Note: it has to be a per-device quirk because for example, there could
be a mix of passed-through and virtual virtio devices. As another
example, some devices could be implemented by an out of process
hypervisor backend (in case of qemu vhost, or vhost-user) and so support
for an IOMMU needs to be coded up separately.

It would be cleanest to handle this in IOMMU core code, but that needs
per-device DMA ops. This should not be hard to do, and it will serve
to replace the existing Xen hacks, but doesn't seem practical for 4.8
and we don't want to defer hardware support until that's ready.
Add a TODO so we remember to clean it up for 4.9.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Changes from v3:
	In response to Christoph's comments, add TODO so we remember to
	move this into DMA core.  Definitely something we want to do but doesn't
	seem practical for this release cycle.

 include/uapi/linux/virtio_config.h |  4 +++-
 drivers/virtio/virtio_ring.c       | 15 ++++++++++++++-
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
index 4cb65bb..3f6195e 100644
--- a/include/uapi/linux/virtio_config.h
+++ b/include/uapi/linux/virtio_config.h
@@ -49,7 +49,7 @@
  * transport being used (eg. virtio_ring), the rest are per-device feature
  * bits. */
 #define VIRTIO_TRANSPORT_F_START	28
-#define VIRTIO_TRANSPORT_F_END		33
+#define VIRTIO_TRANSPORT_F_END		34
 
 #ifndef VIRTIO_CONFIG_NO_LEGACY
 /* Do we get callbacks when the ring is completely used, even if we've
@@ -63,4 +63,6 @@
 /* v1.0 compliant. */
 #define VIRTIO_F_VERSION_1		32
 
+/* Do not bypass the IOMMU (if configured) */
+#define VIRTIO_F_IOMMU_PLATFORM		33
 #endif /* _UAPI_LINUX_VIRTIO_CONFIG_H */
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index ca6bfdd..a0571f6 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -117,7 +117,10 @@ struct vring_virtqueue {
 #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
 
 /*
- * The interaction between virtio and a possible IOMMU is a mess.
+ * Modern virtio devices might set feature bits to specify whether
+ * they use the platform IOMMU. If there, just use the DMA API.
+ *
+ * If not there, the interaction between virtio and DMA API is messy.
  *
  * On most systems with virtio, physical addresses match bus addresses,
  * and it doesn't particularly matter whether we use the DMA API.
@@ -133,10 +136,18 @@ struct vring_virtqueue {
  *
  * For the time being, we preserve historic behavior and bypass the DMA
  * API.
+ *
+ * TODO: install a per-device DMA ops structure that does the right thing
+ * taking into account all the above quirks, and use the DMA API
+ * unconditionally on data path.
  */
 
 static bool vring_use_dma_api(struct virtio_device *vdev)
 {
+	if (virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
+		return true;
+
+	/* Otherwise, we are left to guess. */
 	/*
 	 * In theory, it's possible to have a buggy QEMU-supposed
 	 * emulated Q35 IOMMU and Xen enabled at the same time.  On
@@ -1099,6 +1110,8 @@ void vring_transport_features(struct virtio_device *vdev)
 			break;
 		case VIRTIO_F_VERSION_1:
 			break;
+		case VIRTIO_F_IOMMU_PLATFORM:
+			break;
 		default:
 			/* We don't understand this bit. */
 			__virtio_clear_bit(vdev, i);
-- 
MST

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v4] virtio: new feature to detect IOMMU device quirk
  2016-07-27 22:50 [PATCH v4] virtio: new feature to detect IOMMU device quirk Michael S. Tsirkin
@ 2016-07-28  6:59 ` Christoph Hellwig
  2016-07-28 21:28   ` Michael S. Tsirkin
  0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2016-07-28  6:59 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, virtualization, Christoph Hellwig, Jason Wang

Again, this is still the wrong way around.  A "noiommu" feature is a
quirk and should not be the default.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v4] virtio: new feature to detect IOMMU device quirk
  2016-07-28  6:59 ` Christoph Hellwig
@ 2016-07-28 21:28   ` Michael S. Tsirkin
  0 siblings, 0 replies; 3+ messages in thread
From: Michael S. Tsirkin @ 2016-07-28 21:28 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel, virtualization, Jason Wang

On Wed, Jul 27, 2016 at 11:59:18PM -0700, Christoph Hellwig wrote:
> Again, this is still the wrong way around.  A "noiommu" feature is a
> quirk and should not be the default.

Christoph, I'm not sure what you mean by the default here.

We read a register from the device (bit 33 in the feature qword)
and act on it.

The specific register value is 0 on noiommu quirky devices
(it happened to be that way in the past),
and 1 on clean iommu devices.

 static bool vring_use_dma_api(struct virtio_device *vdev)
 {
+       if (virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
+               return true;
+
+       /* Otherwise, we are left to guess. */

As a hypothesis, do you object to use of virtio_has_feature?

Yes this might be confusing but in fact that
is just testing a cached register bit: at init time we read it:

        device_features = dev->config->get_features(dev);

        ....
        vdev->features = device_features

and later

            return vdev->features & BIT_ULL(fbit);

I'll add a comment clarifying that in the next version.

-- 
MST

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-07-28 21:28 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-27 22:50 [PATCH v4] virtio: new feature to detect IOMMU device quirk Michael S. Tsirkin
2016-07-28  6:59 ` Christoph Hellwig
2016-07-28 21:28   ` Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).