All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-01 12:25 ` Robin Murphy
  0 siblings, 0 replies; 59+ messages in thread
From: Robin Murphy @ 2017-02-01 12:25 UTC (permalink / raw)
  To: mst, jasowang
  Cc: mark.rutland, devicetree, pawel.moll, will.deacon,
	virtualization, robh+dt, linux-arm-kernel

By forcing on DMA API usage for ARM systems, we have inadvertently
kicked open a hornets' nest in terms of cache-coherency. Namely that
unless the virtio device is explicitly described as capable of coherent
DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
assume it is non-coherent. This turns out to cause a big problem for the
likes of QEMU and kvmtool, which generate virtio-mmio devices in their
guest DTs but neglect to add the often-overlooked "dma-coherent"
property; as a result, we end up with the guest making non-cacheable
accesses to the vring, the host doing so cacheably, both talking past
each other and things going horribly wrong.

To prevent regressing those existing use cases relying on implicit
coherency, but still fixing the original problem of (coherent PCI)
legacy devices behind IOMMUs, take the more conservative approach of
only hitting the DMA API switch for coherent devices, where we can be
sure it is safe, and preserving the old non-DMA behaviour otherwise.
This does not affect devices setting the VIRTIO_F_IOMMU_PLATFORM flag,
which as before are still at the mercy of architecture code correctly
knowing their coherency, so explicitly call this out in the virtio-mmio
DT binding in the hope of heading off any further workarounds for future
firmware mishaps.

Fixes: c7070619f340 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 Documentation/devicetree/bindings/virtio/mmio.txt |  3 +++
 drivers/virtio/virtio_ring.c                      | 11 ++++++++---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
index 5069c1b8e193..8f2a981e1010 100644
--- a/Documentation/devicetree/bindings/virtio/mmio.txt
+++ b/Documentation/devicetree/bindings/virtio/mmio.txt
@@ -7,6 +7,8 @@ Required properties:
 - compatible:	"virtio,mmio" compatibility string
 - reg:		control registers base address and size including configuration space
 - interrupts:	interrupt generated by the device
+- dma-coherent:	required if the device (or host emulation) accesses memory
+		cache-coherently, absent otherwise
 
 Example:
 
@@ -14,4 +16,5 @@ Example:
 		compatible = "virtio,mmio";
 		reg = <0x3000 0x100>;
 		interrupts = <41>;
+		dma-coherent;
 	}
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 7e38ed79c3fc..961af25b385c 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -20,6 +20,7 @@
 #include <linux/virtio_ring.h>
 #include <linux/virtio_config.h>
 #include <linux/device.h>
+#include <linux/property.h>
 #include <linux/slab.h>
 #include <linux/module.h>
 #include <linux/hrtimer.h>
@@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
 		return true;
 
 	/*
-	 * On ARM-based machines, the DMA ops will do the right thing,
-	 * so always use them with legacy devices.
+	 * On ARM-based machines, the coherent DMA ops will do the right
+	 * thing, so always use them with legacy devices. However, using
+	 * non-coherent DMA when the host *is* actually coherent, but has
+	 * forgotten to tell us, is going to break badly; since this situation
+	 * already exists in the wild, maintain the old behaviour there.
 	 */
-	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
+	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
+	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
 		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
 
 	return false;
-- 
2.11.0.dirty

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-01 12:25 ` Robin Murphy
  0 siblings, 0 replies; 59+ messages in thread
From: Robin Murphy @ 2017-02-01 12:25 UTC (permalink / raw)
  To: linux-arm-kernel

By forcing on DMA API usage for ARM systems, we have inadvertently
kicked open a hornets' nest in terms of cache-coherency. Namely that
unless the virtio device is explicitly described as capable of coherent
DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
assume it is non-coherent. This turns out to cause a big problem for the
likes of QEMU and kvmtool, which generate virtio-mmio devices in their
guest DTs but neglect to add the often-overlooked "dma-coherent"
property; as a result, we end up with the guest making non-cacheable
accesses to the vring, the host doing so cacheably, both talking past
each other and things going horribly wrong.

To prevent regressing those existing use cases relying on implicit
coherency, but still fixing the original problem of (coherent PCI)
legacy devices behind IOMMUs, take the more conservative approach of
only hitting the DMA API switch for coherent devices, where we can be
sure it is safe, and preserving the old non-DMA behaviour otherwise.
This does not affect devices setting the VIRTIO_F_IOMMU_PLATFORM flag,
which as before are still at the mercy of architecture code correctly
knowing their coherency, so explicitly call this out in the virtio-mmio
DT binding in the hope of heading off any further workarounds for future
firmware mishaps.

Fixes: c7070619f340 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 Documentation/devicetree/bindings/virtio/mmio.txt |  3 +++
 drivers/virtio/virtio_ring.c                      | 11 ++++++++---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
index 5069c1b8e193..8f2a981e1010 100644
--- a/Documentation/devicetree/bindings/virtio/mmio.txt
+++ b/Documentation/devicetree/bindings/virtio/mmio.txt
@@ -7,6 +7,8 @@ Required properties:
 - compatible:	"virtio,mmio" compatibility string
 - reg:		control registers base address and size including configuration space
 - interrupts:	interrupt generated by the device
+- dma-coherent:	required if the device (or host emulation) accesses memory
+		cache-coherently, absent otherwise
 
 Example:
 
@@ -14,4 +16,5 @@ Example:
 		compatible = "virtio,mmio";
 		reg = <0x3000 0x100>;
 		interrupts = <41>;
+		dma-coherent;
 	}
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 7e38ed79c3fc..961af25b385c 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -20,6 +20,7 @@
 #include <linux/virtio_ring.h>
 #include <linux/virtio_config.h>
 #include <linux/device.h>
+#include <linux/property.h>
 #include <linux/slab.h>
 #include <linux/module.h>
 #include <linux/hrtimer.h>
@@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
 		return true;
 
 	/*
-	 * On ARM-based machines, the DMA ops will do the right thing,
-	 * so always use them with legacy devices.
+	 * On ARM-based machines, the coherent DMA ops will do the right
+	 * thing, so always use them with legacy devices. However, using
+	 * non-coherent DMA when the host *is* actually coherent, but has
+	 * forgotten to tell us, is going to break badly; since this situation
+	 * already exists in the wild, maintain the old behaviour there.
 	 */
-	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
+	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
+	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
 		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
 
 	return false;
-- 
2.11.0.dirty

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-01 12:25 ` Robin Murphy
@ 2017-02-01 14:57   ` Will Deacon
  -1 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-01 14:57 UTC (permalink / raw)
  To: Robin Murphy
  Cc: mark.rutland, devicetree, pawel.moll, mst, virtualization,
	robh+dt, linux-arm-kernel

On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> By forcing on DMA API usage for ARM systems, we have inadvertently
> kicked open a hornets' nest in terms of cache-coherency. Namely that
> unless the virtio device is explicitly described as capable of coherent
> DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
> assume it is non-coherent. This turns out to cause a big problem for the
> likes of QEMU and kvmtool, which generate virtio-mmio devices in their
> guest DTs but neglect to add the often-overlooked "dma-coherent"
> property; as a result, we end up with the guest making non-cacheable
> accesses to the vring, the host doing so cacheably, both talking past
> each other and things going horribly wrong.
> 
> To prevent regressing those existing use cases relying on implicit
> coherency, but still fixing the original problem of (coherent PCI)
> legacy devices behind IOMMUs, take the more conservative approach of
> only hitting the DMA API switch for coherent devices, where we can be
> sure it is safe, and preserving the old non-DMA behaviour otherwise.
> This does not affect devices setting the VIRTIO_F_IOMMU_PLATFORM flag,
> which as before are still at the mercy of architecture code correctly
> knowing their coherency, so explicitly call this out in the virtio-mmio
> DT binding in the hope of heading off any further workarounds for future
> firmware mishaps.
> 
> Fixes: c7070619f340 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  Documentation/devicetree/bindings/virtio/mmio.txt |  3 +++
>  drivers/virtio/virtio_ring.c                      | 11 ++++++++---
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
> index 5069c1b8e193..8f2a981e1010 100644
> --- a/Documentation/devicetree/bindings/virtio/mmio.txt
> +++ b/Documentation/devicetree/bindings/virtio/mmio.txt
> @@ -7,6 +7,8 @@ Required properties:
>  - compatible:	"virtio,mmio" compatibility string
>  - reg:		control registers base address and size including configuration space
>  - interrupts:	interrupt generated by the device
> +- dma-coherent:	required if the device (or host emulation) accesses memory
> +		cache-coherently, absent otherwise

So QEMU is getting with not providing this property at the moment and this
patch continues to ensure that works coherently, which is the right thing
to do. However, if QEMU ever emulates devices with VIRTIO_F_IOMMU_PLATFORM,
then it will need to provide this property for coherent virtio-mmio
devices upstream of the IOMMU otherwise things won't work.

I think that's acceptable given that VIRTIO_F_IOMMU_PLATFORM has never
done the right thing with respect to coherency if "dma-coherent" is not
present, but it would be nice to call that out somewhere so that QEMU
developers can be aware of this pitfall.

Could we add something like:


  Linux implementation note:

  virtio-mmio devices that do not advertise the VIRTIO_F_IOMMU_PLATFORM
  feature are treated as cache-coherent irrespective of the "dma-coherent"
  property.  If VIRTIO_F_IOMMU_PLATFORM is advertised, then the
  "dma-coherent" property must accurately reflect the coherency of the
  device.


?

I know that the binding documentation needs to be OS agnostic, but I think
it's useful to describe the Linux behaviour here given that QEMU is
technically in violation of the binding after this patch is applied.

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-01 14:57   ` Will Deacon
  0 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-01 14:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> By forcing on DMA API usage for ARM systems, we have inadvertently
> kicked open a hornets' nest in terms of cache-coherency. Namely that
> unless the virtio device is explicitly described as capable of coherent
> DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
> assume it is non-coherent. This turns out to cause a big problem for the
> likes of QEMU and kvmtool, which generate virtio-mmio devices in their
> guest DTs but neglect to add the often-overlooked "dma-coherent"
> property; as a result, we end up with the guest making non-cacheable
> accesses to the vring, the host doing so cacheably, both talking past
> each other and things going horribly wrong.
> 
> To prevent regressing those existing use cases relying on implicit
> coherency, but still fixing the original problem of (coherent PCI)
> legacy devices behind IOMMUs, take the more conservative approach of
> only hitting the DMA API switch for coherent devices, where we can be
> sure it is safe, and preserving the old non-DMA behaviour otherwise.
> This does not affect devices setting the VIRTIO_F_IOMMU_PLATFORM flag,
> which as before are still at the mercy of architecture code correctly
> knowing their coherency, so explicitly call this out in the virtio-mmio
> DT binding in the hope of heading off any further workarounds for future
> firmware mishaps.
> 
> Fixes: c7070619f340 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  Documentation/devicetree/bindings/virtio/mmio.txt |  3 +++
>  drivers/virtio/virtio_ring.c                      | 11 ++++++++---
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
> index 5069c1b8e193..8f2a981e1010 100644
> --- a/Documentation/devicetree/bindings/virtio/mmio.txt
> +++ b/Documentation/devicetree/bindings/virtio/mmio.txt
> @@ -7,6 +7,8 @@ Required properties:
>  - compatible:	"virtio,mmio" compatibility string
>  - reg:		control registers base address and size including configuration space
>  - interrupts:	interrupt generated by the device
> +- dma-coherent:	required if the device (or host emulation) accesses memory
> +		cache-coherently, absent otherwise

So QEMU is getting with not providing this property at the moment and this
patch continues to ensure that works coherently, which is the right thing
to do. However, if QEMU ever emulates devices with VIRTIO_F_IOMMU_PLATFORM,
then it will need to provide this property for coherent virtio-mmio
devices upstream of the IOMMU otherwise things won't work.

I think that's acceptable given that VIRTIO_F_IOMMU_PLATFORM has never
done the right thing with respect to coherency if "dma-coherent" is not
present, but it would be nice to call that out somewhere so that QEMU
developers can be aware of this pitfall.

Could we add something like:


  Linux implementation note:

  virtio-mmio devices that do not advertise the VIRTIO_F_IOMMU_PLATFORM
  feature are treated as cache-coherent irrespective of the "dma-coherent"
  property.  If VIRTIO_F_IOMMU_PLATFORM is advertised, then the
  "dma-coherent" property must accurately reflect the coherency of the
  device.


?

I know that the binding documentation needs to be OS agnostic, but I think
it's useful to describe the Linux behaviour here given that QEMU is
technically in violation of the binding after this patch is applied.

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-01 14:57   ` Will Deacon
@ 2017-02-01 17:58       ` Rob Herring
  -1 siblings, 0 replies; 59+ messages in thread
From: Rob Herring @ 2017-02-01 17:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: Robin Murphy, mst-H+wXaHxf7aLQT0dZR+AlfA,
	jasowang-H+wXaHxf7aLQT0dZR+AlfA,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	mark.rutland-5wv7dgnIgG8, devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	pawel.moll-5wv7dgnIgG8

On Wed, Feb 01, 2017 at 02:57:11PM +0000, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> > By forcing on DMA API usage for ARM systems, we have inadvertently
> > kicked open a hornets' nest in terms of cache-coherency. Namely that
> > unless the virtio device is explicitly described as capable of coherent
> > DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
> > assume it is non-coherent. This turns out to cause a big problem for the
> > likes of QEMU and kvmtool, which generate virtio-mmio devices in their
> > guest DTs but neglect to add the often-overlooked "dma-coherent"
> > property; as a result, we end up with the guest making non-cacheable
> > accesses to the vring, the host doing so cacheably, both talking past
> > each other and things going horribly wrong.
> > 
> > To prevent regressing those existing use cases relying on implicit
> > coherency, but still fixing the original problem of (coherent PCI)
> > legacy devices behind IOMMUs, take the more conservative approach of
> > only hitting the DMA API switch for coherent devices, where we can be
> > sure it is safe, and preserving the old non-DMA behaviour otherwise.
> > This does not affect devices setting the VIRTIO_F_IOMMU_PLATFORM flag,
> > which as before are still at the mercy of architecture code correctly
> > knowing their coherency, so explicitly call this out in the virtio-mmio
> > DT binding in the hope of heading off any further workarounds for future
> > firmware mishaps.
> > 
> > Fixes: c7070619f340 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
> > Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> > ---
> >  Documentation/devicetree/bindings/virtio/mmio.txt |  3 +++
> >  drivers/virtio/virtio_ring.c                      | 11 ++++++++---
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
> > index 5069c1b8e193..8f2a981e1010 100644
> > --- a/Documentation/devicetree/bindings/virtio/mmio.txt
> > +++ b/Documentation/devicetree/bindings/virtio/mmio.txt
> > @@ -7,6 +7,8 @@ Required properties:
> >  - compatible:	"virtio,mmio" compatibility string
> >  - reg:		control registers base address and size including configuration space
> >  - interrupts:	interrupt generated by the device
> > +- dma-coherent:	required if the device (or host emulation) accesses memory
> > +		cache-coherently, absent otherwise
> 
> So QEMU is getting with not providing this property at the moment and this
> patch continues to ensure that works coherently, which is the right thing
> to do. However, if QEMU ever emulates devices with VIRTIO_F_IOMMU_PLATFORM,
> then it will need to provide this property for coherent virtio-mmio
> devices upstream of the IOMMU otherwise things won't work.
> 
> I think that's acceptable given that VIRTIO_F_IOMMU_PLATFORM has never
> done the right thing with respect to coherency if "dma-coherent" is not
> present, but it would be nice to call that out somewhere so that QEMU
> developers can be aware of this pitfall.
> 
> Could we add something like:
> 
> 
>   Linux implementation note:
> 
>   virtio-mmio devices that do not advertise the VIRTIO_F_IOMMU_PLATFORM
>   feature are treated as cache-coherent irrespective of the "dma-coherent"
>   property.  If VIRTIO_F_IOMMU_PLATFORM is advertised, then the
>   "dma-coherent" property must accurately reflect the coherency of the
>   device.
> 
> 
> ?
> 
> I know that the binding documentation needs to be OS agnostic, but I think
> it's useful to describe the Linux behaviour here given that QEMU is
> technically in violation of the binding after this patch is applied.

Sounds fine to me.

Rob
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-01 14:57   ` Will Deacon
  (?)
  (?)
@ 2017-02-01 17:58   ` Rob Herring
  -1 siblings, 0 replies; 59+ messages in thread
From: Rob Herring @ 2017-02-01 17:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: mark.rutland, devicetree, pawel.moll, mst, virtualization,
	Robin Murphy, linux-arm-kernel

On Wed, Feb 01, 2017 at 02:57:11PM +0000, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> > By forcing on DMA API usage for ARM systems, we have inadvertently
> > kicked open a hornets' nest in terms of cache-coherency. Namely that
> > unless the virtio device is explicitly described as capable of coherent
> > DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
> > assume it is non-coherent. This turns out to cause a big problem for the
> > likes of QEMU and kvmtool, which generate virtio-mmio devices in their
> > guest DTs but neglect to add the often-overlooked "dma-coherent"
> > property; as a result, we end up with the guest making non-cacheable
> > accesses to the vring, the host doing so cacheably, both talking past
> > each other and things going horribly wrong.
> > 
> > To prevent regressing those existing use cases relying on implicit
> > coherency, but still fixing the original problem of (coherent PCI)
> > legacy devices behind IOMMUs, take the more conservative approach of
> > only hitting the DMA API switch for coherent devices, where we can be
> > sure it is safe, and preserving the old non-DMA behaviour otherwise.
> > This does not affect devices setting the VIRTIO_F_IOMMU_PLATFORM flag,
> > which as before are still at the mercy of architecture code correctly
> > knowing their coherency, so explicitly call this out in the virtio-mmio
> > DT binding in the hope of heading off any further workarounds for future
> > firmware mishaps.
> > 
> > Fixes: c7070619f340 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
> > Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> > ---
> >  Documentation/devicetree/bindings/virtio/mmio.txt |  3 +++
> >  drivers/virtio/virtio_ring.c                      | 11 ++++++++---
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
> > index 5069c1b8e193..8f2a981e1010 100644
> > --- a/Documentation/devicetree/bindings/virtio/mmio.txt
> > +++ b/Documentation/devicetree/bindings/virtio/mmio.txt
> > @@ -7,6 +7,8 @@ Required properties:
> >  - compatible:	"virtio,mmio" compatibility string
> >  - reg:		control registers base address and size including configuration space
> >  - interrupts:	interrupt generated by the device
> > +- dma-coherent:	required if the device (or host emulation) accesses memory
> > +		cache-coherently, absent otherwise
> 
> So QEMU is getting with not providing this property at the moment and this
> patch continues to ensure that works coherently, which is the right thing
> to do. However, if QEMU ever emulates devices with VIRTIO_F_IOMMU_PLATFORM,
> then it will need to provide this property for coherent virtio-mmio
> devices upstream of the IOMMU otherwise things won't work.
> 
> I think that's acceptable given that VIRTIO_F_IOMMU_PLATFORM has never
> done the right thing with respect to coherency if "dma-coherent" is not
> present, but it would be nice to call that out somewhere so that QEMU
> developers can be aware of this pitfall.
> 
> Could we add something like:
> 
> 
>   Linux implementation note:
> 
>   virtio-mmio devices that do not advertise the VIRTIO_F_IOMMU_PLATFORM
>   feature are treated as cache-coherent irrespective of the "dma-coherent"
>   property.  If VIRTIO_F_IOMMU_PLATFORM is advertised, then the
>   "dma-coherent" property must accurately reflect the coherency of the
>   device.
> 
> 
> ?
> 
> I know that the binding documentation needs to be OS agnostic, but I think
> it's useful to describe the Linux behaviour here given that QEMU is
> technically in violation of the binding after this patch is applied.

Sounds fine to me.

Rob

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-01 17:58       ` Rob Herring
  0 siblings, 0 replies; 59+ messages in thread
From: Rob Herring @ 2017-02-01 17:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 01, 2017 at 02:57:11PM +0000, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> > By forcing on DMA API usage for ARM systems, we have inadvertently
> > kicked open a hornets' nest in terms of cache-coherency. Namely that
> > unless the virtio device is explicitly described as capable of coherent
> > DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
> > assume it is non-coherent. This turns out to cause a big problem for the
> > likes of QEMU and kvmtool, which generate virtio-mmio devices in their
> > guest DTs but neglect to add the often-overlooked "dma-coherent"
> > property; as a result, we end up with the guest making non-cacheable
> > accesses to the vring, the host doing so cacheably, both talking past
> > each other and things going horribly wrong.
> > 
> > To prevent regressing those existing use cases relying on implicit
> > coherency, but still fixing the original problem of (coherent PCI)
> > legacy devices behind IOMMUs, take the more conservative approach of
> > only hitting the DMA API switch for coherent devices, where we can be
> > sure it is safe, and preserving the old non-DMA behaviour otherwise.
> > This does not affect devices setting the VIRTIO_F_IOMMU_PLATFORM flag,
> > which as before are still at the mercy of architecture code correctly
> > knowing their coherency, so explicitly call this out in the virtio-mmio
> > DT binding in the hope of heading off any further workarounds for future
> > firmware mishaps.
> > 
> > Fixes: c7070619f340 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
> > Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> > ---
> >  Documentation/devicetree/bindings/virtio/mmio.txt |  3 +++
> >  drivers/virtio/virtio_ring.c                      | 11 ++++++++---
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
> > index 5069c1b8e193..8f2a981e1010 100644
> > --- a/Documentation/devicetree/bindings/virtio/mmio.txt
> > +++ b/Documentation/devicetree/bindings/virtio/mmio.txt
> > @@ -7,6 +7,8 @@ Required properties:
> >  - compatible:	"virtio,mmio" compatibility string
> >  - reg:		control registers base address and size including configuration space
> >  - interrupts:	interrupt generated by the device
> > +- dma-coherent:	required if the device (or host emulation) accesses memory
> > +		cache-coherently, absent otherwise
> 
> So QEMU is getting with not providing this property at the moment and this
> patch continues to ensure that works coherently, which is the right thing
> to do. However, if QEMU ever emulates devices with VIRTIO_F_IOMMU_PLATFORM,
> then it will need to provide this property for coherent virtio-mmio
> devices upstream of the IOMMU otherwise things won't work.
> 
> I think that's acceptable given that VIRTIO_F_IOMMU_PLATFORM has never
> done the right thing with respect to coherency if "dma-coherent" is not
> present, but it would be nice to call that out somewhere so that QEMU
> developers can be aware of this pitfall.
> 
> Could we add something like:
> 
> 
>   Linux implementation note:
> 
>   virtio-mmio devices that do not advertise the VIRTIO_F_IOMMU_PLATFORM
>   feature are treated as cache-coherent irrespective of the "dma-coherent"
>   property.  If VIRTIO_F_IOMMU_PLATFORM is advertised, then the
>   "dma-coherent" property must accurately reflect the coherency of the
>   device.
> 
> 
> ?
> 
> I know that the binding documentation needs to be OS agnostic, but I think
> it's useful to describe the Linux behaviour here given that QEMU is
> technically in violation of the binding after this patch is applied.

Sounds fine to me.

Rob

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-01 12:25 ` Robin Murphy
@ 2017-02-01 18:09     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-01 18:09 UTC (permalink / raw)
  To: Robin Murphy
  Cc: jasowang-H+wXaHxf7aLQT0dZR+AlfA,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	will.deacon-5wv7dgnIgG8, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	pawel.moll-5wv7dgnIgG8

On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> By forcing on DMA API usage for ARM systems, we have inadvertently
> kicked open a hornets' nest in terms of cache-coherency. Namely that
> unless the virtio device is explicitly described as capable of coherent
> DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
> assume it is non-coherent. This turns out to cause a big problem for the
> likes of QEMU and kvmtool, which generate virtio-mmio devices in their
> guest DTs but neglect to add the often-overlooked "dma-coherent"
> property; as a result, we end up with the guest making non-cacheable
> accesses to the vring, the host doing so cacheably, both talking past
> each other and things going horribly wrong.
> 
> To prevent regressing those existing use cases relying on implicit
> coherency, but still fixing the original problem of (coherent PCI)
> legacy devices behind IOMMUs, take the more conservative approach of
> only hitting the DMA API switch for coherent devices, where we can be
> sure it is safe, and preserving the old non-DMA behaviour otherwise.
> This does not affect devices setting the VIRTIO_F_IOMMU_PLATFORM flag,
> which as before are still at the mercy of architecture code correctly
> knowing their coherency, so explicitly call this out in the virtio-mmio
> DT binding in the hope of heading off any further workarounds for future
> firmware mishaps.
> 
> Fixes: c7070619f340 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  Documentation/devicetree/bindings/virtio/mmio.txt |  3 +++
>  drivers/virtio/virtio_ring.c                      | 11 ++++++++---
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
> index 5069c1b8e193..8f2a981e1010 100644
> --- a/Documentation/devicetree/bindings/virtio/mmio.txt
> +++ b/Documentation/devicetree/bindings/virtio/mmio.txt
> @@ -7,6 +7,8 @@ Required properties:
>  - compatible:	"virtio,mmio" compatibility string
>  - reg:		control registers base address and size including configuration space
>  - interrupts:	interrupt generated by the device
> +- dma-coherent:	required if the device (or host emulation) accesses memory
> +		cache-coherently, absent otherwise
>  
>  Example:
>  
> @@ -14,4 +16,5 @@ Example:
>  		compatible = "virtio,mmio";
>  		reg = <0x3000 0x100>;
>  		interrupts = <41>;
> +		dma-coherent;
>  	}
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 7e38ed79c3fc..961af25b385c 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -20,6 +20,7 @@
>  #include <linux/virtio_ring.h>
>  #include <linux/virtio_config.h>
>  #include <linux/device.h>
> +#include <linux/property.h>
>  #include <linux/slab.h>
>  #include <linux/module.h>
>  #include <linux/hrtimer.h>
> @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>  		return true;
>  
>  	/*
> -	 * On ARM-based machines, the DMA ops will do the right thing,
> -	 * so always use them with legacy devices.
> +	 * On ARM-based machines, the coherent DMA ops will do the right
> +	 * thing, so always use them with legacy devices. However, using
> +	 * non-coherent DMA when the host *is* actually coherent, but has
> +	 * forgotten to tell us, is going to break badly; since this situation
> +	 * already exists in the wild, maintain the old behaviour there.
>  	 */
> -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
> +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
>  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
>  
>  	return false;

This is exactly what I feared.

Could we identify fastboot and do the special dance just for it?

I'd like to do that instead. It's fastboot doing the unreasonable thing
here and deviating from what every other legacy device without exception
did for years. If this means fastboot will need to update to virtio 1,
all the better.

> -- 
> 2.11.0.dirty
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-01 12:25 ` Robin Murphy
                   ` (2 preceding siblings ...)
  (?)
@ 2017-02-01 18:09 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-01 18:09 UTC (permalink / raw)
  To: Robin Murphy
  Cc: mark.rutland, devicetree, pawel.moll, will.deacon,
	virtualization, robh+dt, linux-arm-kernel

On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> By forcing on DMA API usage for ARM systems, we have inadvertently
> kicked open a hornets' nest in terms of cache-coherency. Namely that
> unless the virtio device is explicitly described as capable of coherent
> DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
> assume it is non-coherent. This turns out to cause a big problem for the
> likes of QEMU and kvmtool, which generate virtio-mmio devices in their
> guest DTs but neglect to add the often-overlooked "dma-coherent"
> property; as a result, we end up with the guest making non-cacheable
> accesses to the vring, the host doing so cacheably, both talking past
> each other and things going horribly wrong.
> 
> To prevent regressing those existing use cases relying on implicit
> coherency, but still fixing the original problem of (coherent PCI)
> legacy devices behind IOMMUs, take the more conservative approach of
> only hitting the DMA API switch for coherent devices, where we can be
> sure it is safe, and preserving the old non-DMA behaviour otherwise.
> This does not affect devices setting the VIRTIO_F_IOMMU_PLATFORM flag,
> which as before are still at the mercy of architecture code correctly
> knowing their coherency, so explicitly call this out in the virtio-mmio
> DT binding in the hope of heading off any further workarounds for future
> firmware mishaps.
> 
> Fixes: c7070619f340 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  Documentation/devicetree/bindings/virtio/mmio.txt |  3 +++
>  drivers/virtio/virtio_ring.c                      | 11 ++++++++---
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
> index 5069c1b8e193..8f2a981e1010 100644
> --- a/Documentation/devicetree/bindings/virtio/mmio.txt
> +++ b/Documentation/devicetree/bindings/virtio/mmio.txt
> @@ -7,6 +7,8 @@ Required properties:
>  - compatible:	"virtio,mmio" compatibility string
>  - reg:		control registers base address and size including configuration space
>  - interrupts:	interrupt generated by the device
> +- dma-coherent:	required if the device (or host emulation) accesses memory
> +		cache-coherently, absent otherwise
>  
>  Example:
>  
> @@ -14,4 +16,5 @@ Example:
>  		compatible = "virtio,mmio";
>  		reg = <0x3000 0x100>;
>  		interrupts = <41>;
> +		dma-coherent;
>  	}
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 7e38ed79c3fc..961af25b385c 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -20,6 +20,7 @@
>  #include <linux/virtio_ring.h>
>  #include <linux/virtio_config.h>
>  #include <linux/device.h>
> +#include <linux/property.h>
>  #include <linux/slab.h>
>  #include <linux/module.h>
>  #include <linux/hrtimer.h>
> @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>  		return true;
>  
>  	/*
> -	 * On ARM-based machines, the DMA ops will do the right thing,
> -	 * so always use them with legacy devices.
> +	 * On ARM-based machines, the coherent DMA ops will do the right
> +	 * thing, so always use them with legacy devices. However, using
> +	 * non-coherent DMA when the host *is* actually coherent, but has
> +	 * forgotten to tell us, is going to break badly; since this situation
> +	 * already exists in the wild, maintain the old behaviour there.
>  	 */
> -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
> +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
>  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
>  
>  	return false;

This is exactly what I feared.

Could we identify fastboot and do the special dance just for it?

I'd like to do that instead. It's fastboot doing the unreasonable thing
here and deviating from what every other legacy device without exception
did for years. If this means fastboot will need to update to virtio 1,
all the better.

> -- 
> 2.11.0.dirty

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-01 18:09     ` Michael S. Tsirkin
  0 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-01 18:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> By forcing on DMA API usage for ARM systems, we have inadvertently
> kicked open a hornets' nest in terms of cache-coherency. Namely that
> unless the virtio device is explicitly described as capable of coherent
> DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
> assume it is non-coherent. This turns out to cause a big problem for the
> likes of QEMU and kvmtool, which generate virtio-mmio devices in their
> guest DTs but neglect to add the often-overlooked "dma-coherent"
> property; as a result, we end up with the guest making non-cacheable
> accesses to the vring, the host doing so cacheably, both talking past
> each other and things going horribly wrong.
> 
> To prevent regressing those existing use cases relying on implicit
> coherency, but still fixing the original problem of (coherent PCI)
> legacy devices behind IOMMUs, take the more conservative approach of
> only hitting the DMA API switch for coherent devices, where we can be
> sure it is safe, and preserving the old non-DMA behaviour otherwise.
> This does not affect devices setting the VIRTIO_F_IOMMU_PLATFORM flag,
> which as before are still at the mercy of architecture code correctly
> knowing their coherency, so explicitly call this out in the virtio-mmio
> DT binding in the hope of heading off any further workarounds for future
> firmware mishaps.
> 
> Fixes: c7070619f340 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  Documentation/devicetree/bindings/virtio/mmio.txt |  3 +++
>  drivers/virtio/virtio_ring.c                      | 11 ++++++++---
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
> index 5069c1b8e193..8f2a981e1010 100644
> --- a/Documentation/devicetree/bindings/virtio/mmio.txt
> +++ b/Documentation/devicetree/bindings/virtio/mmio.txt
> @@ -7,6 +7,8 @@ Required properties:
>  - compatible:	"virtio,mmio" compatibility string
>  - reg:		control registers base address and size including configuration space
>  - interrupts:	interrupt generated by the device
> +- dma-coherent:	required if the device (or host emulation) accesses memory
> +		cache-coherently, absent otherwise
>  
>  Example:
>  
> @@ -14,4 +16,5 @@ Example:
>  		compatible = "virtio,mmio";
>  		reg = <0x3000 0x100>;
>  		interrupts = <41>;
> +		dma-coherent;
>  	}
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 7e38ed79c3fc..961af25b385c 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -20,6 +20,7 @@
>  #include <linux/virtio_ring.h>
>  #include <linux/virtio_config.h>
>  #include <linux/device.h>
> +#include <linux/property.h>
>  #include <linux/slab.h>
>  #include <linux/module.h>
>  #include <linux/hrtimer.h>
> @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>  		return true;
>  
>  	/*
> -	 * On ARM-based machines, the DMA ops will do the right thing,
> -	 * so always use them with legacy devices.
> +	 * On ARM-based machines, the coherent DMA ops will do the right
> +	 * thing, so always use them with legacy devices. However, using
> +	 * non-coherent DMA when the host *is* actually coherent, but has
> +	 * forgotten to tell us, is going to break badly; since this situation
> +	 * already exists in the wild, maintain the old behaviour there.
>  	 */
> -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
> +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
>  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
>  
>  	return false;

This is exactly what I feared.

Could we identify fastboot and do the special dance just for it?

I'd like to do that instead. It's fastboot doing the unreasonable thing
here and deviating from what every other legacy device without exception
did for years. If this means fastboot will need to update to virtio 1,
all the better.

> -- 
> 2.11.0.dirty

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-01 18:09     ` Michael S. Tsirkin
  (?)
@ 2017-02-01 18:27       ` Will Deacon
  -1 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-01 18:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 7e38ed79c3fc..961af25b385c 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/virtio_ring.h>
> >  #include <linux/virtio_config.h>
> >  #include <linux/device.h>
> > +#include <linux/property.h>
> >  #include <linux/slab.h>
> >  #include <linux/module.h>
> >  #include <linux/hrtimer.h>
> > @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  		return true;
> >  
> >  	/*
> > -	 * On ARM-based machines, the DMA ops will do the right thing,
> > -	 * so always use them with legacy devices.
> > +	 * On ARM-based machines, the coherent DMA ops will do the right
> > +	 * thing, so always use them with legacy devices. However, using
> > +	 * non-coherent DMA when the host *is* actually coherent, but has
> > +	 * forgotten to tell us, is going to break badly; since this situation
> > +	 * already exists in the wild, maintain the old behaviour there.
> >  	 */
> > -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> > +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
> > +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
> >  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
> >  
> >  	return false;
> 
> This is exactly what I feared.

Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
is used) and it also works on the fastmodel if you disable cache-modelling
(which is needed to make the thing run at a usable pace) so we didn't spot
this in testing.

> Could we identify fastboot and do the special dance just for it?

[assuming you mean fastmodel instead of fastboot]

> I'd like to do that instead. It's fastboot doing the unreasonable thing
> here and deviating from what every other legacy device without exception
> did for years. If this means fastboot will need to update to virtio 1,
> all the better.

The problem still exists with virtio 1, unless we require that the
"dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
is advertised by the device (which is what I suggested in my reply).

We can't detect the fastmodel, but we could implicitly treat virtio-mmio
devices as cache-coherent regardless of the "dma-coherent" flag. I already
prototyped this, but I suspect the devicetree people will push back (and
there's a similar patch needed for ACPI).

See below. Do you prefer this approach?

Will

--->8

>From f6ad4e331c26e7ba53132c8cc74e26f782391570 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Mon, 30 Jan 2017 17:28:31 +0000
Subject: [PATCH] of/address: Allow devices to report DMA coherency based on
 compatible string

Some devices (e.g. virtio-mmio) are implicitly cache coherent with respect
to DMA operations and therefore do not mandate the use of "dma-coherent"
in their devicetree bindings. In order to ensure that these devices work
correctly when using the DMA API, we need to treat them specially in
of_dma_is_coherent by identifying them as unconditionally coherent.

This patch adds a static, table-based search against the compatible
string for the device in of_dma_is_coherent before walking the
hierarchy looking for "dma-coherent". This allows existing virtio-mmio
devices (e.g. those emulated by QEMU) to function correctly when placed
behind an IOMMU that requires use of the DMA ops to map the vring.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/of/address.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 02b2903fe9d2..af29b115b8aa 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -891,19 +891,47 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
 }
 EXPORT_SYMBOL_GPL(of_dma_get_range);
 
+/*
+ * DMA from some device types is always cache-coherent, and in some unfortunate
+ * cases the "dma-coherent" property is not used.
+ */
+static const char *of_device_dma_coherent_tbl[] = {
+	/*
+	 * Virtio MMIO devices are assumed to be cache-coherent when accessing
+	 * main memory. Neither QEMU nor kvmtool emit "dma-coherent" properties
+	 * for their generated virtio MMIO device nodes, and the binding
+	 * documentation doesn't mention them either. When using the DMA API
+	 * (e.g. because there is an IOMMU in the system), we must report true
+	 * here to avoid lockups where writes to the vring via a non-coherent
+	 * mapping are not made visible to the device emulation.
+	 */
+	"virtio,mmio",
+	NULL,
+};
+
 /**
  * of_dma_is_coherent - Check if device is coherent
  * @np:	device node
  *
  * It returns true if "dma-coherent" property was found
- * for this device in DT.
+ * for this device in DT or the device is statically known to be
+ * coherent.
  */
 bool of_dma_is_coherent(struct device_node *np)
 {
 	struct device_node *node = of_node_get(np);
 
+	/*
+	 * Check for implicit DMA coherence first, since we don't want
+	 * to inherit this.
+	 */
+	if (of_device_compatible_match(np, of_device_dma_coherent_tbl)) {
+		of_node_put(node);
+		return true;
+	}
+
 	while (node) {
-		if (of_property_read_bool(node, "dma-coherent")) {
+		if (of_property_read_bool(node, "dma-coherent")){
 			of_node_put(node);
 			return true;
 		}
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-01 18:27       ` Will Deacon
  0 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-01 18:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 7e38ed79c3fc..961af25b385c 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/virtio_ring.h>
> >  #include <linux/virtio_config.h>
> >  #include <linux/device.h>
> > +#include <linux/property.h>
> >  #include <linux/slab.h>
> >  #include <linux/module.h>
> >  #include <linux/hrtimer.h>
> > @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  		return true;
> >  
> >  	/*
> > -	 * On ARM-based machines, the DMA ops will do the right thing,
> > -	 * so always use them with legacy devices.
> > +	 * On ARM-based machines, the coherent DMA ops will do the right
> > +	 * thing, so always use them with legacy devices. However, using
> > +	 * non-coherent DMA when the host *is* actually coherent, but has
> > +	 * forgotten to tell us, is going to break badly; since this situation
> > +	 * already exists in the wild, maintain the old behaviour there.
> >  	 */
> > -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> > +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
> > +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
> >  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
> >  
> >  	return false;
> 
> This is exactly what I feared.

Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
is used) and it also works on the fastmodel if you disable cache-modelling
(which is needed to make the thing run at a usable pace) so we didn't spot
this in testing.

> Could we identify fastboot and do the special dance just for it?

[assuming you mean fastmodel instead of fastboot]

> I'd like to do that instead. It's fastboot doing the unreasonable thing
> here and deviating from what every other legacy device without exception
> did for years. If this means fastboot will need to update to virtio 1,
> all the better.

The problem still exists with virtio 1, unless we require that the
"dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
is advertised by the device (which is what I suggested in my reply).

We can't detect the fastmodel, but we could implicitly treat virtio-mmio
devices as cache-coherent regardless of the "dma-coherent" flag. I already
prototyped this, but I suspect the devicetree people will push back (and
there's a similar patch needed for ACPI).

See below. Do you prefer this approach?

Will

--->8

From f6ad4e331c26e7ba53132c8cc74e26f782391570 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Mon, 30 Jan 2017 17:28:31 +0000
Subject: [PATCH] of/address: Allow devices to report DMA coherency based on
 compatible string

Some devices (e.g. virtio-mmio) are implicitly cache coherent with respect
to DMA operations and therefore do not mandate the use of "dma-coherent"
in their devicetree bindings. In order to ensure that these devices work
correctly when using the DMA API, we need to treat them specially in
of_dma_is_coherent by identifying them as unconditionally coherent.

This patch adds a static, table-based search against the compatible
string for the device in of_dma_is_coherent before walking the
hierarchy looking for "dma-coherent". This allows existing virtio-mmio
devices (e.g. those emulated by QEMU) to function correctly when placed
behind an IOMMU that requires use of the DMA ops to map the vring.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/of/address.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 02b2903fe9d2..af29b115b8aa 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -891,19 +891,47 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
 }
 EXPORT_SYMBOL_GPL(of_dma_get_range);
 
+/*
+ * DMA from some device types is always cache-coherent, and in some unfortunate
+ * cases the "dma-coherent" property is not used.
+ */
+static const char *of_device_dma_coherent_tbl[] = {
+	/*
+	 * Virtio MMIO devices are assumed to be cache-coherent when accessing
+	 * main memory. Neither QEMU nor kvmtool emit "dma-coherent" properties
+	 * for their generated virtio MMIO device nodes, and the binding
+	 * documentation doesn't mention them either. When using the DMA API
+	 * (e.g. because there is an IOMMU in the system), we must report true
+	 * here to avoid lockups where writes to the vring via a non-coherent
+	 * mapping are not made visible to the device emulation.
+	 */
+	"virtio,mmio",
+	NULL,
+};
+
 /**
  * of_dma_is_coherent - Check if device is coherent
  * @np:	device node
  *
  * It returns true if "dma-coherent" property was found
- * for this device in DT.
+ * for this device in DT or the device is statically known to be
+ * coherent.
  */
 bool of_dma_is_coherent(struct device_node *np)
 {
 	struct device_node *node = of_node_get(np);
 
+	/*
+	 * Check for implicit DMA coherence first, since we don't want
+	 * to inherit this.
+	 */
+	if (of_device_compatible_match(np, of_device_dma_coherent_tbl)) {
+		of_node_put(node);
+		return true;
+	}
+
 	while (node) {
-		if (of_property_read_bool(node, "dma-coherent")) {
+		if (of_property_read_bool(node, "dma-coherent")){
 			of_node_put(node);
 			return true;
 		}
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-01 18:27       ` Will Deacon
  0 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-01 18:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 7e38ed79c3fc..961af25b385c 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/virtio_ring.h>
> >  #include <linux/virtio_config.h>
> >  #include <linux/device.h>
> > +#include <linux/property.h>
> >  #include <linux/slab.h>
> >  #include <linux/module.h>
> >  #include <linux/hrtimer.h>
> > @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  		return true;
> >  
> >  	/*
> > -	 * On ARM-based machines, the DMA ops will do the right thing,
> > -	 * so always use them with legacy devices.
> > +	 * On ARM-based machines, the coherent DMA ops will do the right
> > +	 * thing, so always use them with legacy devices. However, using
> > +	 * non-coherent DMA when the host *is* actually coherent, but has
> > +	 * forgotten to tell us, is going to break badly; since this situation
> > +	 * already exists in the wild, maintain the old behaviour there.
> >  	 */
> > -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> > +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
> > +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
> >  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
> >  
> >  	return false;
> 
> This is exactly what I feared.

Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
is used) and it also works on the fastmodel if you disable cache-modelling
(which is needed to make the thing run at a usable pace) so we didn't spot
this in testing.

> Could we identify fastboot and do the special dance just for it?

[assuming you mean fastmodel instead of fastboot]

> I'd like to do that instead. It's fastboot doing the unreasonable thing
> here and deviating from what every other legacy device without exception
> did for years. If this means fastboot will need to update to virtio 1,
> all the better.

The problem still exists with virtio 1, unless we require that the
"dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
is advertised by the device (which is what I suggested in my reply).

We can't detect the fastmodel, but we could implicitly treat virtio-mmio
devices as cache-coherent regardless of the "dma-coherent" flag. I already
prototyped this, but I suspect the devicetree people will push back (and
there's a similar patch needed for ACPI).

See below. Do you prefer this approach?

Will

--->8

>From f6ad4e331c26e7ba53132c8cc74e26f782391570 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Mon, 30 Jan 2017 17:28:31 +0000
Subject: [PATCH] of/address: Allow devices to report DMA coherency based on
 compatible string

Some devices (e.g. virtio-mmio) are implicitly cache coherent with respect
to DMA operations and therefore do not mandate the use of "dma-coherent"
in their devicetree bindings. In order to ensure that these devices work
correctly when using the DMA API, we need to treat them specially in
of_dma_is_coherent by identifying them as unconditionally coherent.

This patch adds a static, table-based search against the compatible
string for the device in of_dma_is_coherent before walking the
hierarchy looking for "dma-coherent". This allows existing virtio-mmio
devices (e.g. those emulated by QEMU) to function correctly when placed
behind an IOMMU that requires use of the DMA ops to map the vring.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/of/address.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 02b2903fe9d2..af29b115b8aa 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -891,19 +891,47 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
 }
 EXPORT_SYMBOL_GPL(of_dma_get_range);
 
+/*
+ * DMA from some device types is always cache-coherent, and in some unfortunate
+ * cases the "dma-coherent" property is not used.
+ */
+static const char *of_device_dma_coherent_tbl[] = {
+	/*
+	 * Virtio MMIO devices are assumed to be cache-coherent when accessing
+	 * main memory. Neither QEMU nor kvmtool emit "dma-coherent" properties
+	 * for their generated virtio MMIO device nodes, and the binding
+	 * documentation doesn't mention them either. When using the DMA API
+	 * (e.g. because there is an IOMMU in the system), we must report true
+	 * here to avoid lockups where writes to the vring via a non-coherent
+	 * mapping are not made visible to the device emulation.
+	 */
+	"virtio,mmio",
+	NULL,
+};
+
 /**
  * of_dma_is_coherent - Check if device is coherent
  * @np:	device node
  *
  * It returns true if "dma-coherent" property was found
- * for this device in DT.
+ * for this device in DT or the device is statically known to be
+ * coherent.
  */
 bool of_dma_is_coherent(struct device_node *np)
 {
 	struct device_node *node = of_node_get(np);
 
+	/*
+	 * Check for implicit DMA coherence first, since we don't want
+	 * to inherit this.
+	 */
+	if (of_device_compatible_match(np, of_device_dma_coherent_tbl)) {
+		of_node_put(node);
+		return true;
+	}
+
 	while (node) {
-		if (of_property_read_bool(node, "dma-coherent")) {
+		if (of_property_read_bool(node, "dma-coherent")){
 			of_node_put(node);
 			return true;
 		}
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-01 18:27       ` Will Deacon
@ 2017-02-01 19:19           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-01 19:19 UTC (permalink / raw)
  To: Will Deacon
  Cc: Robin Murphy, jasowang-H+wXaHxf7aLQT0dZR+AlfA,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	pawel.moll-5wv7dgnIgG8

On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index 7e38ed79c3fc..961af25b385c 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -20,6 +20,7 @@
> > >  #include <linux/virtio_ring.h>
> > >  #include <linux/virtio_config.h>
> > >  #include <linux/device.h>
> > > +#include <linux/property.h>
> > >  #include <linux/slab.h>
> > >  #include <linux/module.h>
> > >  #include <linux/hrtimer.h>
> > > @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  		return true;
> > >  
> > >  	/*
> > > -	 * On ARM-based machines, the DMA ops will do the right thing,
> > > -	 * so always use them with legacy devices.
> > > +	 * On ARM-based machines, the coherent DMA ops will do the right
> > > +	 * thing, so always use them with legacy devices. However, using
> > > +	 * non-coherent DMA when the host *is* actually coherent, but has
> > > +	 * forgotten to tell us, is going to break badly; since this situation
> > > +	 * already exists in the wild, maintain the old behaviour there.
> > >  	 */
> > > -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> > > +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
> > > +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
> > >  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
> > >  
> > >  	return false;
> > 
> > This is exactly what I feared.
> 
> Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
> is used) and it also works on the fastmodel if you disable cache-modelling
> (which is needed to make the thing run at a usable pace) so we didn't spot
> this in testing.
> 
> > Could we identify fastboot and do the special dance just for it?
> 
> [assuming you mean fastmodel instead of fastboot]
> 
> > I'd like to do that instead. It's fastboot doing the unreasonable thing
> > here and deviating from what every other legacy device without exception
> > did for years. If this means fastboot will need to update to virtio 1,
> > all the better.
> 
> The problem still exists with virtio 1, unless we require that the
> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> is advertised by the device (which is what I suggested in my reply).

I'm not ignoring that, but I need to understand that part a bit better.
I'll reply to that patch in a day or two after looking at how _CCA is
supposed to work.

> We can't detect the fastmodel,

Surely, it puts a hardware id somewhere? I think you mean
fastmodel isn't always affected, right?

> but we could implicitly treat virtio-mmio
> devices as cache-coherent regardless of the "dma-coherent" flag. I already
> prototyped this, but I suspect the devicetree people will push back (and
> there's a similar patch needed for ACPI).
> 
> See below. Do you prefer this approach?
> 
> Will
> 
> --->8

I'd like to see basically

if (fastmodel)
	a pile of special work-arounds
else
	not less hacky but more common virtio work-arounds

:)

And then I can apply whatever comes from @arm.com and not
worry about breaking actual hardware.

> >From f6ad4e331c26e7ba53132c8cc74e26f782391570 Mon Sep 17 00:00:00 2001
> From: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
> Date: Mon, 30 Jan 2017 17:28:31 +0000
> Subject: [PATCH] of/address: Allow devices to report DMA coherency based on
>  compatible string
> 
> Some devices (e.g. virtio-mmio) are implicitly cache coherent with respect
> to DMA operations and therefore do not mandate the use of "dma-coherent"
> in their devicetree bindings. In order to ensure that these devices work
> correctly when using the DMA API, we need to treat them specially in
> of_dma_is_coherent by identifying them as unconditionally coherent.
> 
> This patch adds a static, table-based search against the compatible
> string for the device in of_dma_is_coherent before walking the
> hierarchy looking for "dma-coherent". This allows existing virtio-mmio
> devices (e.g. those emulated by QEMU) to function correctly when placed
> behind an IOMMU that requires use of the DMA ops to map the vring.
> 
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi-5wv7dgnIgG8@public.gmane.org>
> Cc: Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>
> Signed-off-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/of/address.c | 32 ++++++++++++++++++++++++++++++--
>  1 file changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 02b2903fe9d2..af29b115b8aa 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -891,19 +891,47 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
>  }
>  EXPORT_SYMBOL_GPL(of_dma_get_range);
>  
> +/*
> + * DMA from some device types is always cache-coherent, and in some unfortunate
> + * cases the "dma-coherent" property is not used.
> + */
> +static const char *of_device_dma_coherent_tbl[] = {
> +	/*
> +	 * Virtio MMIO devices are assumed to be cache-coherent when accessing
> +	 * main memory. Neither QEMU nor kvmtool emit "dma-coherent" properties
> +	 * for their generated virtio MMIO device nodes, and the binding
> +	 * documentation doesn't mention them either. When using the DMA API
> +	 * (e.g. because there is an IOMMU in the system), we must report true
> +	 * here to avoid lockups where writes to the vring via a non-coherent
> +	 * mapping are not made visible to the device emulation.
> +	 */
> +	"virtio,mmio",
> +	NULL,
> +};
> +
>  /**
>   * of_dma_is_coherent - Check if device is coherent
>   * @np:	device node
>   *
>   * It returns true if "dma-coherent" property was found
> - * for this device in DT.
> + * for this device in DT or the device is statically known to be
> + * coherent.
>   */
>  bool of_dma_is_coherent(struct device_node *np)
>  {
>  	struct device_node *node = of_node_get(np);
>  
> +	/*
> +	 * Check for implicit DMA coherence first, since we don't want
> +	 * to inherit this.
> +	 */
> +	if (of_device_compatible_match(np, of_device_dma_coherent_tbl)) {
> +		of_node_put(node);
> +		return true;
> +	}
> +
>  	while (node) {
> -		if (of_property_read_bool(node, "dma-coherent")) {
> +		if (of_property_read_bool(node, "dma-coherent")){
>  			of_node_put(node);
>  			return true;
>  		}
> -- 
> 2.1.4
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-01 18:27       ` Will Deacon
  (?)
  (?)
@ 2017-02-01 19:19       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-01 19:19 UTC (permalink / raw)
  To: Will Deacon
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index 7e38ed79c3fc..961af25b385c 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -20,6 +20,7 @@
> > >  #include <linux/virtio_ring.h>
> > >  #include <linux/virtio_config.h>
> > >  #include <linux/device.h>
> > > +#include <linux/property.h>
> > >  #include <linux/slab.h>
> > >  #include <linux/module.h>
> > >  #include <linux/hrtimer.h>
> > > @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  		return true;
> > >  
> > >  	/*
> > > -	 * On ARM-based machines, the DMA ops will do the right thing,
> > > -	 * so always use them with legacy devices.
> > > +	 * On ARM-based machines, the coherent DMA ops will do the right
> > > +	 * thing, so always use them with legacy devices. However, using
> > > +	 * non-coherent DMA when the host *is* actually coherent, but has
> > > +	 * forgotten to tell us, is going to break badly; since this situation
> > > +	 * already exists in the wild, maintain the old behaviour there.
> > >  	 */
> > > -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> > > +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
> > > +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
> > >  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
> > >  
> > >  	return false;
> > 
> > This is exactly what I feared.
> 
> Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
> is used) and it also works on the fastmodel if you disable cache-modelling
> (which is needed to make the thing run at a usable pace) so we didn't spot
> this in testing.
> 
> > Could we identify fastboot and do the special dance just for it?
> 
> [assuming you mean fastmodel instead of fastboot]
> 
> > I'd like to do that instead. It's fastboot doing the unreasonable thing
> > here and deviating from what every other legacy device without exception
> > did for years. If this means fastboot will need to update to virtio 1,
> > all the better.
> 
> The problem still exists with virtio 1, unless we require that the
> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> is advertised by the device (which is what I suggested in my reply).

I'm not ignoring that, but I need to understand that part a bit better.
I'll reply to that patch in a day or two after looking at how _CCA is
supposed to work.

> We can't detect the fastmodel,

Surely, it puts a hardware id somewhere? I think you mean
fastmodel isn't always affected, right?

> but we could implicitly treat virtio-mmio
> devices as cache-coherent regardless of the "dma-coherent" flag. I already
> prototyped this, but I suspect the devicetree people will push back (and
> there's a similar patch needed for ACPI).
> 
> See below. Do you prefer this approach?
> 
> Will
> 
> --->8

I'd like to see basically

if (fastmodel)
	a pile of special work-arounds
else
	not less hacky but more common virtio work-arounds

:)

And then I can apply whatever comes from @arm.com and not
worry about breaking actual hardware.

> >From f6ad4e331c26e7ba53132c8cc74e26f782391570 Mon Sep 17 00:00:00 2001
> From: Will Deacon <will.deacon@arm.com>
> Date: Mon, 30 Jan 2017 17:28:31 +0000
> Subject: [PATCH] of/address: Allow devices to report DMA coherency based on
>  compatible string
> 
> Some devices (e.g. virtio-mmio) are implicitly cache coherent with respect
> to DMA operations and therefore do not mandate the use of "dma-coherent"
> in their devicetree bindings. In order to ensure that these devices work
> correctly when using the DMA API, we need to treat them specially in
> of_dma_is_coherent by identifying them as unconditionally coherent.
> 
> This patch adds a static, table-based search against the compatible
> string for the device in of_dma_is_coherent before walking the
> hierarchy looking for "dma-coherent". This allows existing virtio-mmio
> devices (e.g. those emulated by QEMU) to function correctly when placed
> behind an IOMMU that requires use of the DMA ops to map the vring.
> 
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  drivers/of/address.c | 32 ++++++++++++++++++++++++++++++--
>  1 file changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 02b2903fe9d2..af29b115b8aa 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -891,19 +891,47 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
>  }
>  EXPORT_SYMBOL_GPL(of_dma_get_range);
>  
> +/*
> + * DMA from some device types is always cache-coherent, and in some unfortunate
> + * cases the "dma-coherent" property is not used.
> + */
> +static const char *of_device_dma_coherent_tbl[] = {
> +	/*
> +	 * Virtio MMIO devices are assumed to be cache-coherent when accessing
> +	 * main memory. Neither QEMU nor kvmtool emit "dma-coherent" properties
> +	 * for their generated virtio MMIO device nodes, and the binding
> +	 * documentation doesn't mention them either. When using the DMA API
> +	 * (e.g. because there is an IOMMU in the system), we must report true
> +	 * here to avoid lockups where writes to the vring via a non-coherent
> +	 * mapping are not made visible to the device emulation.
> +	 */
> +	"virtio,mmio",
> +	NULL,
> +};
> +
>  /**
>   * of_dma_is_coherent - Check if device is coherent
>   * @np:	device node
>   *
>   * It returns true if "dma-coherent" property was found
> - * for this device in DT.
> + * for this device in DT or the device is statically known to be
> + * coherent.
>   */
>  bool of_dma_is_coherent(struct device_node *np)
>  {
>  	struct device_node *node = of_node_get(np);
>  
> +	/*
> +	 * Check for implicit DMA coherence first, since we don't want
> +	 * to inherit this.
> +	 */
> +	if (of_device_compatible_match(np, of_device_dma_coherent_tbl)) {
> +		of_node_put(node);
> +		return true;
> +	}
> +
>  	while (node) {
> -		if (of_property_read_bool(node, "dma-coherent")) {
> +		if (of_property_read_bool(node, "dma-coherent")){
>  			of_node_put(node);
>  			return true;
>  		}
> -- 
> 2.1.4

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-01 19:19           ` Michael S. Tsirkin
  0 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-01 19:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index 7e38ed79c3fc..961af25b385c 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -20,6 +20,7 @@
> > >  #include <linux/virtio_ring.h>
> > >  #include <linux/virtio_config.h>
> > >  #include <linux/device.h>
> > > +#include <linux/property.h>
> > >  #include <linux/slab.h>
> > >  #include <linux/module.h>
> > >  #include <linux/hrtimer.h>
> > > @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  		return true;
> > >  
> > >  	/*
> > > -	 * On ARM-based machines, the DMA ops will do the right thing,
> > > -	 * so always use them with legacy devices.
> > > +	 * On ARM-based machines, the coherent DMA ops will do the right
> > > +	 * thing, so always use them with legacy devices. However, using
> > > +	 * non-coherent DMA when the host *is* actually coherent, but has
> > > +	 * forgotten to tell us, is going to break badly; since this situation
> > > +	 * already exists in the wild, maintain the old behaviour there.
> > >  	 */
> > > -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> > > +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
> > > +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
> > >  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
> > >  
> > >  	return false;
> > 
> > This is exactly what I feared.
> 
> Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
> is used) and it also works on the fastmodel if you disable cache-modelling
> (which is needed to make the thing run at a usable pace) so we didn't spot
> this in testing.
> 
> > Could we identify fastboot and do the special dance just for it?
> 
> [assuming you mean fastmodel instead of fastboot]
> 
> > I'd like to do that instead. It's fastboot doing the unreasonable thing
> > here and deviating from what every other legacy device without exception
> > did for years. If this means fastboot will need to update to virtio 1,
> > all the better.
> 
> The problem still exists with virtio 1, unless we require that the
> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> is advertised by the device (which is what I suggested in my reply).

I'm not ignoring that, but I need to understand that part a bit better.
I'll reply to that patch in a day or two after looking at how _CCA is
supposed to work.

> We can't detect the fastmodel,

Surely, it puts a hardware id somewhere? I think you mean
fastmodel isn't always affected, right?

> but we could implicitly treat virtio-mmio
> devices as cache-coherent regardless of the "dma-coherent" flag. I already
> prototyped this, but I suspect the devicetree people will push back (and
> there's a similar patch needed for ACPI).
> 
> See below. Do you prefer this approach?
> 
> Will
> 
> --->8

I'd like to see basically

if (fastmodel)
	a pile of special work-arounds
else
	not less hacky but more common virtio work-arounds

:)

And then I can apply whatever comes from @arm.com and not
worry about breaking actual hardware.

> >From f6ad4e331c26e7ba53132c8cc74e26f782391570 Mon Sep 17 00:00:00 2001
> From: Will Deacon <will.deacon@arm.com>
> Date: Mon, 30 Jan 2017 17:28:31 +0000
> Subject: [PATCH] of/address: Allow devices to report DMA coherency based on
>  compatible string
> 
> Some devices (e.g. virtio-mmio) are implicitly cache coherent with respect
> to DMA operations and therefore do not mandate the use of "dma-coherent"
> in their devicetree bindings. In order to ensure that these devices work
> correctly when using the DMA API, we need to treat them specially in
> of_dma_is_coherent by identifying them as unconditionally coherent.
> 
> This patch adds a static, table-based search against the compatible
> string for the device in of_dma_is_coherent before walking the
> hierarchy looking for "dma-coherent". This allows existing virtio-mmio
> devices (e.g. those emulated by QEMU) to function correctly when placed
> behind an IOMMU that requires use of the DMA ops to map the vring.
> 
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  drivers/of/address.c | 32 ++++++++++++++++++++++++++++++--
>  1 file changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 02b2903fe9d2..af29b115b8aa 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -891,19 +891,47 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
>  }
>  EXPORT_SYMBOL_GPL(of_dma_get_range);
>  
> +/*
> + * DMA from some device types is always cache-coherent, and in some unfortunate
> + * cases the "dma-coherent" property is not used.
> + */
> +static const char *of_device_dma_coherent_tbl[] = {
> +	/*
> +	 * Virtio MMIO devices are assumed to be cache-coherent when accessing
> +	 * main memory. Neither QEMU nor kvmtool emit "dma-coherent" properties
> +	 * for their generated virtio MMIO device nodes, and the binding
> +	 * documentation doesn't mention them either. When using the DMA API
> +	 * (e.g. because there is an IOMMU in the system), we must report true
> +	 * here to avoid lockups where writes to the vring via a non-coherent
> +	 * mapping are not made visible to the device emulation.
> +	 */
> +	"virtio,mmio",
> +	NULL,
> +};
> +
>  /**
>   * of_dma_is_coherent - Check if device is coherent
>   * @np:	device node
>   *
>   * It returns true if "dma-coherent" property was found
> - * for this device in DT.
> + * for this device in DT or the device is statically known to be
> + * coherent.
>   */
>  bool of_dma_is_coherent(struct device_node *np)
>  {
>  	struct device_node *node = of_node_get(np);
>  
> +	/*
> +	 * Check for implicit DMA coherence first, since we don't want
> +	 * to inherit this.
> +	 */
> +	if (of_device_compatible_match(np, of_device_dma_coherent_tbl)) {
> +		of_node_put(node);
> +		return true;
> +	}
> +
>  	while (node) {
> -		if (of_property_read_bool(node, "dma-coherent")) {
> +		if (of_property_read_bool(node, "dma-coherent")){
>  			of_node_put(node);
>  			return true;
>  		}
> -- 
> 2.1.4

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-01 19:19           ` Michael S. Tsirkin
@ 2017-02-02 11:26               ` Will Deacon
  -1 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-02 11:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Robin Murphy, jasowang-H+wXaHxf7aLQT0dZR+AlfA,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	pawel.moll-5wv7dgnIgG8

On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> > > I'd like to do that instead. It's fastboot doing the unreasonable thing
> > > here and deviating from what every other legacy device without exception
> > > did for years. If this means fastboot will need to update to virtio 1,
> > > all the better.
> > 
> > The problem still exists with virtio 1, unless we require that the
> > "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> > is advertised by the device (which is what I suggested in my reply).
> 
> I'm not ignoring that, but I need to understand that part a bit better.
> I'll reply to that patch in a day or two after looking at how _CCA is
> supposed to work.

Thanks. I do think that whatever solution we come up with for virtio 1
should influence what we do for legacy.

> > We can't detect the fastmodel,
> 
> Surely, it puts a hardware id somewhere? I think you mean
> fastmodel isn't always affected, right?

I don't think there's a hardware ID. The thing is, the fastmodel is a
toolkit for building all sorts of platforms: you can chop and change
the CPUs, the peripherals, the memory, the interrupt controller, the
interconnect etc. Pretty much everything can be customised. So, for
any fastmodel configuration that places virtio upstream of the SMMU
(which is common, because virtio is one of the few DMA-capable peripherals
that the fastmodel supports), we need to do something special.

> I'd like to see basically
> 
> if (fastmodel)
> 	a pile of special work-arounds
> else
> 	not less hacky but more common virtio work-arounds
> 
> :)
> 
> And then I can apply whatever comes from @arm.com and not
> worry about breaking actual hardware.

What we could do is call iommu_group_get(&vdev->dev) for legacy
devices if CONFIG_ARM64. If that returns non-NULL, then we know that
the device is upstream of an SMMU, which means it must be the fastmodel.

Will
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-01 19:19           ` Michael S. Tsirkin
  (?)
  (?)
@ 2017-02-02 11:26           ` Will Deacon
  -1 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-02 11:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> > > I'd like to do that instead. It's fastboot doing the unreasonable thing
> > > here and deviating from what every other legacy device without exception
> > > did for years. If this means fastboot will need to update to virtio 1,
> > > all the better.
> > 
> > The problem still exists with virtio 1, unless we require that the
> > "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> > is advertised by the device (which is what I suggested in my reply).
> 
> I'm not ignoring that, but I need to understand that part a bit better.
> I'll reply to that patch in a day or two after looking at how _CCA is
> supposed to work.

Thanks. I do think that whatever solution we come up with for virtio 1
should influence what we do for legacy.

> > We can't detect the fastmodel,
> 
> Surely, it puts a hardware id somewhere? I think you mean
> fastmodel isn't always affected, right?

I don't think there's a hardware ID. The thing is, the fastmodel is a
toolkit for building all sorts of platforms: you can chop and change
the CPUs, the peripherals, the memory, the interrupt controller, the
interconnect etc. Pretty much everything can be customised. So, for
any fastmodel configuration that places virtio upstream of the SMMU
(which is common, because virtio is one of the few DMA-capable peripherals
that the fastmodel supports), we need to do something special.

> I'd like to see basically
> 
> if (fastmodel)
> 	a pile of special work-arounds
> else
> 	not less hacky but more common virtio work-arounds
> 
> :)
> 
> And then I can apply whatever comes from @arm.com and not
> worry about breaking actual hardware.

What we could do is call iommu_group_get(&vdev->dev) for legacy
devices if CONFIG_ARM64. If that returns non-NULL, then we know that
the device is upstream of an SMMU, which means it must be the fastmodel.

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-02 11:26               ` Will Deacon
  0 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-02 11:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> > > I'd like to do that instead. It's fastboot doing the unreasonable thing
> > > here and deviating from what every other legacy device without exception
> > > did for years. If this means fastboot will need to update to virtio 1,
> > > all the better.
> > 
> > The problem still exists with virtio 1, unless we require that the
> > "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> > is advertised by the device (which is what I suggested in my reply).
> 
> I'm not ignoring that, but I need to understand that part a bit better.
> I'll reply to that patch in a day or two after looking at how _CCA is
> supposed to work.

Thanks. I do think that whatever solution we come up with for virtio 1
should influence what we do for legacy.

> > We can't detect the fastmodel,
> 
> Surely, it puts a hardware id somewhere? I think you mean
> fastmodel isn't always affected, right?

I don't think there's a hardware ID. The thing is, the fastmodel is a
toolkit for building all sorts of platforms: you can chop and change
the CPUs, the peripherals, the memory, the interrupt controller, the
interconnect etc. Pretty much everything can be customised. So, for
any fastmodel configuration that places virtio upstream of the SMMU
(which is common, because virtio is one of the few DMA-capable peripherals
that the fastmodel supports), we need to do something special.

> I'd like to see basically
> 
> if (fastmodel)
> 	a pile of special work-arounds
> else
> 	not less hacky but more common virtio work-arounds
> 
> :)
> 
> And then I can apply whatever comes from @arm.com and not
> worry about breaking actual hardware.

What we could do is call iommu_group_get(&vdev->dev) for legacy
devices if CONFIG_ARM64. If that returns non-NULL, then we know that
the device is upstream of an SMMU, which means it must be the fastmodel.

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 11:26               ` Will Deacon
@ 2017-02-02 13:34                 ` Robin Murphy
  -1 siblings, 0 replies; 59+ messages in thread
From: Robin Murphy @ 2017-02-02 13:34 UTC (permalink / raw)
  To: Will Deacon, Michael S. Tsirkin
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	linux-arm-kernel

On 02/02/17 11:26, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
>> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
>>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
>>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
>>>> here and deviating from what every other legacy device without exception
>>>> did for years. If this means fastboot will need to update to virtio 1,
>>>> all the better.
>>>
>>> The problem still exists with virtio 1, unless we require that the
>>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
>>> is advertised by the device (which is what I suggested in my reply).
>>
>> I'm not ignoring that, but I need to understand that part a bit better.
>> I'll reply to that patch in a day or two after looking at how _CCA is
>> supposed to work.
> 
> Thanks. I do think that whatever solution we come up with for virtio 1
> should influence what we do for legacy.
> 
>>> We can't detect the fastmodel,
>>
>> Surely, it puts a hardware id somewhere? I think you mean
>> fastmodel isn't always affected, right?
> 
> I don't think there's a hardware ID. The thing is, the fastmodel is a
> toolkit for building all sorts of platforms: you can chop and change
> the CPUs, the peripherals, the memory, the interrupt controller, the
> interconnect etc. Pretty much everything can be customised. So, for
> any fastmodel configuration that places virtio upstream of the SMMU
> (which is common, because virtio is one of the few DMA-capable peripherals
> that the fastmodel supports), we need to do something special.
> 
>> I'd like to see basically
>>
>> if (fastmodel)
>> 	a pile of special work-arounds
>> else
>> 	not less hacky but more common virtio work-arounds
>>
>> :)
>>
>> And then I can apply whatever comes from @arm.com and not
>> worry about breaking actual hardware.
> 
> What we could do is call iommu_group_get(&vdev->dev) for legacy

Actually, that should be vdev->dev.parent - I'm now not sure quite what
I managed to successfully test yesterday, but apparently it wasn't this
patch :(

> devices if CONFIG_ARM64. If that returns non-NULL, then we know that
> the device is upstream of an SMMU, which means it must be the fastmodel.

We can boot 32-bit kernels on models, so I'd be inclined to keep
CONFIG_ARM included, but I do tend to agree that explicitly checking for
an IOMMU is probably the cleanest approach if we reposition this as a
more specific quirk. I'll split apart the "Fast Models are wacky" vs.
"uh-oh device coherency" aspects and spin a v2 so that we can
(hopefully) sort the regression out ASAP.

Robin.

> 
> Will
> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-02 13:34                 ` Robin Murphy
  0 siblings, 0 replies; 59+ messages in thread
From: Robin Murphy @ 2017-02-02 13:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/02/17 11:26, Will Deacon wrote:
> On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
>> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
>>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
>>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
>>>> here and deviating from what every other legacy device without exception
>>>> did for years. If this means fastboot will need to update to virtio 1,
>>>> all the better.
>>>
>>> The problem still exists with virtio 1, unless we require that the
>>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
>>> is advertised by the device (which is what I suggested in my reply).
>>
>> I'm not ignoring that, but I need to understand that part a bit better.
>> I'll reply to that patch in a day or two after looking at how _CCA is
>> supposed to work.
> 
> Thanks. I do think that whatever solution we come up with for virtio 1
> should influence what we do for legacy.
> 
>>> We can't detect the fastmodel,
>>
>> Surely, it puts a hardware id somewhere? I think you mean
>> fastmodel isn't always affected, right?
> 
> I don't think there's a hardware ID. The thing is, the fastmodel is a
> toolkit for building all sorts of platforms: you can chop and change
> the CPUs, the peripherals, the memory, the interrupt controller, the
> interconnect etc. Pretty much everything can be customised. So, for
> any fastmodel configuration that places virtio upstream of the SMMU
> (which is common, because virtio is one of the few DMA-capable peripherals
> that the fastmodel supports), we need to do something special.
> 
>> I'd like to see basically
>>
>> if (fastmodel)
>> 	a pile of special work-arounds
>> else
>> 	not less hacky but more common virtio work-arounds
>>
>> :)
>>
>> And then I can apply whatever comes from @arm.com and not
>> worry about breaking actual hardware.
> 
> What we could do is call iommu_group_get(&vdev->dev) for legacy

Actually, that should be vdev->dev.parent - I'm now not sure quite what
I managed to successfully test yesterday, but apparently it wasn't this
patch :(

> devices if CONFIG_ARM64. If that returns non-NULL, then we know that
> the device is upstream of an SMMU, which means it must be the fastmodel.

We can boot 32-bit kernels on models, so I'd be inclined to keep
CONFIG_ARM included, but I do tend to agree that explicitly checking for
an IOMMU is probably the cleanest approach if we reposition this as a
more specific quirk. I'll split apart the "Fast Models are wacky" vs.
"uh-oh device coherency" aspects and spin a v2 so that we can
(hopefully) sort the regression out ASAP.

Robin.

> 
> Will
> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 13:34                 ` Robin Murphy
@ 2017-02-02 16:16                     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:16 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Will Deacon, jasowang-H+wXaHxf7aLQT0dZR+AlfA,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	pawel.moll-5wv7dgnIgG8

On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
> On 02/02/17 11:26, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> >> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> >>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> >>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
> >>>> here and deviating from what every other legacy device without exception
> >>>> did for years. If this means fastboot will need to update to virtio 1,
> >>>> all the better.
> >>>
> >>> The problem still exists with virtio 1, unless we require that the
> >>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> >>> is advertised by the device (which is what I suggested in my reply).
> >>
> >> I'm not ignoring that, but I need to understand that part a bit better.
> >> I'll reply to that patch in a day or two after looking at how _CCA is
> >> supposed to work.
> > 
> > Thanks. I do think that whatever solution we come up with for virtio 1
> > should influence what we do for legacy.
> > 
> >>> We can't detect the fastmodel,
> >>
> >> Surely, it puts a hardware id somewhere? I think you mean
> >> fastmodel isn't always affected, right?
> > 
> > I don't think there's a hardware ID. The thing is, the fastmodel is a
> > toolkit for building all sorts of platforms: you can chop and change
> > the CPUs, the peripherals, the memory, the interrupt controller, the
> > interconnect etc. Pretty much everything can be customised. So, for
> > any fastmodel configuration that places virtio upstream of the SMMU
> > (which is common, because virtio is one of the few DMA-capable peripherals
> > that the fastmodel supports), we need to do something special.
> > 
> >> I'd like to see basically
> >>
> >> if (fastmodel)
> >> 	a pile of special work-arounds
> >> else
> >> 	not less hacky but more common virtio work-arounds
> >>
> >> :)
> >>
> >> And then I can apply whatever comes from @arm.com and not
> >> worry about breaking actual hardware.
> > 
> > What we could do is call iommu_group_get(&vdev->dev) for legacy
> 
> Actually, that should be vdev->dev.parent - I'm now not sure quite what
> I managed to successfully test yesterday, but apparently it wasn't this
> patch :(
> 
> > devices if CONFIG_ARM64. If that returns non-NULL, then we know that
> > the device is upstream of an SMMU, which means it must be the fastmodel.
> 
> We can boot 32-bit kernels on models, so I'd be inclined to keep
> CONFIG_ARM included, but I do tend to agree that explicitly checking for
> an IOMMU is probably the cleanest approach if we reposition this as a
> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
> "uh-oh device coherency" aspects and spin a v2 so that we can
> (hopefully) sort the regression out ASAP.
> 
> Robin.
> 
> > 
> > Will
> > 

I still think the right thing to do for this platform is to add an API
for driver to say "disable protection for this device and allow
this device 1:1 access to all memory".  This
would make both platforms which bypass the iommu and platforms that
don't happy as PA == dma address.

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 13:34                 ` Robin Murphy
  (?)
@ 2017-02-02 16:16                 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:16 UTC (permalink / raw)
  To: Robin Murphy
  Cc: mark.rutland, devicetree, pawel.moll, Will Deacon,
	virtualization, robh+dt, linux-arm-kernel

On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
> On 02/02/17 11:26, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> >> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> >>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> >>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
> >>>> here and deviating from what every other legacy device without exception
> >>>> did for years. If this means fastboot will need to update to virtio 1,
> >>>> all the better.
> >>>
> >>> The problem still exists with virtio 1, unless we require that the
> >>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> >>> is advertised by the device (which is what I suggested in my reply).
> >>
> >> I'm not ignoring that, but I need to understand that part a bit better.
> >> I'll reply to that patch in a day or two after looking at how _CCA is
> >> supposed to work.
> > 
> > Thanks. I do think that whatever solution we come up with for virtio 1
> > should influence what we do for legacy.
> > 
> >>> We can't detect the fastmodel,
> >>
> >> Surely, it puts a hardware id somewhere? I think you mean
> >> fastmodel isn't always affected, right?
> > 
> > I don't think there's a hardware ID. The thing is, the fastmodel is a
> > toolkit for building all sorts of platforms: you can chop and change
> > the CPUs, the peripherals, the memory, the interrupt controller, the
> > interconnect etc. Pretty much everything can be customised. So, for
> > any fastmodel configuration that places virtio upstream of the SMMU
> > (which is common, because virtio is one of the few DMA-capable peripherals
> > that the fastmodel supports), we need to do something special.
> > 
> >> I'd like to see basically
> >>
> >> if (fastmodel)
> >> 	a pile of special work-arounds
> >> else
> >> 	not less hacky but more common virtio work-arounds
> >>
> >> :)
> >>
> >> And then I can apply whatever comes from @arm.com and not
> >> worry about breaking actual hardware.
> > 
> > What we could do is call iommu_group_get(&vdev->dev) for legacy
> 
> Actually, that should be vdev->dev.parent - I'm now not sure quite what
> I managed to successfully test yesterday, but apparently it wasn't this
> patch :(
> 
> > devices if CONFIG_ARM64. If that returns non-NULL, then we know that
> > the device is upstream of an SMMU, which means it must be the fastmodel.
> 
> We can boot 32-bit kernels on models, so I'd be inclined to keep
> CONFIG_ARM included, but I do tend to agree that explicitly checking for
> an IOMMU is probably the cleanest approach if we reposition this as a
> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
> "uh-oh device coherency" aspects and spin a v2 so that we can
> (hopefully) sort the regression out ASAP.
> 
> Robin.
> 
> > 
> > Will
> > 

I still think the right thing to do for this platform is to add an API
for driver to say "disable protection for this device and allow
this device 1:1 access to all memory".  This
would make both platforms which bypass the iommu and platforms that
don't happy as PA == dma address.

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-02 16:16                     ` Michael S. Tsirkin
  0 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
> On 02/02/17 11:26, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> >> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> >>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> >>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
> >>>> here and deviating from what every other legacy device without exception
> >>>> did for years. If this means fastboot will need to update to virtio 1,
> >>>> all the better.
> >>>
> >>> The problem still exists with virtio 1, unless we require that the
> >>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> >>> is advertised by the device (which is what I suggested in my reply).
> >>
> >> I'm not ignoring that, but I need to understand that part a bit better.
> >> I'll reply to that patch in a day or two after looking at how _CCA is
> >> supposed to work.
> > 
> > Thanks. I do think that whatever solution we come up with for virtio 1
> > should influence what we do for legacy.
> > 
> >>> We can't detect the fastmodel,
> >>
> >> Surely, it puts a hardware id somewhere? I think you mean
> >> fastmodel isn't always affected, right?
> > 
> > I don't think there's a hardware ID. The thing is, the fastmodel is a
> > toolkit for building all sorts of platforms: you can chop and change
> > the CPUs, the peripherals, the memory, the interrupt controller, the
> > interconnect etc. Pretty much everything can be customised. So, for
> > any fastmodel configuration that places virtio upstream of the SMMU
> > (which is common, because virtio is one of the few DMA-capable peripherals
> > that the fastmodel supports), we need to do something special.
> > 
> >> I'd like to see basically
> >>
> >> if (fastmodel)
> >> 	a pile of special work-arounds
> >> else
> >> 	not less hacky but more common virtio work-arounds
> >>
> >> :)
> >>
> >> And then I can apply whatever comes from @arm.com and not
> >> worry about breaking actual hardware.
> > 
> > What we could do is call iommu_group_get(&vdev->dev) for legacy
> 
> Actually, that should be vdev->dev.parent - I'm now not sure quite what
> I managed to successfully test yesterday, but apparently it wasn't this
> patch :(
> 
> > devices if CONFIG_ARM64. If that returns non-NULL, then we know that
> > the device is upstream of an SMMU, which means it must be the fastmodel.
> 
> We can boot 32-bit kernels on models, so I'd be inclined to keep
> CONFIG_ARM included, but I do tend to agree that explicitly checking for
> an IOMMU is probably the cleanest approach if we reposition this as a
> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
> "uh-oh device coherency" aspects and spin a v2 so that we can
> (hopefully) sort the regression out ASAP.
> 
> Robin.
> 
> > 
> > Will
> > 

I still think the right thing to do for this platform is to add an API
for driver to say "disable protection for this device and allow
this device 1:1 access to all memory".  This
would make both platforms which bypass the iommu and platforms that
don't happy as PA == dma address.

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 13:34                 ` Robin Murphy
@ 2017-02-02 16:30                     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:30 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Will Deacon, jasowang-H+wXaHxf7aLQT0dZR+AlfA,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	pawel.moll-5wv7dgnIgG8

On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
> On 02/02/17 11:26, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> >> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> >>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> >>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
> >>>> here and deviating from what every other legacy device without exception
> >>>> did for years. If this means fastboot will need to update to virtio 1,
> >>>> all the better.
> >>>
> >>> The problem still exists with virtio 1, unless we require that the
> >>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> >>> is advertised by the device (which is what I suggested in my reply).
> >>
> >> I'm not ignoring that, but I need to understand that part a bit better.
> >> I'll reply to that patch in a day or two after looking at how _CCA is
> >> supposed to work.
> > 
> > Thanks. I do think that whatever solution we come up with for virtio 1
> > should influence what we do for legacy.
> > 
> >>> We can't detect the fastmodel,
> >>
> >> Surely, it puts a hardware id somewhere? I think you mean
> >> fastmodel isn't always affected, right?
> > 
> > I don't think there's a hardware ID. The thing is, the fastmodel is a
> > toolkit for building all sorts of platforms: you can chop and change
> > the CPUs, the peripherals, the memory, the interrupt controller, the
> > interconnect etc. Pretty much everything can be customised. So, for
> > any fastmodel configuration that places virtio upstream of the SMMU
> > (which is common, because virtio is one of the few DMA-capable peripherals
> > that the fastmodel supports), we need to do something special.
> > 
> >> I'd like to see basically
> >>
> >> if (fastmodel)
> >> 	a pile of special work-arounds
> >> else
> >> 	not less hacky but more common virtio work-arounds
> >>
> >> :)
> >>
> >> And then I can apply whatever comes from @arm.com and not
> >> worry about breaking actual hardware.
> > 
> > What we could do is call iommu_group_get(&vdev->dev) for legacy
> 
> Actually, that should be vdev->dev.parent - I'm now not sure quite what
> I managed to successfully test yesterday, but apparently it wasn't this
> patch :(
> 
> > devices if CONFIG_ARM64. If that returns non-NULL, then we know that
> > the device is upstream of an SMMU, which means it must be the fastmodel.
> 
> We can boot 32-bit kernels on models, so I'd be inclined to keep
> CONFIG_ARM included, but I do tend to agree that explicitly checking for
> an IOMMU is probably the cleanest approach if we reposition this as a
> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
> "uh-oh device coherency" aspects and spin a v2 so that we can
> (hopefully) sort the regression out ASAP.
> 
> Robin.

I am inclined to say, for 4.10 let's revert
c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
regression in 4.10.  So I think we can defer the fix to 4.11.
I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
for hosts with virtio 1 support.

All this will hopefully push hosts to just implement virtio 1.
For mmio the changes are very small: several new registers,
that's all. You want this for proper 64 bit dma mask anyway.

> > 
> > Will
> > 
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 13:34                 ` Robin Murphy
  (?)
  (?)
@ 2017-02-02 16:30                 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:30 UTC (permalink / raw)
  To: Robin Murphy
  Cc: mark.rutland, devicetree, pawel.moll, Will Deacon,
	virtualization, robh+dt, linux-arm-kernel

On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
> On 02/02/17 11:26, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> >> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> >>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> >>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
> >>>> here and deviating from what every other legacy device without exception
> >>>> did for years. If this means fastboot will need to update to virtio 1,
> >>>> all the better.
> >>>
> >>> The problem still exists with virtio 1, unless we require that the
> >>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> >>> is advertised by the device (which is what I suggested in my reply).
> >>
> >> I'm not ignoring that, but I need to understand that part a bit better.
> >> I'll reply to that patch in a day or two after looking at how _CCA is
> >> supposed to work.
> > 
> > Thanks. I do think that whatever solution we come up with for virtio 1
> > should influence what we do for legacy.
> > 
> >>> We can't detect the fastmodel,
> >>
> >> Surely, it puts a hardware id somewhere? I think you mean
> >> fastmodel isn't always affected, right?
> > 
> > I don't think there's a hardware ID. The thing is, the fastmodel is a
> > toolkit for building all sorts of platforms: you can chop and change
> > the CPUs, the peripherals, the memory, the interrupt controller, the
> > interconnect etc. Pretty much everything can be customised. So, for
> > any fastmodel configuration that places virtio upstream of the SMMU
> > (which is common, because virtio is one of the few DMA-capable peripherals
> > that the fastmodel supports), we need to do something special.
> > 
> >> I'd like to see basically
> >>
> >> if (fastmodel)
> >> 	a pile of special work-arounds
> >> else
> >> 	not less hacky but more common virtio work-arounds
> >>
> >> :)
> >>
> >> And then I can apply whatever comes from @arm.com and not
> >> worry about breaking actual hardware.
> > 
> > What we could do is call iommu_group_get(&vdev->dev) for legacy
> 
> Actually, that should be vdev->dev.parent - I'm now not sure quite what
> I managed to successfully test yesterday, but apparently it wasn't this
> patch :(
> 
> > devices if CONFIG_ARM64. If that returns non-NULL, then we know that
> > the device is upstream of an SMMU, which means it must be the fastmodel.
> 
> We can boot 32-bit kernels on models, so I'd be inclined to keep
> CONFIG_ARM included, but I do tend to agree that explicitly checking for
> an IOMMU is probably the cleanest approach if we reposition this as a
> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
> "uh-oh device coherency" aspects and spin a v2 so that we can
> (hopefully) sort the regression out ASAP.
> 
> Robin.

I am inclined to say, for 4.10 let's revert
c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
regression in 4.10.  So I think we can defer the fix to 4.11.
I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
for hosts with virtio 1 support.

All this will hopefully push hosts to just implement virtio 1.
For mmio the changes are very small: several new registers,
that's all. You want this for proper 64 bit dma mask anyway.

> > 
> > Will
> > 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-02 16:30                     ` Michael S. Tsirkin
  0 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
> On 02/02/17 11:26, Will Deacon wrote:
> > On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> >> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> >>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> >>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
> >>>> here and deviating from what every other legacy device without exception
> >>>> did for years. If this means fastboot will need to update to virtio 1,
> >>>> all the better.
> >>>
> >>> The problem still exists with virtio 1, unless we require that the
> >>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> >>> is advertised by the device (which is what I suggested in my reply).
> >>
> >> I'm not ignoring that, but I need to understand that part a bit better.
> >> I'll reply to that patch in a day or two after looking at how _CCA is
> >> supposed to work.
> > 
> > Thanks. I do think that whatever solution we come up with for virtio 1
> > should influence what we do for legacy.
> > 
> >>> We can't detect the fastmodel,
> >>
> >> Surely, it puts a hardware id somewhere? I think you mean
> >> fastmodel isn't always affected, right?
> > 
> > I don't think there's a hardware ID. The thing is, the fastmodel is a
> > toolkit for building all sorts of platforms: you can chop and change
> > the CPUs, the peripherals, the memory, the interrupt controller, the
> > interconnect etc. Pretty much everything can be customised. So, for
> > any fastmodel configuration that places virtio upstream of the SMMU
> > (which is common, because virtio is one of the few DMA-capable peripherals
> > that the fastmodel supports), we need to do something special.
> > 
> >> I'd like to see basically
> >>
> >> if (fastmodel)
> >> 	a pile of special work-arounds
> >> else
> >> 	not less hacky but more common virtio work-arounds
> >>
> >> :)
> >>
> >> And then I can apply whatever comes from @arm.com and not
> >> worry about breaking actual hardware.
> > 
> > What we could do is call iommu_group_get(&vdev->dev) for legacy
> 
> Actually, that should be vdev->dev.parent - I'm now not sure quite what
> I managed to successfully test yesterday, but apparently it wasn't this
> patch :(
> 
> > devices if CONFIG_ARM64. If that returns non-NULL, then we know that
> > the device is upstream of an SMMU, which means it must be the fastmodel.
> 
> We can boot 32-bit kernels on models, so I'd be inclined to keep
> CONFIG_ARM included, but I do tend to agree that explicitly checking for
> an IOMMU is probably the cleanest approach if we reposition this as a
> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
> "uh-oh device coherency" aspects and spin a v2 so that we can
> (hopefully) sort the regression out ASAP.
> 
> Robin.

I am inclined to say, for 4.10 let's revert
c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
regression in 4.10.  So I think we can defer the fix to 4.11.
I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
for hosts with virtio 1 support.

All this will hopefully push hosts to just implement virtio 1.
For mmio the changes are very small: several new registers,
that's all. You want this for proper 64 bit dma mask anyway.

> > 
> > Will
> > 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 16:16                     ` Michael S. Tsirkin
@ 2017-02-02 16:39                         ` Marc Zyngier
  -1 siblings, 0 replies; 59+ messages in thread
From: Marc Zyngier @ 2017-02-02 16:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, Robin Murphy
  Cc: mark.rutland-5wv7dgnIgG8, devicetree-u79uwXL29TY76Z2rM5mHXA,
	pawel.moll-5wv7dgnIgG8, Will Deacon,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On 02/02/17 16:16, Michael S. Tsirkin wrote:
> On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
>> On 02/02/17 11:26, Will Deacon wrote:
>>> On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
>>>> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
>>>>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
>>>>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
>>>>>> here and deviating from what every other legacy device without exception
>>>>>> did for years. If this means fastboot will need to update to virtio 1,
>>>>>> all the better.
>>>>>
>>>>> The problem still exists with virtio 1, unless we require that the
>>>>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
>>>>> is advertised by the device (which is what I suggested in my reply).
>>>>
>>>> I'm not ignoring that, but I need to understand that part a bit better.
>>>> I'll reply to that patch in a day or two after looking at how _CCA is
>>>> supposed to work.
>>>
>>> Thanks. I do think that whatever solution we come up with for virtio 1
>>> should influence what we do for legacy.
>>>
>>>>> We can't detect the fastmodel,
>>>>
>>>> Surely, it puts a hardware id somewhere? I think you mean
>>>> fastmodel isn't always affected, right?
>>>
>>> I don't think there's a hardware ID. The thing is, the fastmodel is a
>>> toolkit for building all sorts of platforms: you can chop and change
>>> the CPUs, the peripherals, the memory, the interrupt controller, the
>>> interconnect etc. Pretty much everything can be customised. So, for
>>> any fastmodel configuration that places virtio upstream of the SMMU
>>> (which is common, because virtio is one of the few DMA-capable peripherals
>>> that the fastmodel supports), we need to do something special.
>>>
>>>> I'd like to see basically
>>>>
>>>> if (fastmodel)
>>>> 	a pile of special work-arounds
>>>> else
>>>> 	not less hacky but more common virtio work-arounds
>>>>
>>>> :)
>>>>
>>>> And then I can apply whatever comes from @arm.com and not
>>>> worry about breaking actual hardware.
>>>
>>> What we could do is call iommu_group_get(&vdev->dev) for legacy
>>
>> Actually, that should be vdev->dev.parent - I'm now not sure quite what
>> I managed to successfully test yesterday, but apparently it wasn't this
>> patch :(
>>
>>> devices if CONFIG_ARM64. If that returns non-NULL, then we know that
>>> the device is upstream of an SMMU, which means it must be the fastmodel.
>>
>> We can boot 32-bit kernels on models, so I'd be inclined to keep
>> CONFIG_ARM included, but I do tend to agree that explicitly checking for
>> an IOMMU is probably the cleanest approach if we reposition this as a
>> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
>> "uh-oh device coherency" aspects and spin a v2 so that we can
>> (hopefully) sort the regression out ASAP.
>>
>> Robin.
>>
>>>
>>> Will
>>>
> 
> I still think the right thing to do for this platform is to add an API
> for driver to say "disable protection for this device and allow
> this device 1:1 access to all memory".  This
> would make both platforms which bypass the iommu and platforms that
> don't happy as PA == dma address.

Hi Michael,

What would this API be? A command-line option? A magic DT property? Yet
another ACPI bodge? Given the type of HW platform we're looking at,
changing the firmware to expose a new property is unlikely to be practical.

My point is: we have all the required information in the kernel to do
the right thing without asking the user to change anything in their
existing setup. With this (admittedly unpleasant) change, we can make
things work for both IOMMU and non-IOMMU setups, dma-coherent or not.
And frankly, it doesn't look much worse than the Xen thing that sits a
few lines above.

Am I missing something more obvious than "you should use a flying car
instead"? ;-)

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 16:16                     ` Michael S. Tsirkin
  (?)
@ 2017-02-02 16:39                     ` Marc Zyngier
  -1 siblings, 0 replies; 59+ messages in thread
From: Marc Zyngier @ 2017-02-02 16:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, Robin Murphy
  Cc: mark.rutland, devicetree, pawel.moll, Will Deacon,
	virtualization, robh+dt, linux-arm-kernel

On 02/02/17 16:16, Michael S. Tsirkin wrote:
> On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
>> On 02/02/17 11:26, Will Deacon wrote:
>>> On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
>>>> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
>>>>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
>>>>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
>>>>>> here and deviating from what every other legacy device without exception
>>>>>> did for years. If this means fastboot will need to update to virtio 1,
>>>>>> all the better.
>>>>>
>>>>> The problem still exists with virtio 1, unless we require that the
>>>>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
>>>>> is advertised by the device (which is what I suggested in my reply).
>>>>
>>>> I'm not ignoring that, but I need to understand that part a bit better.
>>>> I'll reply to that patch in a day or two after looking at how _CCA is
>>>> supposed to work.
>>>
>>> Thanks. I do think that whatever solution we come up with for virtio 1
>>> should influence what we do for legacy.
>>>
>>>>> We can't detect the fastmodel,
>>>>
>>>> Surely, it puts a hardware id somewhere? I think you mean
>>>> fastmodel isn't always affected, right?
>>>
>>> I don't think there's a hardware ID. The thing is, the fastmodel is a
>>> toolkit for building all sorts of platforms: you can chop and change
>>> the CPUs, the peripherals, the memory, the interrupt controller, the
>>> interconnect etc. Pretty much everything can be customised. So, for
>>> any fastmodel configuration that places virtio upstream of the SMMU
>>> (which is common, because virtio is one of the few DMA-capable peripherals
>>> that the fastmodel supports), we need to do something special.
>>>
>>>> I'd like to see basically
>>>>
>>>> if (fastmodel)
>>>> 	a pile of special work-arounds
>>>> else
>>>> 	not less hacky but more common virtio work-arounds
>>>>
>>>> :)
>>>>
>>>> And then I can apply whatever comes from @arm.com and not
>>>> worry about breaking actual hardware.
>>>
>>> What we could do is call iommu_group_get(&vdev->dev) for legacy
>>
>> Actually, that should be vdev->dev.parent - I'm now not sure quite what
>> I managed to successfully test yesterday, but apparently it wasn't this
>> patch :(
>>
>>> devices if CONFIG_ARM64. If that returns non-NULL, then we know that
>>> the device is upstream of an SMMU, which means it must be the fastmodel.
>>
>> We can boot 32-bit kernels on models, so I'd be inclined to keep
>> CONFIG_ARM included, but I do tend to agree that explicitly checking for
>> an IOMMU is probably the cleanest approach if we reposition this as a
>> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
>> "uh-oh device coherency" aspects and spin a v2 so that we can
>> (hopefully) sort the regression out ASAP.
>>
>> Robin.
>>
>>>
>>> Will
>>>
> 
> I still think the right thing to do for this platform is to add an API
> for driver to say "disable protection for this device and allow
> this device 1:1 access to all memory".  This
> would make both platforms which bypass the iommu and platforms that
> don't happy as PA == dma address.

Hi Michael,

What would this API be? A command-line option? A magic DT property? Yet
another ACPI bodge? Given the type of HW platform we're looking at,
changing the firmware to expose a new property is unlikely to be practical.

My point is: we have all the required information in the kernel to do
the right thing without asking the user to change anything in their
existing setup. With this (admittedly unpleasant) change, we can make
things work for both IOMMU and non-IOMMU setups, dma-coherent or not.
And frankly, it doesn't look much worse than the Xen thing that sits a
few lines above.

Am I missing something more obvious than "you should use a flying car
instead"? ;-)

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-02 16:39                         ` Marc Zyngier
  0 siblings, 0 replies; 59+ messages in thread
From: Marc Zyngier @ 2017-02-02 16:39 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/02/17 16:16, Michael S. Tsirkin wrote:
> On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
>> On 02/02/17 11:26, Will Deacon wrote:
>>> On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
>>>> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
>>>>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
>>>>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
>>>>>> here and deviating from what every other legacy device without exception
>>>>>> did for years. If this means fastboot will need to update to virtio 1,
>>>>>> all the better.
>>>>>
>>>>> The problem still exists with virtio 1, unless we require that the
>>>>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
>>>>> is advertised by the device (which is what I suggested in my reply).
>>>>
>>>> I'm not ignoring that, but I need to understand that part a bit better.
>>>> I'll reply to that patch in a day or two after looking at how _CCA is
>>>> supposed to work.
>>>
>>> Thanks. I do think that whatever solution we come up with for virtio 1
>>> should influence what we do for legacy.
>>>
>>>>> We can't detect the fastmodel,
>>>>
>>>> Surely, it puts a hardware id somewhere? I think you mean
>>>> fastmodel isn't always affected, right?
>>>
>>> I don't think there's a hardware ID. The thing is, the fastmodel is a
>>> toolkit for building all sorts of platforms: you can chop and change
>>> the CPUs, the peripherals, the memory, the interrupt controller, the
>>> interconnect etc. Pretty much everything can be customised. So, for
>>> any fastmodel configuration that places virtio upstream of the SMMU
>>> (which is common, because virtio is one of the few DMA-capable peripherals
>>> that the fastmodel supports), we need to do something special.
>>>
>>>> I'd like to see basically
>>>>
>>>> if (fastmodel)
>>>> 	a pile of special work-arounds
>>>> else
>>>> 	not less hacky but more common virtio work-arounds
>>>>
>>>> :)
>>>>
>>>> And then I can apply whatever comes from @arm.com and not
>>>> worry about breaking actual hardware.
>>>
>>> What we could do is call iommu_group_get(&vdev->dev) for legacy
>>
>> Actually, that should be vdev->dev.parent - I'm now not sure quite what
>> I managed to successfully test yesterday, but apparently it wasn't this
>> patch :(
>>
>>> devices if CONFIG_ARM64. If that returns non-NULL, then we know that
>>> the device is upstream of an SMMU, which means it must be the fastmodel.
>>
>> We can boot 32-bit kernels on models, so I'd be inclined to keep
>> CONFIG_ARM included, but I do tend to agree that explicitly checking for
>> an IOMMU is probably the cleanest approach if we reposition this as a
>> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
>> "uh-oh device coherency" aspects and spin a v2 so that we can
>> (hopefully) sort the regression out ASAP.
>>
>> Robin.
>>
>>>
>>> Will
>>>
> 
> I still think the right thing to do for this platform is to add an API
> for driver to say "disable protection for this device and allow
> this device 1:1 access to all memory".  This
> would make both platforms which bypass the iommu and platforms that
> don't happy as PA == dma address.

Hi Michael,

What would this API be? A command-line option? A magic DT property? Yet
another ACPI bodge? Given the type of HW platform we're looking at,
changing the firmware to expose a new property is unlikely to be practical.

My point is: we have all the required information in the kernel to do
the right thing without asking the user to change anything in their
existing setup. With this (admittedly unpleasant) change, we can make
things work for both IOMMU and non-IOMMU setups, dma-coherent or not.
And frankly, it doesn't look much worse than the Xen thing that sits a
few lines above.

Am I missing something more obvious than "you should use a flying car
instead"? ;-)

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 16:30                     ` Michael S. Tsirkin
@ 2017-02-02 16:40                       ` Will Deacon
  -1 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-02 16:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> I am inclined to say, for 4.10 let's revert
> c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> regression in 4.10.

No complaints there, as long as we can keep working to fix this for 4.11
and onwards. You'll also need to cc stable on the revert.

> So I think we can defer the fix to 4.11.
> I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> for hosts with virtio 1 support.
> 
> All this will hopefully push hosts to just implement virtio 1.
> For mmio the changes are very small: several new registers,
> that's all. You want this for proper 64 bit dma mask anyway.

As I've said, virtio 1 will have exactly the same issue unless we start
requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
devices correctly.

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-02 16:40                       ` Will Deacon
  0 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-02 16:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> I am inclined to say, for 4.10 let's revert
> c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> regression in 4.10.

No complaints there, as long as we can keep working to fix this for 4.11
and onwards. You'll also need to cc stable on the revert.

> So I think we can defer the fix to 4.11.
> I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> for hosts with virtio 1 support.
> 
> All this will hopefully push hosts to just implement virtio 1.
> For mmio the changes are very small: several new registers,
> that's all. You want this for proper 64 bit dma mask anyway.

As I've said, virtio 1 will have exactly the same issue unless we start
requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
devices correctly.

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 16:39                         ` Marc Zyngier
@ 2017-02-02 16:44                             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:44 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Robin Murphy, mark.rutland-5wv7dgnIgG8,
	devicetree-u79uwXL29TY76Z2rM5mHXA, pawel.moll-5wv7dgnIgG8,
	Will Deacon,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Thu, Feb 02, 2017 at 04:39:35PM +0000, Marc Zyngier wrote:
> On 02/02/17 16:16, Michael S. Tsirkin wrote:
> > On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
> >> On 02/02/17 11:26, Will Deacon wrote:
> >>> On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> >>>> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> >>>>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> >>>>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
> >>>>>> here and deviating from what every other legacy device without exception
> >>>>>> did for years. If this means fastboot will need to update to virtio 1,
> >>>>>> all the better.
> >>>>>
> >>>>> The problem still exists with virtio 1, unless we require that the
> >>>>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> >>>>> is advertised by the device (which is what I suggested in my reply).
> >>>>
> >>>> I'm not ignoring that, but I need to understand that part a bit better.
> >>>> I'll reply to that patch in a day or two after looking at how _CCA is
> >>>> supposed to work.
> >>>
> >>> Thanks. I do think that whatever solution we come up with for virtio 1
> >>> should influence what we do for legacy.
> >>>
> >>>>> We can't detect the fastmodel,
> >>>>
> >>>> Surely, it puts a hardware id somewhere? I think you mean
> >>>> fastmodel isn't always affected, right?
> >>>
> >>> I don't think there's a hardware ID. The thing is, the fastmodel is a
> >>> toolkit for building all sorts of platforms: you can chop and change
> >>> the CPUs, the peripherals, the memory, the interrupt controller, the
> >>> interconnect etc. Pretty much everything can be customised. So, for
> >>> any fastmodel configuration that places virtio upstream of the SMMU
> >>> (which is common, because virtio is one of the few DMA-capable peripherals
> >>> that the fastmodel supports), we need to do something special.
> >>>
> >>>> I'd like to see basically
> >>>>
> >>>> if (fastmodel)
> >>>> 	a pile of special work-arounds
> >>>> else
> >>>> 	not less hacky but more common virtio work-arounds
> >>>>
> >>>> :)
> >>>>
> >>>> And then I can apply whatever comes from @arm.com and not
> >>>> worry about breaking actual hardware.
> >>>
> >>> What we could do is call iommu_group_get(&vdev->dev) for legacy
> >>
> >> Actually, that should be vdev->dev.parent - I'm now not sure quite what
> >> I managed to successfully test yesterday, but apparently it wasn't this
> >> patch :(
> >>
> >>> devices if CONFIG_ARM64. If that returns non-NULL, then we know that
> >>> the device is upstream of an SMMU, which means it must be the fastmodel.
> >>
> >> We can boot 32-bit kernels on models, so I'd be inclined to keep
> >> CONFIG_ARM included, but I do tend to agree that explicitly checking for
> >> an IOMMU is probably the cleanest approach if we reposition this as a
> >> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
> >> "uh-oh device coherency" aspects and spin a v2 so that we can
> >> (hopefully) sort the regression out ASAP.
> >>
> >> Robin.
> >>
> >>>
> >>> Will
> >>>
> > 
> > I still think the right thing to do for this platform is to add an API
> > for driver to say "disable protection for this device and allow
> > this device 1:1 access to all memory".  This
> > would make both platforms which bypass the iommu and platforms that
> > don't happy as PA == dma address.
> 
> Hi Michael,
> 
> What would this API be? A command-line option? A magic DT property? Yet
> another ACPI bodge? Given the type of HW platform we're looking at,
> changing the firmware to expose a new property is unlikely to be practical.

No - I mean an internal kernel API that the legacy driver can call.

> My point is: we have all the required information in the kernel to do
> the right thing without asking the user to change anything in their
> existing setup. With this (admittedly unpleasant) change, we can make
> things work for both IOMMU and non-IOMMU setups, dma-coherent or not.
> And frankly, it doesn't look much worse than the Xen thing that sits a
> few lines above.
> 
> Am I missing something more obvious than "you should use a flying car
> instead"? ;-)
> 
> Thanks,
> 
> 	M.

I agree we should try to do the right thing. Using an MMU to
protect against a hypervisor is a futile effort though.
It simply makes sense to allow virtio access to all of memory.
virtio 1 has a flag to control that to handle weird corner cases.


> -- 
> Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 16:39                         ` Marc Zyngier
  (?)
@ 2017-02-02 16:44                         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:44 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: mark.rutland, devicetree, pawel.moll, Will Deacon,
	virtualization, robh+dt, Robin Murphy, linux-arm-kernel

On Thu, Feb 02, 2017 at 04:39:35PM +0000, Marc Zyngier wrote:
> On 02/02/17 16:16, Michael S. Tsirkin wrote:
> > On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
> >> On 02/02/17 11:26, Will Deacon wrote:
> >>> On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> >>>> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> >>>>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> >>>>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
> >>>>>> here and deviating from what every other legacy device without exception
> >>>>>> did for years. If this means fastboot will need to update to virtio 1,
> >>>>>> all the better.
> >>>>>
> >>>>> The problem still exists with virtio 1, unless we require that the
> >>>>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> >>>>> is advertised by the device (which is what I suggested in my reply).
> >>>>
> >>>> I'm not ignoring that, but I need to understand that part a bit better.
> >>>> I'll reply to that patch in a day or two after looking at how _CCA is
> >>>> supposed to work.
> >>>
> >>> Thanks. I do think that whatever solution we come up with for virtio 1
> >>> should influence what we do for legacy.
> >>>
> >>>>> We can't detect the fastmodel,
> >>>>
> >>>> Surely, it puts a hardware id somewhere? I think you mean
> >>>> fastmodel isn't always affected, right?
> >>>
> >>> I don't think there's a hardware ID. The thing is, the fastmodel is a
> >>> toolkit for building all sorts of platforms: you can chop and change
> >>> the CPUs, the peripherals, the memory, the interrupt controller, the
> >>> interconnect etc. Pretty much everything can be customised. So, for
> >>> any fastmodel configuration that places virtio upstream of the SMMU
> >>> (which is common, because virtio is one of the few DMA-capable peripherals
> >>> that the fastmodel supports), we need to do something special.
> >>>
> >>>> I'd like to see basically
> >>>>
> >>>> if (fastmodel)
> >>>> 	a pile of special work-arounds
> >>>> else
> >>>> 	not less hacky but more common virtio work-arounds
> >>>>
> >>>> :)
> >>>>
> >>>> And then I can apply whatever comes from @arm.com and not
> >>>> worry about breaking actual hardware.
> >>>
> >>> What we could do is call iommu_group_get(&vdev->dev) for legacy
> >>
> >> Actually, that should be vdev->dev.parent - I'm now not sure quite what
> >> I managed to successfully test yesterday, but apparently it wasn't this
> >> patch :(
> >>
> >>> devices if CONFIG_ARM64. If that returns non-NULL, then we know that
> >>> the device is upstream of an SMMU, which means it must be the fastmodel.
> >>
> >> We can boot 32-bit kernels on models, so I'd be inclined to keep
> >> CONFIG_ARM included, but I do tend to agree that explicitly checking for
> >> an IOMMU is probably the cleanest approach if we reposition this as a
> >> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
> >> "uh-oh device coherency" aspects and spin a v2 so that we can
> >> (hopefully) sort the regression out ASAP.
> >>
> >> Robin.
> >>
> >>>
> >>> Will
> >>>
> > 
> > I still think the right thing to do for this platform is to add an API
> > for driver to say "disable protection for this device and allow
> > this device 1:1 access to all memory".  This
> > would make both platforms which bypass the iommu and platforms that
> > don't happy as PA == dma address.
> 
> Hi Michael,
> 
> What would this API be? A command-line option? A magic DT property? Yet
> another ACPI bodge? Given the type of HW platform we're looking at,
> changing the firmware to expose a new property is unlikely to be practical.

No - I mean an internal kernel API that the legacy driver can call.

> My point is: we have all the required information in the kernel to do
> the right thing without asking the user to change anything in their
> existing setup. With this (admittedly unpleasant) change, we can make
> things work for both IOMMU and non-IOMMU setups, dma-coherent or not.
> And frankly, it doesn't look much worse than the Xen thing that sits a
> few lines above.
> 
> Am I missing something more obvious than "you should use a flying car
> instead"? ;-)
> 
> Thanks,
> 
> 	M.

I agree we should try to do the right thing. Using an MMU to
protect against a hypervisor is a futile effort though.
It simply makes sense to allow virtio access to all of memory.
virtio 1 has a flag to control that to handle weird corner cases.


> -- 
> Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-02 16:44                             ` Michael S. Tsirkin
  0 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 02, 2017 at 04:39:35PM +0000, Marc Zyngier wrote:
> On 02/02/17 16:16, Michael S. Tsirkin wrote:
> > On Thu, Feb 02, 2017 at 01:34:03PM +0000, Robin Murphy wrote:
> >> On 02/02/17 11:26, Will Deacon wrote:
> >>> On Wed, Feb 01, 2017 at 09:19:22PM +0200, Michael S. Tsirkin wrote:
> >>>> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> >>>>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> >>>>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
> >>>>>> here and deviating from what every other legacy device without exception
> >>>>>> did for years. If this means fastboot will need to update to virtio 1,
> >>>>>> all the better.
> >>>>>
> >>>>> The problem still exists with virtio 1, unless we require that the
> >>>>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> >>>>> is advertised by the device (which is what I suggested in my reply).
> >>>>
> >>>> I'm not ignoring that, but I need to understand that part a bit better.
> >>>> I'll reply to that patch in a day or two after looking at how _CCA is
> >>>> supposed to work.
> >>>
> >>> Thanks. I do think that whatever solution we come up with for virtio 1
> >>> should influence what we do for legacy.
> >>>
> >>>>> We can't detect the fastmodel,
> >>>>
> >>>> Surely, it puts a hardware id somewhere? I think you mean
> >>>> fastmodel isn't always affected, right?
> >>>
> >>> I don't think there's a hardware ID. The thing is, the fastmodel is a
> >>> toolkit for building all sorts of platforms: you can chop and change
> >>> the CPUs, the peripherals, the memory, the interrupt controller, the
> >>> interconnect etc. Pretty much everything can be customised. So, for
> >>> any fastmodel configuration that places virtio upstream of the SMMU
> >>> (which is common, because virtio is one of the few DMA-capable peripherals
> >>> that the fastmodel supports), we need to do something special.
> >>>
> >>>> I'd like to see basically
> >>>>
> >>>> if (fastmodel)
> >>>> 	a pile of special work-arounds
> >>>> else
> >>>> 	not less hacky but more common virtio work-arounds
> >>>>
> >>>> :)
> >>>>
> >>>> And then I can apply whatever comes from @arm.com and not
> >>>> worry about breaking actual hardware.
> >>>
> >>> What we could do is call iommu_group_get(&vdev->dev) for legacy
> >>
> >> Actually, that should be vdev->dev.parent - I'm now not sure quite what
> >> I managed to successfully test yesterday, but apparently it wasn't this
> >> patch :(
> >>
> >>> devices if CONFIG_ARM64. If that returns non-NULL, then we know that
> >>> the device is upstream of an SMMU, which means it must be the fastmodel.
> >>
> >> We can boot 32-bit kernels on models, so I'd be inclined to keep
> >> CONFIG_ARM included, but I do tend to agree that explicitly checking for
> >> an IOMMU is probably the cleanest approach if we reposition this as a
> >> more specific quirk. I'll split apart the "Fast Models are wacky" vs.
> >> "uh-oh device coherency" aspects and spin a v2 so that we can
> >> (hopefully) sort the regression out ASAP.
> >>
> >> Robin.
> >>
> >>>
> >>> Will
> >>>
> > 
> > I still think the right thing to do for this platform is to add an API
> > for driver to say "disable protection for this device and allow
> > this device 1:1 access to all memory".  This
> > would make both platforms which bypass the iommu and platforms that
> > don't happy as PA == dma address.
> 
> Hi Michael,
> 
> What would this API be? A command-line option? A magic DT property? Yet
> another ACPI bodge? Given the type of HW platform we're looking at,
> changing the firmware to expose a new property is unlikely to be practical.

No - I mean an internal kernel API that the legacy driver can call.

> My point is: we have all the required information in the kernel to do
> the right thing without asking the user to change anything in their
> existing setup. With this (admittedly unpleasant) change, we can make
> things work for both IOMMU and non-IOMMU setups, dma-coherent or not.
> And frankly, it doesn't look much worse than the Xen thing that sits a
> few lines above.
> 
> Am I missing something more obvious than "you should use a flying car
> instead"? ;-)
> 
> Thanks,
> 
> 	M.

I agree we should try to do the right thing. Using an MMU to
protect against a hypervisor is a futile effort though.
It simply makes sense to allow virtio access to all of memory.
virtio 1 has a flag to control that to handle weird corner cases.


> -- 
> Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 16:40                       ` Will Deacon
@ 2017-02-02 16:45                           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:45 UTC (permalink / raw)
  To: Will Deacon
  Cc: Robin Murphy, jasowang-H+wXaHxf7aLQT0dZR+AlfA,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	pawel.moll-5wv7dgnIgG8

On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
> On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> > I am inclined to say, for 4.10 let's revert
> > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> > regression in 4.10.
> 
> No complaints there, as long as we can keep working to fix this for 4.11
> and onwards. You'll also need to cc stable on the revert.
> 
> > So I think we can defer the fix to 4.11.
> > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> > for hosts with virtio 1 support.
> > 
> > All this will hopefully push hosts to just implement virtio 1.
> > For mmio the changes are very small: several new registers,
> > that's all. You want this for proper 64 bit dma mask anyway.
> 
> As I've said, virtio 1 will have exactly the same issue unless we start
> requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
> devices correctly.
> 
> Will

I mostly agree with that, just didn't have time to read up on _CCA yet.

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 16:40                       ` Will Deacon
  (?)
@ 2017-02-02 16:45                       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:45 UTC (permalink / raw)
  To: Will Deacon
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
> On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> > I am inclined to say, for 4.10 let's revert
> > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> > regression in 4.10.
> 
> No complaints there, as long as we can keep working to fix this for 4.11
> and onwards. You'll also need to cc stable on the revert.
> 
> > So I think we can defer the fix to 4.11.
> > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> > for hosts with virtio 1 support.
> > 
> > All this will hopefully push hosts to just implement virtio 1.
> > For mmio the changes are very small: several new registers,
> > that's all. You want this for proper 64 bit dma mask anyway.
> 
> As I've said, virtio 1 will have exactly the same issue unless we start
> requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
> devices correctly.
> 
> Will

I mostly agree with that, just didn't have time to read up on _CCA yet.

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-02 16:45                           ` Michael S. Tsirkin
  0 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-02 16:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
> On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> > I am inclined to say, for 4.10 let's revert
> > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> > regression in 4.10.
> 
> No complaints there, as long as we can keep working to fix this for 4.11
> and onwards. You'll also need to cc stable on the revert.
> 
> > So I think we can defer the fix to 4.11.
> > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> > for hosts with virtio 1 support.
> > 
> > All this will hopefully push hosts to just implement virtio 1.
> > For mmio the changes are very small: several new registers,
> > that's all. You want this for proper 64 bit dma mask anyway.
> 
> As I've said, virtio 1 will have exactly the same issue unless we start
> requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
> devices correctly.
> 
> Will

I mostly agree with that, just didn't have time to read up on _CCA yet.

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-01 19:19           ` Michael S. Tsirkin
@ 2017-02-08 12:58             ` Alexander Graf
  -1 siblings, 0 replies; 59+ messages in thread
From: Alexander Graf @ 2017-02-08 12:58 UTC (permalink / raw)
  To: Michael Tsirkin, Will Deacon
  Cc: Mark Rutland, rob.herring, devicetree, linux-arm-kernel, virtualization

On 02/01/2017 08:19 PM, Michael S. Tsirkin wrote:
> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
>>> On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
>>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>>> index 7e38ed79c3fc..961af25b385c 100644
>>>> --- a/drivers/virtio/virtio_ring.c
>>>> +++ b/drivers/virtio/virtio_ring.c
>>>> @@ -20,6 +20,7 @@
>>>>  #include <linux/virtio_ring.h>
>>>>  #include <linux/virtio_config.h>
>>>>  #include <linux/device.h>
>>>> +#include <linux/property.h>
>>>>  #include <linux/slab.h>
>>>>  #include <linux/module.h>
>>>>  #include <linux/hrtimer.h>
>>>> @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>>>>  		return true;
>>>>
>>>>  	/*
>>>> -	 * On ARM-based machines, the DMA ops will do the right thing,
>>>> -	 * so always use them with legacy devices.
>>>> +	 * On ARM-based machines, the coherent DMA ops will do the right
>>>> +	 * thing, so always use them with legacy devices. However, using
>>>> +	 * non-coherent DMA when the host *is* actually coherent, but has
>>>> +	 * forgotten to tell us, is going to break badly; since this situation
>>>> +	 * already exists in the wild, maintain the old behaviour there.
>>>>  	 */
>>>> -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
>>>> +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
>>>> +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
>>>>  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
>>>>
>>>>  	return false;
>>>
>>> This is exactly what I feared.
>>
>> Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
>> is used) and it also works on the fastmodel if you disable cache-modelling
>> (which is needed to make the thing run at a usable pace) so we didn't spot
>> this in testing.
>>
>>> Could we identify fastboot and do the special dance just for it?
>>
>> [assuming you mean fastmodel instead of fastboot]
>>
>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
>>> here and deviating from what every other legacy device without exception
>>> did for years. If this means fastboot will need to update to virtio 1,
>>> all the better.
>>
>> The problem still exists with virtio 1, unless we require that the
>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
>> is advertised by the device (which is what I suggested in my reply).
>
> I'm not ignoring that, but I need to understand that part a bit better.
> I'll reply to that patch in a day or two after looking at how _CCA is
> supposed to work.
>
>> We can't detect the fastmodel,
>
> Surely, it puts a hardware id somewhere? I think you mean
> fastmodel isn't always affected, right?
>
>> but we could implicitly treat virtio-mmio
>> devices as cache-coherent regardless of the "dma-coherent" flag. I already
>> prototyped this, but I suspect the devicetree people will push back (and
>> there's a similar patch needed for ACPI).
>>
>> See below. Do you prefer this approach?
>>
>> Will
>>
>> --->8
>
> I'd like to see basically
>
> if (fastmodel)
> 	a pile of special work-arounds
> else
> 	not less hacky but more common virtio work-arounds
>
> :)
>
> And then I can apply whatever comes from @arm.com and not
> worry about breaking actual hardware.

I'm actually seeing the exact same breakage in QEMU right now, so it's 
not fast model related at all. In QEMU we also don't properly set the 
dma-coherent flag, so we run into cache coherency problems.


Alex

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-08 12:58             ` Alexander Graf
  0 siblings, 0 replies; 59+ messages in thread
From: Alexander Graf @ 2017-02-08 12:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/01/2017 08:19 PM, Michael S. Tsirkin wrote:
> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
>>> On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
>>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>>> index 7e38ed79c3fc..961af25b385c 100644
>>>> --- a/drivers/virtio/virtio_ring.c
>>>> +++ b/drivers/virtio/virtio_ring.c
>>>> @@ -20,6 +20,7 @@
>>>>  #include <linux/virtio_ring.h>
>>>>  #include <linux/virtio_config.h>
>>>>  #include <linux/device.h>
>>>> +#include <linux/property.h>
>>>>  #include <linux/slab.h>
>>>>  #include <linux/module.h>
>>>>  #include <linux/hrtimer.h>
>>>> @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>>>>  		return true;
>>>>
>>>>  	/*
>>>> -	 * On ARM-based machines, the DMA ops will do the right thing,
>>>> -	 * so always use them with legacy devices.
>>>> +	 * On ARM-based machines, the coherent DMA ops will do the right
>>>> +	 * thing, so always use them with legacy devices. However, using
>>>> +	 * non-coherent DMA when the host *is* actually coherent, but has
>>>> +	 * forgotten to tell us, is going to break badly; since this situation
>>>> +	 * already exists in the wild, maintain the old behaviour there.
>>>>  	 */
>>>> -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
>>>> +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
>>>> +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
>>>>  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
>>>>
>>>>  	return false;
>>>
>>> This is exactly what I feared.
>>
>> Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
>> is used) and it also works on the fastmodel if you disable cache-modelling
>> (which is needed to make the thing run at a usable pace) so we didn't spot
>> this in testing.
>>
>>> Could we identify fastboot and do the special dance just for it?
>>
>> [assuming you mean fastmodel instead of fastboot]
>>
>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
>>> here and deviating from what every other legacy device without exception
>>> did for years. If this means fastboot will need to update to virtio 1,
>>> all the better.
>>
>> The problem still exists with virtio 1, unless we require that the
>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
>> is advertised by the device (which is what I suggested in my reply).
>
> I'm not ignoring that, but I need to understand that part a bit better.
> I'll reply to that patch in a day or two after looking at how _CCA is
> supposed to work.
>
>> We can't detect the fastmodel,
>
> Surely, it puts a hardware id somewhere? I think you mean
> fastmodel isn't always affected, right?
>
>> but we could implicitly treat virtio-mmio
>> devices as cache-coherent regardless of the "dma-coherent" flag. I already
>> prototyped this, but I suspect the devicetree people will push back (and
>> there's a similar patch needed for ACPI).
>>
>> See below. Do you prefer this approach?
>>
>> Will
>>
>> --->8
>
> I'd like to see basically
>
> if (fastmodel)
> 	a pile of special work-arounds
> else
> 	not less hacky but more common virtio work-arounds
>
> :)
>
> And then I can apply whatever comes from @arm.com and not
> worry about breaking actual hardware.

I'm actually seeing the exact same breakage in QEMU right now, so it's 
not fast model related at all. In QEMU we also don't properly set the 
dma-coherent flag, so we run into cache coherency problems.


Alex

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-02 16:40                       ` Will Deacon
@ 2017-02-09 18:17                         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-09 18:17 UTC (permalink / raw)
  To: Will Deacon
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
> On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> > I am inclined to say, for 4.10 let's revert
> > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> > regression in 4.10.
> 
> No complaints there, as long as we can keep working to fix this for 4.11
> and onwards. You'll also need to cc stable on the revert.
> 
> > So I think we can defer the fix to 4.11.
> > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> > for hosts with virtio 1 support.
> > 
> > All this will hopefully push hosts to just implement virtio 1.
> > For mmio the changes are very small: several new registers,
> > that's all. You want this for proper 64 bit dma mask anyway.
> 
> As I've said, virtio 1 will have exactly the same issue unless we start
> requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
> devices correctly.
> 
> Will

OK I read up on _CCA in ACPI spec. It says:
The _CCA object returns whether or not a bus-master device supports
hardware managed cache coherency. Expected values are 0 to indicate it
is not supported, and 1 to indicate that it is supported.

So if host is cache coherent, and guest thinks it isn't, we incur
unnecessary overhead by wasting coherent memory.
I get that but you said it actually breaks - why does it?

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-09 18:17                         ` Michael S. Tsirkin
  0 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-09 18:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
> On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> > I am inclined to say, for 4.10 let's revert
> > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> > regression in 4.10.
> 
> No complaints there, as long as we can keep working to fix this for 4.11
> and onwards. You'll also need to cc stable on the revert.
> 
> > So I think we can defer the fix to 4.11.
> > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> > for hosts with virtio 1 support.
> > 
> > All this will hopefully push hosts to just implement virtio 1.
> > For mmio the changes are very small: several new registers,
> > that's all. You want this for proper 64 bit dma mask anyway.
> 
> As I've said, virtio 1 will have exactly the same issue unless we start
> requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
> devices correctly.
> 
> Will

OK I read up on _CCA in ACPI spec. It says:
The _CCA object returns whether or not a bus-master device supports
hardware managed cache coherency. Expected values are 0 to indicate it
is not supported, and 1 to indicate that it is supported.

So if host is cache coherent, and guest thinks it isn't, we incur
unnecessary overhead by wasting coherent memory.
I get that but you said it actually breaks - why does it?

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-09 18:17                         ` Michael S. Tsirkin
@ 2017-02-09 18:31                           ` Will Deacon
  -1 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-09 18:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Thu, Feb 09, 2017 at 08:17:16PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
> > On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> > > I am inclined to say, for 4.10 let's revert
> > > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> > > regression in 4.10.
> > 
> > No complaints there, as long as we can keep working to fix this for 4.11
> > and onwards. You'll also need to cc stable on the revert.
> > 
> > > So I think we can defer the fix to 4.11.
> > > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> > > for hosts with virtio 1 support.
> > > 
> > > All this will hopefully push hosts to just implement virtio 1.
> > > For mmio the changes are very small: several new registers,
> > > that's all. You want this for proper 64 bit dma mask anyway.
> > 
> > As I've said, virtio 1 will have exactly the same issue unless we start
> > requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
> > devices correctly.
> > 
> 
> OK I read up on _CCA in ACPI spec. It says:
> The _CCA object returns whether or not a bus-master device supports
> hardware managed cache coherency. Expected values are 0 to indicate it
> is not supported, and 1 to indicate that it is supported.
> 
> So if host is cache coherent, and guest thinks it isn't, we incur
> unnecessary overhead by wasting coherent memory.
> I get that but you said it actually breaks - why does it?

It breaks because QEMU doesn't set _CCA for virtio-mmio devices, and that
only becomes a problem when we use the DMA API, because that results in the
guest taking out a non-cacheable mapping. On ARM (and other archs such as
Power), having a mismatch between a cacheable and a non-cacheable mapping
can result in a loss of coherency between the two (for example, if the
non-cacheable gues accesses bypass the cache, but the cacheable host
accesses allocate in the cache).

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-09 18:31                           ` Will Deacon
  0 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-09 18:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 09, 2017 at 08:17:16PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
> > On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> > > I am inclined to say, for 4.10 let's revert
> > > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> > > regression in 4.10.
> > 
> > No complaints there, as long as we can keep working to fix this for 4.11
> > and onwards. You'll also need to cc stable on the revert.
> > 
> > > So I think we can defer the fix to 4.11.
> > > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> > > for hosts with virtio 1 support.
> > > 
> > > All this will hopefully push hosts to just implement virtio 1.
> > > For mmio the changes are very small: several new registers,
> > > that's all. You want this for proper 64 bit dma mask anyway.
> > 
> > As I've said, virtio 1 will have exactly the same issue unless we start
> > requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
> > devices correctly.
> > 
> 
> OK I read up on _CCA in ACPI spec. It says:
> The _CCA object returns whether or not a bus-master device supports
> hardware managed cache coherency. Expected values are 0 to indicate it
> is not supported, and 1 to indicate that it is supported.
> 
> So if host is cache coherent, and guest thinks it isn't, we incur
> unnecessary overhead by wasting coherent memory.
> I get that but you said it actually breaks - why does it?

It breaks because QEMU doesn't set _CCA for virtio-mmio devices, and that
only becomes a problem when we use the DMA API, because that results in the
guest taking out a non-cacheable mapping. On ARM (and other archs such as
Power), having a mismatch between a cacheable and a non-cacheable mapping
can result in a loss of coherency between the two (for example, if the
non-cacheable gues accesses bypass the cache, but the cacheable host
accesses allocate in the cache).

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-09 18:31                           ` Will Deacon
@ 2017-02-09 18:49                             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-09 18:49 UTC (permalink / raw)
  To: Will Deacon
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Thu, Feb 09, 2017 at 06:31:18PM +0000, Will Deacon wrote:
> On Thu, Feb 09, 2017 at 08:17:16PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
> > > On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> > > > I am inclined to say, for 4.10 let's revert
> > > > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> > > > regression in 4.10.
> > > 
> > > No complaints there, as long as we can keep working to fix this for 4.11
> > > and onwards. You'll also need to cc stable on the revert.
> > > 
> > > > So I think we can defer the fix to 4.11.
> > > > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> > > > for hosts with virtio 1 support.
> > > > 
> > > > All this will hopefully push hosts to just implement virtio 1.
> > > > For mmio the changes are very small: several new registers,
> > > > that's all. You want this for proper 64 bit dma mask anyway.
> > > 
> > > As I've said, virtio 1 will have exactly the same issue unless we start
> > > requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
> > > devices correctly.
> > > 
> > 
> > OK I read up on _CCA in ACPI spec. It says:
> > The _CCA object returns whether or not a bus-master device supports
> > hardware managed cache coherency. Expected values are 0 to indicate it
> > is not supported, and 1 to indicate that it is supported.
> > 
> > So if host is cache coherent, and guest thinks it isn't, we incur
> > unnecessary overhead by wasting coherent memory.
> > I get that but you said it actually breaks - why does it?
> 
> It breaks because QEMU doesn't set _CCA for virtio-mmio devices, and that
> only becomes a problem when we use the DMA API, because that results in the
> guest taking out a non-cacheable mapping. On ARM (and other archs such as
> Power), having a mismatch between a cacheable and a non-cacheable mapping
> can result in a loss of coherency between the two (for example, if the
> non-cacheable gues accesses bypass the cache, but the cacheable host
> accesses allocate in the cache).
> 
> Will

I see. And I guess using a cacheable mapping is significantly faster.
I would say we want to typically use cacheable for virtio then,
whether we bypass the IOMMU or not. I guess this is why we always set
_CCA/DT correctly, right?

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-09 18:49                             ` Michael S. Tsirkin
  0 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-09 18:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 09, 2017 at 06:31:18PM +0000, Will Deacon wrote:
> On Thu, Feb 09, 2017 at 08:17:16PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
> > > On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> > > > I am inclined to say, for 4.10 let's revert
> > > > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> > > > regression in 4.10.
> > > 
> > > No complaints there, as long as we can keep working to fix this for 4.11
> > > and onwards. You'll also need to cc stable on the revert.
> > > 
> > > > So I think we can defer the fix to 4.11.
> > > > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> > > > for hosts with virtio 1 support.
> > > > 
> > > > All this will hopefully push hosts to just implement virtio 1.
> > > > For mmio the changes are very small: several new registers,
> > > > that's all. You want this for proper 64 bit dma mask anyway.
> > > 
> > > As I've said, virtio 1 will have exactly the same issue unless we start
> > > requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
> > > devices correctly.
> > > 
> > 
> > OK I read up on _CCA in ACPI spec. It says:
> > The _CCA object returns whether or not a bus-master device supports
> > hardware managed cache coherency. Expected values are 0 to indicate it
> > is not supported, and 1 to indicate that it is supported.
> > 
> > So if host is cache coherent, and guest thinks it isn't, we incur
> > unnecessary overhead by wasting coherent memory.
> > I get that but you said it actually breaks - why does it?
> 
> It breaks because QEMU doesn't set _CCA for virtio-mmio devices, and that
> only becomes a problem when we use the DMA API, because that results in the
> guest taking out a non-cacheable mapping. On ARM (and other archs such as
> Power), having a mismatch between a cacheable and a non-cacheable mapping
> can result in a loss of coherency between the two (for example, if the
> non-cacheable gues accesses bypass the cache, but the cacheable host
> accesses allocate in the cache).
> 
> Will

I see. And I guess using a cacheable mapping is significantly faster.
I would say we want to typically use cacheable for virtio then,
whether we bypass the IOMMU or not. I guess this is why we always set
_CCA/DT correctly, right?

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-09 18:49                             ` Michael S. Tsirkin
@ 2017-02-09 18:54                               ` Ard Biesheuvel
  -1 siblings, 0 replies; 59+ messages in thread
From: Ard Biesheuvel @ 2017-02-09 18:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Mark Rutland, devicetree, Pawel Moll, jasowang, Will Deacon,
	virtualization, Rob Herring, Robin Murphy, linux-arm-kernel

On 9 February 2017 at 18:49, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, Feb 09, 2017 at 06:31:18PM +0000, Will Deacon wrote:
>> On Thu, Feb 09, 2017 at 08:17:16PM +0200, Michael S. Tsirkin wrote:
>> > On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
>> > > On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
>> > > > I am inclined to say, for 4.10 let's revert
>> > > > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
>> > > > regression in 4.10.
>> > >
>> > > No complaints there, as long as we can keep working to fix this for 4.11
>> > > and onwards. You'll also need to cc stable on the revert.
>> > >
>> > > > So I think we can defer the fix to 4.11.
>> > > > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
>> > > > for hosts with virtio 1 support.
>> > > >
>> > > > All this will hopefully push hosts to just implement virtio 1.
>> > > > For mmio the changes are very small: several new registers,
>> > > > that's all. You want this for proper 64 bit dma mask anyway.
>> > >
>> > > As I've said, virtio 1 will have exactly the same issue unless we start
>> > > requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
>> > > devices correctly.
>> > >
>> >
>> > OK I read up on _CCA in ACPI spec. It says:
>> > The _CCA object returns whether or not a bus-master device supports
>> > hardware managed cache coherency. Expected values are 0 to indicate it
>> > is not supported, and 1 to indicate that it is supported.
>> >
>> > So if host is cache coherent, and guest thinks it isn't, we incur
>> > unnecessary overhead by wasting coherent memory.
>> > I get that but you said it actually breaks - why does it?
>>
>> It breaks because QEMU doesn't set _CCA for virtio-mmio devices, and that
>> only becomes a problem when we use the DMA API, because that results in the
>> guest taking out a non-cacheable mapping. On ARM (and other archs such as
>> Power), having a mismatch between a cacheable and a non-cacheable mapping
>> can result in a loss of coherency between the two (for example, if the
>> non-cacheable gues accesses bypass the cache, but the cacheable host
>> accesses allocate in the cache).
>>
>> Will
>
> I see. And I guess using a cacheable mapping is significantly faster.
> I would say we want to typically use cacheable for virtio then,
> whether we bypass the IOMMU or not. I guess this is why we always set
> _CCA/DT correctly, right?
>

The point is that you don't get to choose: if the hardware is DMA
coherent, the CPU should use cacheable mappings, or you may lose data.
If the hardware is non-coherent, the CPU should use non-cacheable
mappings for the same reason.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-09 18:49                             ` Michael S. Tsirkin
  (?)
@ 2017-02-09 18:54                             ` Ard Biesheuvel
  -1 siblings, 0 replies; 59+ messages in thread
From: Ard Biesheuvel @ 2017-02-09 18:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Mark Rutland, devicetree, Pawel Moll, Will Deacon,
	virtualization, Rob Herring, Robin Murphy, linux-arm-kernel

On 9 February 2017 at 18:49, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, Feb 09, 2017 at 06:31:18PM +0000, Will Deacon wrote:
>> On Thu, Feb 09, 2017 at 08:17:16PM +0200, Michael S. Tsirkin wrote:
>> > On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
>> > > On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
>> > > > I am inclined to say, for 4.10 let's revert
>> > > > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
>> > > > regression in 4.10.
>> > >
>> > > No complaints there, as long as we can keep working to fix this for 4.11
>> > > and onwards. You'll also need to cc stable on the revert.
>> > >
>> > > > So I think we can defer the fix to 4.11.
>> > > > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
>> > > > for hosts with virtio 1 support.
>> > > >
>> > > > All this will hopefully push hosts to just implement virtio 1.
>> > > > For mmio the changes are very small: several new registers,
>> > > > that's all. You want this for proper 64 bit dma mask anyway.
>> > >
>> > > As I've said, virtio 1 will have exactly the same issue unless we start
>> > > requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
>> > > devices correctly.
>> > >
>> >
>> > OK I read up on _CCA in ACPI spec. It says:
>> > The _CCA object returns whether or not a bus-master device supports
>> > hardware managed cache coherency. Expected values are 0 to indicate it
>> > is not supported, and 1 to indicate that it is supported.
>> >
>> > So if host is cache coherent, and guest thinks it isn't, we incur
>> > unnecessary overhead by wasting coherent memory.
>> > I get that but you said it actually breaks - why does it?
>>
>> It breaks because QEMU doesn't set _CCA for virtio-mmio devices, and that
>> only becomes a problem when we use the DMA API, because that results in the
>> guest taking out a non-cacheable mapping. On ARM (and other archs such as
>> Power), having a mismatch between a cacheable and a non-cacheable mapping
>> can result in a loss of coherency between the two (for example, if the
>> non-cacheable gues accesses bypass the cache, but the cacheable host
>> accesses allocate in the cache).
>>
>> Will
>
> I see. And I guess using a cacheable mapping is significantly faster.
> I would say we want to typically use cacheable for virtio then,
> whether we bypass the IOMMU or not. I guess this is why we always set
> _CCA/DT correctly, right?
>

The point is that you don't get to choose: if the hardware is DMA
coherent, the CPU should use cacheable mappings, or you may lose data.
If the hardware is non-coherent, the CPU should use non-cacheable
mappings for the same reason.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-09 18:54                               ` Ard Biesheuvel
  0 siblings, 0 replies; 59+ messages in thread
From: Ard Biesheuvel @ 2017-02-09 18:54 UTC (permalink / raw)
  To: linux-arm-kernel

On 9 February 2017 at 18:49, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, Feb 09, 2017 at 06:31:18PM +0000, Will Deacon wrote:
>> On Thu, Feb 09, 2017 at 08:17:16PM +0200, Michael S. Tsirkin wrote:
>> > On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
>> > > On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
>> > > > I am inclined to say, for 4.10 let's revert
>> > > > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
>> > > > regression in 4.10.
>> > >
>> > > No complaints there, as long as we can keep working to fix this for 4.11
>> > > and onwards. You'll also need to cc stable on the revert.
>> > >
>> > > > So I think we can defer the fix to 4.11.
>> > > > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
>> > > > for hosts with virtio 1 support.
>> > > >
>> > > > All this will hopefully push hosts to just implement virtio 1.
>> > > > For mmio the changes are very small: several new registers,
>> > > > that's all. You want this for proper 64 bit dma mask anyway.
>> > >
>> > > As I've said, virtio 1 will have exactly the same issue unless we start
>> > > requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
>> > > devices correctly.
>> > >
>> >
>> > OK I read up on _CCA in ACPI spec. It says:
>> > The _CCA object returns whether or not a bus-master device supports
>> > hardware managed cache coherency. Expected values are 0 to indicate it
>> > is not supported, and 1 to indicate that it is supported.
>> >
>> > So if host is cache coherent, and guest thinks it isn't, we incur
>> > unnecessary overhead by wasting coherent memory.
>> > I get that but you said it actually breaks - why does it?
>>
>> It breaks because QEMU doesn't set _CCA for virtio-mmio devices, and that
>> only becomes a problem when we use the DMA API, because that results in the
>> guest taking out a non-cacheable mapping. On ARM (and other archs such as
>> Power), having a mismatch between a cacheable and a non-cacheable mapping
>> can result in a loss of coherency between the two (for example, if the
>> non-cacheable gues accesses bypass the cache, but the cacheable host
>> accesses allocate in the cache).
>>
>> Will
>
> I see. And I guess using a cacheable mapping is significantly faster.
> I would say we want to typically use cacheable for virtio then,
> whether we bypass the IOMMU or not. I guess this is why we always set
> _CCA/DT correctly, right?
>

The point is that you don't get to choose: if the hardware is DMA
coherent, the CPU should use cacheable mappings, or you may lose data.
If the hardware is non-coherent, the CPU should use non-cacheable
mappings for the same reason.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-09 18:49                             ` Michael S. Tsirkin
@ 2017-02-09 18:56                               ` Will Deacon
  -1 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-09 18:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Thu, Feb 09, 2017 at 08:49:41PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 09, 2017 at 06:31:18PM +0000, Will Deacon wrote:
> > On Thu, Feb 09, 2017 at 08:17:16PM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
> > > > On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> > > > > I am inclined to say, for 4.10 let's revert
> > > > > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> > > > > regression in 4.10.
> > > > 
> > > > No complaints there, as long as we can keep working to fix this for 4.11
> > > > and onwards. You'll also need to cc stable on the revert.
> > > > 
> > > > > So I think we can defer the fix to 4.11.
> > > > > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> > > > > for hosts with virtio 1 support.
> > > > > 
> > > > > All this will hopefully push hosts to just implement virtio 1.
> > > > > For mmio the changes are very small: several new registers,
> > > > > that's all. You want this for proper 64 bit dma mask anyway.
> > > > 
> > > > As I've said, virtio 1 will have exactly the same issue unless we start
> > > > requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
> > > > devices correctly.
> > > > 
> > > 
> > > OK I read up on _CCA in ACPI spec. It says:
> > > The _CCA object returns whether or not a bus-master device supports
> > > hardware managed cache coherency. Expected values are 0 to indicate it
> > > is not supported, and 1 to indicate that it is supported.
> > > 
> > > So if host is cache coherent, and guest thinks it isn't, we incur
> > > unnecessary overhead by wasting coherent memory.
> > > I get that but you said it actually breaks - why does it?
> > 
> > It breaks because QEMU doesn't set _CCA for virtio-mmio devices, and that
> > only becomes a problem when we use the DMA API, because that results in the
> > guest taking out a non-cacheable mapping. On ARM (and other archs such as
> > Power), having a mismatch between a cacheable and a non-cacheable mapping
> > can result in a loss of coherency between the two (for example, if the
> > non-cacheable gues accesses bypass the cache, but the cacheable host
> > accesses allocate in the cache).
> > 
> I see. And I guess using a cacheable mapping is significantly faster.
> I would say we want to typically use cacheable for virtio then,
> whether we bypass the IOMMU or not. I guess this is why we always set
> _CCA/DT correctly, right?

At the moment, _CCA/DT is pretty much never set correctly for virtio-mmio
(that is, it isn't set even though the device is cache coherent). If it
*was* set correctly, then we wouldn't have needed to revert my patch.

Robin's patch to only use the DMA API if _CCA/DT is present would work
(although the thing that he posted was buggy iirc).

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-09 18:56                               ` Will Deacon
  0 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-09 18:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 09, 2017 at 08:49:41PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 09, 2017 at 06:31:18PM +0000, Will Deacon wrote:
> > On Thu, Feb 09, 2017 at 08:17:16PM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Feb 02, 2017 at 04:40:49PM +0000, Will Deacon wrote:
> > > > On Thu, Feb 02, 2017 at 06:30:28PM +0200, Michael S. Tsirkin wrote:
> > > > > I am inclined to say, for 4.10 let's revert
> > > > > c7070619f3408d9a0dffbed9149e6f00479cf43b since what it fixes is not a
> > > > > regression in 4.10.
> > > > 
> > > > No complaints there, as long as we can keep working to fix this for 4.11
> > > > and onwards. You'll also need to cc stable on the revert.
> > > > 
> > > > > So I think we can defer the fix to 4.11.
> > > > > I think we still want f7f6634d23830ff74335734fbdb28ea109c1f349
> > > > > for hosts with virtio 1 support.
> > > > > 
> > > > > All this will hopefully push hosts to just implement virtio 1.
> > > > > For mmio the changes are very small: several new registers,
> > > > > that's all. You want this for proper 64 bit dma mask anyway.
> > > > 
> > > > As I've said, virtio 1 will have exactly the same issue unless we start
> > > > requiring firmware to advertise dma-coherent/_CCA for virtio-mmio
> > > > devices correctly.
> > > > 
> > > 
> > > OK I read up on _CCA in ACPI spec. It says:
> > > The _CCA object returns whether or not a bus-master device supports
> > > hardware managed cache coherency. Expected values are 0 to indicate it
> > > is not supported, and 1 to indicate that it is supported.
> > > 
> > > So if host is cache coherent, and guest thinks it isn't, we incur
> > > unnecessary overhead by wasting coherent memory.
> > > I get that but you said it actually breaks - why does it?
> > 
> > It breaks because QEMU doesn't set _CCA for virtio-mmio devices, and that
> > only becomes a problem when we use the DMA API, because that results in the
> > guest taking out a non-cacheable mapping. On ARM (and other archs such as
> > Power), having a mismatch between a cacheable and a non-cacheable mapping
> > can result in a loss of coherency between the two (for example, if the
> > non-cacheable gues accesses bypass the cache, but the cacheable host
> > accesses allocate in the cache).
> > 
> I see. And I guess using a cacheable mapping is significantly faster.
> I would say we want to typically use cacheable for virtio then,
> whether we bypass the IOMMU or not. I guess this is why we always set
> _CCA/DT correctly, right?

At the moment, _CCA/DT is pretty much never set correctly for virtio-mmio
(that is, it isn't set even though the device is cache coherent). If it
*was* set correctly, then we wouldn't have needed to revert my patch.

Robin's patch to only use the DMA API if _CCA/DT is present would work
(although the thing that he posted was buggy iirc).

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-08 12:58             ` Alexander Graf
@ 2017-02-09 20:57               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-09 20:57 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Mark Rutland, rob.herring, devicetree, Will Deacon,
	virtualization, linux-arm-kernel

On Wed, Feb 08, 2017 at 01:58:10PM +0100, Alexander Graf wrote:
> On 02/01/2017 08:19 PM, Michael S. Tsirkin wrote:
> > On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> > > On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> > > > On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index 7e38ed79c3fc..961af25b385c 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -20,6 +20,7 @@
> > > > >  #include <linux/virtio_ring.h>
> > > > >  #include <linux/virtio_config.h>
> > > > >  #include <linux/device.h>
> > > > > +#include <linux/property.h>
> > > > >  #include <linux/slab.h>
> > > > >  #include <linux/module.h>
> > > > >  #include <linux/hrtimer.h>
> > > > > @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > > > >  		return true;
> > > > > 
> > > > >  	/*
> > > > > -	 * On ARM-based machines, the DMA ops will do the right thing,
> > > > > -	 * so always use them with legacy devices.
> > > > > +	 * On ARM-based machines, the coherent DMA ops will do the right
> > > > > +	 * thing, so always use them with legacy devices. However, using
> > > > > +	 * non-coherent DMA when the host *is* actually coherent, but has
> > > > > +	 * forgotten to tell us, is going to break badly; since this situation
> > > > > +	 * already exists in the wild, maintain the old behaviour there.
> > > > >  	 */
> > > > > -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> > > > > +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
> > > > > +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
> > > > >  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
> > > > > 
> > > > >  	return false;
> > > > 
> > > > This is exactly what I feared.
> > > 
> > > Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
> > > is used) and it also works on the fastmodel if you disable cache-modelling
> > > (which is needed to make the thing run at a usable pace) so we didn't spot
> > > this in testing.
> > > 
> > > > Could we identify fastboot and do the special dance just for it?
> > > 
> > > [assuming you mean fastmodel instead of fastboot]
> > > 
> > > > I'd like to do that instead. It's fastboot doing the unreasonable thing
> > > > here and deviating from what every other legacy device without exception
> > > > did for years. If this means fastboot will need to update to virtio 1,
> > > > all the better.
> > > 
> > > The problem still exists with virtio 1, unless we require that the
> > > "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> > > is advertised by the device (which is what I suggested in my reply).
> > 
> > I'm not ignoring that, but I need to understand that part a bit better.
> > I'll reply to that patch in a day or two after looking at how _CCA is
> > supposed to work.
> > 
> > > We can't detect the fastmodel,
> > 
> > Surely, it puts a hardware id somewhere? I think you mean
> > fastmodel isn't always affected, right?
> > 
> > > but we could implicitly treat virtio-mmio
> > > devices as cache-coherent regardless of the "dma-coherent" flag. I already
> > > prototyped this, but I suspect the devicetree people will push back (and
> > > there's a similar patch needed for ACPI).
> > > 
> > > See below. Do you prefer this approach?
> > > 
> > > Will
> > > 
> > > --->8
> > 
> > I'd like to see basically
> > 
> > if (fastmodel)
> > 	a pile of special work-arounds
> > else
> > 	not less hacky but more common virtio work-arounds
> > 
> > :)
> > 
> > And then I can apply whatever comes from @arm.com and not
> > worry about breaking actual hardware.
> 
> I'm actually seeing the exact same breakage in QEMU right now, so it's not
> fast model related at all. In QEMU we also don't properly set the
> dma-coherent flag, so we run into cache coherency problems.
> 
> 
> Alex

But with latest revert QEMU should now work, right?

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-09 20:57               ` Michael S. Tsirkin
  0 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-09 20:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 08, 2017 at 01:58:10PM +0100, Alexander Graf wrote:
> On 02/01/2017 08:19 PM, Michael S. Tsirkin wrote:
> > On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
> > > On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
> > > > On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index 7e38ed79c3fc..961af25b385c 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -20,6 +20,7 @@
> > > > >  #include <linux/virtio_ring.h>
> > > > >  #include <linux/virtio_config.h>
> > > > >  #include <linux/device.h>
> > > > > +#include <linux/property.h>
> > > > >  #include <linux/slab.h>
> > > > >  #include <linux/module.h>
> > > > >  #include <linux/hrtimer.h>
> > > > > @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > > > >  		return true;
> > > > > 
> > > > >  	/*
> > > > > -	 * On ARM-based machines, the DMA ops will do the right thing,
> > > > > -	 * so always use them with legacy devices.
> > > > > +	 * On ARM-based machines, the coherent DMA ops will do the right
> > > > > +	 * thing, so always use them with legacy devices. However, using
> > > > > +	 * non-coherent DMA when the host *is* actually coherent, but has
> > > > > +	 * forgotten to tell us, is going to break badly; since this situation
> > > > > +	 * already exists in the wild, maintain the old behaviour there.
> > > > >  	 */
> > > > > -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
> > > > > +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
> > > > > +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
> > > > >  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
> > > > > 
> > > > >  	return false;
> > > > 
> > > > This is exactly what I feared.
> > > 
> > > Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
> > > is used) and it also works on the fastmodel if you disable cache-modelling
> > > (which is needed to make the thing run at a usable pace) so we didn't spot
> > > this in testing.
> > > 
> > > > Could we identify fastboot and do the special dance just for it?
> > > 
> > > [assuming you mean fastmodel instead of fastboot]
> > > 
> > > > I'd like to do that instead. It's fastboot doing the unreasonable thing
> > > > here and deviating from what every other legacy device without exception
> > > > did for years. If this means fastboot will need to update to virtio 1,
> > > > all the better.
> > > 
> > > The problem still exists with virtio 1, unless we require that the
> > > "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
> > > is advertised by the device (which is what I suggested in my reply).
> > 
> > I'm not ignoring that, but I need to understand that part a bit better.
> > I'll reply to that patch in a day or two after looking at how _CCA is
> > supposed to work.
> > 
> > > We can't detect the fastmodel,
> > 
> > Surely, it puts a hardware id somewhere? I think you mean
> > fastmodel isn't always affected, right?
> > 
> > > but we could implicitly treat virtio-mmio
> > > devices as cache-coherent regardless of the "dma-coherent" flag. I already
> > > prototyped this, but I suspect the devicetree people will push back (and
> > > there's a similar patch needed for ACPI).
> > > 
> > > See below. Do you prefer this approach?
> > > 
> > > Will
> > > 
> > > --->8
> > 
> > I'd like to see basically
> > 
> > if (fastmodel)
> > 	a pile of special work-arounds
> > else
> > 	not less hacky but more common virtio work-arounds
> > 
> > :)
> > 
> > And then I can apply whatever comes from @arm.com and not
> > worry about breaking actual hardware.
> 
> I'm actually seeing the exact same breakage in QEMU right now, so it's not
> fast model related at all. In QEMU we also don't properly set the
> dma-coherent flag, so we run into cache coherency problems.
> 
> 
> Alex

But with latest revert QEMU should now work, right?

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-09 20:57               ` Michael S. Tsirkin
@ 2017-02-09 22:20                 ` Alexander Graf
  -1 siblings, 0 replies; 59+ messages in thread
From: Alexander Graf @ 2017-02-09 22:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Mark Rutland, rob.herring, devicetree, Will Deacon,
	virtualization, linux-arm-kernel



On 09/02/2017 21:57, Michael S. Tsirkin wrote:
> On Wed, Feb 08, 2017 at 01:58:10PM +0100, Alexander Graf wrote:
>> On 02/01/2017 08:19 PM, Michael S. Tsirkin wrote:
>>> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
>>>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
>>>>> On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
>>>>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>>>>> index 7e38ed79c3fc..961af25b385c 100644
>>>>>> --- a/drivers/virtio/virtio_ring.c
>>>>>> +++ b/drivers/virtio/virtio_ring.c
>>>>>> @@ -20,6 +20,7 @@
>>>>>>  #include <linux/virtio_ring.h>
>>>>>>  #include <linux/virtio_config.h>
>>>>>>  #include <linux/device.h>
>>>>>> +#include <linux/property.h>
>>>>>>  #include <linux/slab.h>
>>>>>>  #include <linux/module.h>
>>>>>>  #include <linux/hrtimer.h>
>>>>>> @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>>>>>>  		return true;
>>>>>>
>>>>>>  	/*
>>>>>> -	 * On ARM-based machines, the DMA ops will do the right thing,
>>>>>> -	 * so always use them with legacy devices.
>>>>>> +	 * On ARM-based machines, the coherent DMA ops will do the right
>>>>>> +	 * thing, so always use them with legacy devices. However, using
>>>>>> +	 * non-coherent DMA when the host *is* actually coherent, but has
>>>>>> +	 * forgotten to tell us, is going to break badly; since this situation
>>>>>> +	 * already exists in the wild, maintain the old behaviour there.
>>>>>>  	 */
>>>>>> -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
>>>>>> +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
>>>>>> +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
>>>>>>  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
>>>>>>
>>>>>>  	return false;
>>>>>
>>>>> This is exactly what I feared.
>>>>
>>>> Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
>>>> is used) and it also works on the fastmodel if you disable cache-modelling
>>>> (which is needed to make the thing run at a usable pace) so we didn't spot
>>>> this in testing.
>>>>
>>>>> Could we identify fastboot and do the special dance just for it?
>>>>
>>>> [assuming you mean fastmodel instead of fastboot]
>>>>
>>>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
>>>>> here and deviating from what every other legacy device without exception
>>>>> did for years. If this means fastboot will need to update to virtio 1,
>>>>> all the better.
>>>>
>>>> The problem still exists with virtio 1, unless we require that the
>>>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
>>>> is advertised by the device (which is what I suggested in my reply).
>>>
>>> I'm not ignoring that, but I need to understand that part a bit better.
>>> I'll reply to that patch in a day or two after looking at how _CCA is
>>> supposed to work.
>>>
>>>> We can't detect the fastmodel,
>>>
>>> Surely, it puts a hardware id somewhere? I think you mean
>>> fastmodel isn't always affected, right?
>>>
>>>> but we could implicitly treat virtio-mmio
>>>> devices as cache-coherent regardless of the "dma-coherent" flag. I already
>>>> prototyped this, but I suspect the devicetree people will push back (and
>>>> there's a similar patch needed for ACPI).
>>>>
>>>> See below. Do you prefer this approach?
>>>>
>>>> Will
>>>>
>>>> --->8
>>>
>>> I'd like to see basically
>>>
>>> if (fastmodel)
>>> 	a pile of special work-arounds
>>> else
>>> 	not less hacky but more common virtio work-arounds
>>>
>>> :)
>>>
>>> And then I can apply whatever comes from @arm.com and not
>>> worry about breaking actual hardware.
>>
>> I'm actually seeing the exact same breakage in QEMU right now, so it's not
>> fast model related at all. In QEMU we also don't properly set the
>> dma-coherent flag, so we run into cache coherency problems.
>>
>>
>> Alex
>
> But with latest revert QEMU should now work, right?

Yes, with the revert QEMU works fine. But please keep in mind that it's 
not just the fast model that's broken here :).


Alex

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-09 22:20                 ` Alexander Graf
  0 siblings, 0 replies; 59+ messages in thread
From: Alexander Graf @ 2017-02-09 22:20 UTC (permalink / raw)
  To: linux-arm-kernel



On 09/02/2017 21:57, Michael S. Tsirkin wrote:
> On Wed, Feb 08, 2017 at 01:58:10PM +0100, Alexander Graf wrote:
>> On 02/01/2017 08:19 PM, Michael S. Tsirkin wrote:
>>> On Wed, Feb 01, 2017 at 06:27:09PM +0000, Will Deacon wrote:
>>>> On Wed, Feb 01, 2017 at 08:09:21PM +0200, Michael S. Tsirkin wrote:
>>>>> On Wed, Feb 01, 2017 at 12:25:57PM +0000, Robin Murphy wrote:
>>>>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>>>>> index 7e38ed79c3fc..961af25b385c 100644
>>>>>> --- a/drivers/virtio/virtio_ring.c
>>>>>> +++ b/drivers/virtio/virtio_ring.c
>>>>>> @@ -20,6 +20,7 @@
>>>>>>  #include <linux/virtio_ring.h>
>>>>>>  #include <linux/virtio_config.h>
>>>>>>  #include <linux/device.h>
>>>>>> +#include <linux/property.h>
>>>>>>  #include <linux/slab.h>
>>>>>>  #include <linux/module.h>
>>>>>>  #include <linux/hrtimer.h>
>>>>>> @@ -160,10 +161,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>>>>>>  		return true;
>>>>>>
>>>>>>  	/*
>>>>>> -	 * On ARM-based machines, the DMA ops will do the right thing,
>>>>>> -	 * so always use them with legacy devices.
>>>>>> +	 * On ARM-based machines, the coherent DMA ops will do the right
>>>>>> +	 * thing, so always use them with legacy devices. However, using
>>>>>> +	 * non-coherent DMA when the host *is* actually coherent, but has
>>>>>> +	 * forgotten to tell us, is going to break badly; since this situation
>>>>>> +	 * already exists in the wild, maintain the old behaviour there.
>>>>>>  	 */
>>>>>> -	if (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64))
>>>>>> +	if ((IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_ARM64)) &&
>>>>>> +	    device_get_dma_attr(&vdev->dev) == DEV_DMA_COHERENT)
>>>>>>  		return !virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
>>>>>>
>>>>>>  	return false;
>>>>>
>>>>> This is exactly what I feared.
>>>>
>>>> Yes, sorry about this. It works fine for virtio-pci (where "dma-coherent"
>>>> is used) and it also works on the fastmodel if you disable cache-modelling
>>>> (which is needed to make the thing run at a usable pace) so we didn't spot
>>>> this in testing.
>>>>
>>>>> Could we identify fastboot and do the special dance just for it?
>>>>
>>>> [assuming you mean fastmodel instead of fastboot]
>>>>
>>>>> I'd like to do that instead. It's fastboot doing the unreasonable thing
>>>>> here and deviating from what every other legacy device without exception
>>>>> did for years. If this means fastboot will need to update to virtio 1,
>>>>> all the better.
>>>>
>>>> The problem still exists with virtio 1, unless we require that the
>>>> "dma-coherent" property is set/unset correctly when VIRTIO_F_IOMMU_PLATFORM
>>>> is advertised by the device (which is what I suggested in my reply).
>>>
>>> I'm not ignoring that, but I need to understand that part a bit better.
>>> I'll reply to that patch in a day or two after looking at how _CCA is
>>> supposed to work.
>>>
>>>> We can't detect the fastmodel,
>>>
>>> Surely, it puts a hardware id somewhere? I think you mean
>>> fastmodel isn't always affected, right?
>>>
>>>> but we could implicitly treat virtio-mmio
>>>> devices as cache-coherent regardless of the "dma-coherent" flag. I already
>>>> prototyped this, but I suspect the devicetree people will push back (and
>>>> there's a similar patch needed for ACPI).
>>>>
>>>> See below. Do you prefer this approach?
>>>>
>>>> Will
>>>>
>>>> --->8
>>>
>>> I'd like to see basically
>>>
>>> if (fastmodel)
>>> 	a pile of special work-arounds
>>> else
>>> 	not less hacky but more common virtio work-arounds
>>>
>>> :)
>>>
>>> And then I can apply whatever comes from @arm.com and not
>>> worry about breaking actual hardware.
>>
>> I'm actually seeing the exact same breakage in QEMU right now, so it's not
>> fast model related at all. In QEMU we also don't properly set the
>> dma-coherent flag, so we run into cache coherency problems.
>>
>>
>> Alex
>
> But with latest revert QEMU should now work, right?

Yes, with the revert QEMU works fine. But please keep in mind that it's 
not just the fast model that's broken here :).


Alex

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-09 18:31                           ` Will Deacon
@ 2017-02-10 17:16                             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-10 17:16 UTC (permalink / raw)
  To: Will Deacon
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Thu, Feb 09, 2017 at 06:31:18PM +0000, Will Deacon wrote:
> On ARM (and other archs such as
> Power), having a mismatch between a cacheable and a non-cacheable mapping
> can result in a loss of coherency between the two (for example, if the
> non-cacheable gues accesses bypass the cache, but the cacheable host
> accesses allocate in the cache).

I guess it's an optimization to avoid cache snoops for non-cacheable accesses?

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-10 17:16                             ` Michael S. Tsirkin
  0 siblings, 0 replies; 59+ messages in thread
From: Michael S. Tsirkin @ 2017-02-10 17:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 09, 2017 at 06:31:18PM +0000, Will Deacon wrote:
> On ARM (and other archs such as
> Power), having a mismatch between a cacheable and a non-cacheable mapping
> can result in a loss of coherency between the two (for example, if the
> non-cacheable gues accesses bypass the cache, but the cacheable host
> accesses allocate in the cache).

I guess it's an optimization to avoid cache snoops for non-cacheable accesses?

-- 
MST

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] virtio: Try to untangle DMA coherency
  2017-02-10 17:16                             ` Michael S. Tsirkin
@ 2017-02-13 11:57                               ` Will Deacon
  -1 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-13 11:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mark.rutland, devicetree, pawel.moll, virtualization, robh+dt,
	Robin Murphy, linux-arm-kernel

On Fri, Feb 10, 2017 at 07:16:10PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 09, 2017 at 06:31:18PM +0000, Will Deacon wrote:
> > On ARM (and other archs such as
> > Power), having a mismatch between a cacheable and a non-cacheable mapping
> > can result in a loss of coherency between the two (for example, if the
> > non-cacheable gues accesses bypass the cache, but the cacheable host
> > accesses allocate in the cache).
> 
> I guess it's an optimization to avoid cache snoops for non-cacheable accesses?

The architecture doesn't rationalise the decision, but a micro-architecture
could indeed implement the optimisation you suggest (and we do observe the
loss of coherency in practice).

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] virtio: Try to untangle DMA coherency
@ 2017-02-13 11:57                               ` Will Deacon
  0 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-02-13 11:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 10, 2017 at 07:16:10PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 09, 2017 at 06:31:18PM +0000, Will Deacon wrote:
> > On ARM (and other archs such as
> > Power), having a mismatch between a cacheable and a non-cacheable mapping
> > can result in a loss of coherency between the two (for example, if the
> > non-cacheable gues accesses bypass the cache, but the cacheable host
> > accesses allocate in the cache).
> 
> I guess it's an optimization to avoid cache snoops for non-cacheable accesses?

The architecture doesn't rationalise the decision, but a micro-architecture
could indeed implement the optimisation you suggest (and we do observe the
loss of coherency in practice).

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2017-02-13 11:57 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-01 12:25 [PATCH] virtio: Try to untangle DMA coherency Robin Murphy
2017-02-01 12:25 ` Robin Murphy
2017-02-01 14:57 ` Will Deacon
2017-02-01 14:57   ` Will Deacon
     [not found]   ` <20170201145711.GC8177-5wv7dgnIgG8@public.gmane.org>
2017-02-01 17:58     ` Rob Herring
2017-02-01 17:58       ` Rob Herring
2017-02-01 17:58   ` Rob Herring
     [not found] ` <8a6494f6409c20b4609cd6bdcdd751f68b5c0564.1485951731.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2017-02-01 18:09   ` Michael S. Tsirkin
2017-02-01 18:09     ` Michael S. Tsirkin
2017-02-01 18:27     ` Will Deacon
2017-02-01 18:27       ` Will Deacon
2017-02-01 18:27       ` Will Deacon
2017-02-01 19:19       ` Michael S. Tsirkin
     [not found]       ` <20170201182659.GM8177-5wv7dgnIgG8@public.gmane.org>
2017-02-01 19:19         ` Michael S. Tsirkin
2017-02-01 19:19           ` Michael S. Tsirkin
     [not found]           ` <20170201210648-mutt-send-email-mst-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-02-02 11:26             ` Will Deacon
2017-02-02 11:26               ` Will Deacon
2017-02-02 13:34               ` Robin Murphy
2017-02-02 13:34                 ` Robin Murphy
2017-02-02 16:16                 ` Michael S. Tsirkin
2017-02-02 16:30                 ` Michael S. Tsirkin
     [not found]                 ` <bb041fbe-3fde-0dc1-ae28-cca5acfe9dc5-5wv7dgnIgG8@public.gmane.org>
2017-02-02 16:16                   ` Michael S. Tsirkin
2017-02-02 16:16                     ` Michael S. Tsirkin
2017-02-02 16:39                     ` Marc Zyngier
     [not found]                     ` <20170202181357-mutt-send-email-mst-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2017-02-02 16:39                       ` Marc Zyngier
2017-02-02 16:39                         ` Marc Zyngier
2017-02-02 16:44                         ` Michael S. Tsirkin
     [not found]                         ` <d9732885-7cd2-461c-6812-37581294b47f-5wv7dgnIgG8@public.gmane.org>
2017-02-02 16:44                           ` Michael S. Tsirkin
2017-02-02 16:44                             ` Michael S. Tsirkin
2017-02-02 16:30                   ` Michael S. Tsirkin
2017-02-02 16:30                     ` Michael S. Tsirkin
2017-02-02 16:40                     ` Will Deacon
2017-02-02 16:40                       ` Will Deacon
2017-02-02 16:45                       ` Michael S. Tsirkin
     [not found]                       ` <20170202164049.GI13839-5wv7dgnIgG8@public.gmane.org>
2017-02-02 16:45                         ` Michael S. Tsirkin
2017-02-02 16:45                           ` Michael S. Tsirkin
2017-02-09 18:17                       ` Michael S. Tsirkin
2017-02-09 18:17                         ` Michael S. Tsirkin
2017-02-09 18:31                         ` Will Deacon
2017-02-09 18:31                           ` Will Deacon
2017-02-09 18:49                           ` Michael S. Tsirkin
2017-02-09 18:49                             ` Michael S. Tsirkin
2017-02-09 18:54                             ` Ard Biesheuvel
2017-02-09 18:54                             ` Ard Biesheuvel
2017-02-09 18:54                               ` Ard Biesheuvel
2017-02-09 18:56                             ` Will Deacon
2017-02-09 18:56                               ` Will Deacon
2017-02-10 17:16                           ` Michael S. Tsirkin
2017-02-10 17:16                             ` Michael S. Tsirkin
2017-02-13 11:57                             ` Will Deacon
2017-02-13 11:57                               ` Will Deacon
2017-02-02 11:26           ` Will Deacon
2017-02-08 12:58           ` Alexander Graf
2017-02-08 12:58             ` Alexander Graf
2017-02-09 20:57             ` Michael S. Tsirkin
2017-02-09 20:57               ` Michael S. Tsirkin
2017-02-09 22:20               ` Alexander Graf
2017-02-09 22:20                 ` Alexander Graf
2017-02-01 18:09 ` Michael S. Tsirkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.