All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] Make the iommu driver no-snoop block feature consistent
@ 2022-04-05 16:15 ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-05 16:15 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Tian, Kevin, Christoph Hellwig

PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented
by a platform as bypassing elements in the DMA coherent CPU cache
hierarchy. A driver can command a device to set this bit on some of its
transactions as a micro-optimization.

However, the driver is now responsible to synchronize the CPU cache with
the DMA that bypassed it. On x86 this is done through the wbinvd
instruction, and the i915 GPU driver is the only Linux DMA driver that
calls it.

The problem comes that KVM on x86 will normally disable the wbinvd
instruction in the guest and render it a NOP. As the driver running in the
guest is not aware the wbinvd doesn't work it may still cause the device
to set the no-snoop bit and the platform will bypass the CPU cache.
Without a working wbinvd there is no way to re-synchronize the CPU cache
and the driver in the VM has data corruption.

Thus, we see a general direction on x86 that the IOMMU HW is able to block
the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to
to NOP the wbinvd without causing any data corruption.

This control for Intel IOMMU was exposed by using IOMMU_CACHE and
IOMMU_CAP_CACHE_COHERENCY, however these two values now have multiple
meanings and usages beyond blocking no-snoop and the whole thing has
become confused.

Change it so that:
 - IOMMU_CACHE is only about the DMA coherence of normal DMAs from a
   device. It is used by the DMA API and set when the DMA API will not be
   doing manual cache coherency operations.

 - dev_is_dma_coherent() indicates if IOMMU_CACHE can be used with the
   device

 - The new optional domain op enforce_cache_coherency() will cause the
   entire domain to block no-snoop requests - ie there is no way for any
   device attached to the domain to opt out of the IOMMU_CACHE behavior.

An iommu driver should implement enforce_cache_coherency() so that by
default domains allow the no-snoop optimization. This leaves it available
to kernel drivers like i915. VFIO will call enforce_cache_coherency()
before establishing any mappings and the domain should then permanently
block no-snoop.

If enforce_cache_coherency() fails VFIO will communicate back through to
KVM into the arch code via kvm_arch_register_noncoherent_dma()
(only implemented by x86) which triggers a working wbinvd to be made
available to the VM.

While other arches are certainly welcome to implement
enforce_cache_coherency(), it is not clear there is any benefit in doing
so.

After this series there are only two calls left to iommu_capable() with a
bus argument which should help Robin's work here.

This is on github: https://github.com/jgunthorpe/linux/commits/intel_no_snoop

Cc: "Tian, Kevin" <kevin.tian@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Jason Gunthorpe (5):
  iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with
    dev_is_dma_coherent()
  vfio: Require that devices support DMA cache coherence
  iommu: Introduce the domain op enforce_cache_coherency()
  vfio: Move the Intel no-snoop control off of IOMMU_CACHE
  iommu: Delete IOMMU_CAP_CACHE_COHERENCY

 drivers/infiniband/hw/usnic/usnic_uiom.c    | 16 +++++------
 drivers/iommu/amd/iommu.c                   |  9 +++++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  2 --
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |  6 -----
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     |  6 -----
 drivers/iommu/fsl_pamu_domain.c             |  6 -----
 drivers/iommu/intel/iommu.c                 | 15 ++++++++---
 drivers/iommu/s390-iommu.c                  |  2 --
 drivers/vfio/vfio.c                         |  6 +++++
 drivers/vfio/vfio_iommu_type1.c             | 30 +++++++++++++--------
 drivers/vhost/vdpa.c                        |  3 ++-
 include/linux/intel-iommu.h                 |  1 +
 include/linux/iommu.h                       |  6 +++--
 13 files changed, 58 insertions(+), 50 deletions(-)


base-commit: 3123109284176b1532874591f7c81f3837bbdc17
-- 
2.35.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* [PATCH 0/5] Make the iommu driver no-snoop block feature consistent
@ 2022-04-05 16:15 ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:15 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented
by a platform as bypassing elements in the DMA coherent CPU cache
hierarchy. A driver can command a device to set this bit on some of its
transactions as a micro-optimization.

However, the driver is now responsible to synchronize the CPU cache with
the DMA that bypassed it. On x86 this is done through the wbinvd
instruction, and the i915 GPU driver is the only Linux DMA driver that
calls it.

The problem comes that KVM on x86 will normally disable the wbinvd
instruction in the guest and render it a NOP. As the driver running in the
guest is not aware the wbinvd doesn't work it may still cause the device
to set the no-snoop bit and the platform will bypass the CPU cache.
Without a working wbinvd there is no way to re-synchronize the CPU cache
and the driver in the VM has data corruption.

Thus, we see a general direction on x86 that the IOMMU HW is able to block
the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to
to NOP the wbinvd without causing any data corruption.

This control for Intel IOMMU was exposed by using IOMMU_CACHE and
IOMMU_CAP_CACHE_COHERENCY, however these two values now have multiple
meanings and usages beyond blocking no-snoop and the whole thing has
become confused.

Change it so that:
 - IOMMU_CACHE is only about the DMA coherence of normal DMAs from a
   device. It is used by the DMA API and set when the DMA API will not be
   doing manual cache coherency operations.

 - dev_is_dma_coherent() indicates if IOMMU_CACHE can be used with the
   device

 - The new optional domain op enforce_cache_coherency() will cause the
   entire domain to block no-snoop requests - ie there is no way for any
   device attached to the domain to opt out of the IOMMU_CACHE behavior.

An iommu driver should implement enforce_cache_coherency() so that by
default domains allow the no-snoop optimization. This leaves it available
to kernel drivers like i915. VFIO will call enforce_cache_coherency()
before establishing any mappings and the domain should then permanently
block no-snoop.

If enforce_cache_coherency() fails VFIO will communicate back through to
KVM into the arch code via kvm_arch_register_noncoherent_dma()
(only implemented by x86) which triggers a working wbinvd to be made
available to the VM.

While other arches are certainly welcome to implement
enforce_cache_coherency(), it is not clear there is any benefit in doing
so.

After this series there are only two calls left to iommu_capable() with a
bus argument which should help Robin's work here.

This is on github: https://github.com/jgunthorpe/linux/commits/intel_no_snoop

Cc: "Tian, Kevin" <kevin.tian@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Jason Gunthorpe (5):
  iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with
    dev_is_dma_coherent()
  vfio: Require that devices support DMA cache coherence
  iommu: Introduce the domain op enforce_cache_coherency()
  vfio: Move the Intel no-snoop control off of IOMMU_CACHE
  iommu: Delete IOMMU_CAP_CACHE_COHERENCY

 drivers/infiniband/hw/usnic/usnic_uiom.c    | 16 +++++------
 drivers/iommu/amd/iommu.c                   |  9 +++++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  2 --
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |  6 -----
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     |  6 -----
 drivers/iommu/fsl_pamu_domain.c             |  6 -----
 drivers/iommu/intel/iommu.c                 | 15 ++++++++---
 drivers/iommu/s390-iommu.c                  |  2 --
 drivers/vfio/vfio.c                         |  6 +++++
 drivers/vfio/vfio_iommu_type1.c             | 30 +++++++++++++--------
 drivers/vhost/vdpa.c                        |  3 ++-
 include/linux/intel-iommu.h                 |  1 +
 include/linux/iommu.h                       |  6 +++--
 13 files changed, 58 insertions(+), 50 deletions(-)


base-commit: 3123109284176b1532874591f7c81f3837bbdc17
-- 
2.35.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* [PATCH 0/5] Make the iommu driver no-snoop block feature consistent
@ 2022-04-05 16:15 ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:15 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented
by a platform as bypassing elements in the DMA coherent CPU cache
hierarchy. A driver can command a device to set this bit on some of its
transactions as a micro-optimization.

However, the driver is now responsible to synchronize the CPU cache with
the DMA that bypassed it. On x86 this is done through the wbinvd
instruction, and the i915 GPU driver is the only Linux DMA driver that
calls it.

The problem comes that KVM on x86 will normally disable the wbinvd
instruction in the guest and render it a NOP. As the driver running in the
guest is not aware the wbinvd doesn't work it may still cause the device
to set the no-snoop bit and the platform will bypass the CPU cache.
Without a working wbinvd there is no way to re-synchronize the CPU cache
and the driver in the VM has data corruption.

Thus, we see a general direction on x86 that the IOMMU HW is able to block
the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to
to NOP the wbinvd without causing any data corruption.

This control for Intel IOMMU was exposed by using IOMMU_CACHE and
IOMMU_CAP_CACHE_COHERENCY, however these two values now have multiple
meanings and usages beyond blocking no-snoop and the whole thing has
become confused.

Change it so that:
 - IOMMU_CACHE is only about the DMA coherence of normal DMAs from a
   device. It is used by the DMA API and set when the DMA API will not be
   doing manual cache coherency operations.

 - dev_is_dma_coherent() indicates if IOMMU_CACHE can be used with the
   device

 - The new optional domain op enforce_cache_coherency() will cause the
   entire domain to block no-snoop requests - ie there is no way for any
   device attached to the domain to opt out of the IOMMU_CACHE behavior.

An iommu driver should implement enforce_cache_coherency() so that by
default domains allow the no-snoop optimization. This leaves it available
to kernel drivers like i915. VFIO will call enforce_cache_coherency()
before establishing any mappings and the domain should then permanently
block no-snoop.

If enforce_cache_coherency() fails VFIO will communicate back through to
KVM into the arch code via kvm_arch_register_noncoherent_dma()
(only implemented by x86) which triggers a working wbinvd to be made
available to the VM.

While other arches are certainly welcome to implement
enforce_cache_coherency(), it is not clear there is any benefit in doing
so.

After this series there are only two calls left to iommu_capable() with a
bus argument which should help Robin's work here.

This is on github: https://github.com/jgunthorpe/linux/commits/intel_no_snoop

Cc: "Tian, Kevin" <kevin.tian@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Jason Gunthorpe (5):
  iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with
    dev_is_dma_coherent()
  vfio: Require that devices support DMA cache coherence
  iommu: Introduce the domain op enforce_cache_coherency()
  vfio: Move the Intel no-snoop control off of IOMMU_CACHE
  iommu: Delete IOMMU_CAP_CACHE_COHERENCY

 drivers/infiniband/hw/usnic/usnic_uiom.c    | 16 +++++------
 drivers/iommu/amd/iommu.c                   |  9 +++++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  2 --
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |  6 -----
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     |  6 -----
 drivers/iommu/fsl_pamu_domain.c             |  6 -----
 drivers/iommu/intel/iommu.c                 | 15 ++++++++---
 drivers/iommu/s390-iommu.c                  |  2 --
 drivers/vfio/vfio.c                         |  6 +++++
 drivers/vfio/vfio_iommu_type1.c             | 30 +++++++++++++--------
 drivers/vhost/vdpa.c                        |  3 ++-
 include/linux/intel-iommu.h                 |  1 +
 include/linux/iommu.h                       |  6 +++--
 13 files changed, 58 insertions(+), 50 deletions(-)


base-commit: 3123109284176b1532874591f7c81f3837bbdc17
-- 
2.35.1


^ permalink raw reply	[flat|nested] 135+ messages in thread

* [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-05 16:15 ` Jason Gunthorpe
  (?)
@ 2022-04-05 16:16   ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Tian, Kevin, Christoph Hellwig

vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
way to do this is via dev_is_dma_coherent() like the DMA API does. If
IOMMU_CACHE is not supported then these drivers won't work as they don't
call any coherency-restoring routines around their DMAs.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/usnic/usnic_uiom.c | 16 +++++++---------
 drivers/vhost/vdpa.c                     |  3 ++-
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 760b254ba42d6b..24d118198ac756 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -42,6 +42,7 @@
 #include <linux/list.h>
 #include <linux/pci.h>
 #include <rdma/ib_verbs.h>
+#include <linux/dma-map-ops.h>
 
 #include "usnic_log.h"
 #include "usnic_uiom.h"
@@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
 	struct usnic_uiom_dev *uiom_dev;
 	int err;
 
+	if (!dev_is_dma_coherent(dev)) {
+		usnic_err("IOMMU of %s does not support cache coherency\n",
+				dev_name(dev));
+		return -EINVAL;
+	}
+
 	uiom_dev = kzalloc(sizeof(*uiom_dev), GFP_ATOMIC);
 	if (!uiom_dev)
 		return -ENOMEM;
@@ -483,13 +490,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
 	if (err)
 		goto out_free_dev;
 
-	if (!iommu_capable(dev->bus, IOMMU_CAP_CACHE_COHERENCY)) {
-		usnic_err("IOMMU of %s does not support cache coherency\n",
-				dev_name(dev));
-		err = -EINVAL;
-		goto out_detach_device;
-	}
-
 	spin_lock(&pd->lock);
 	list_add_tail(&uiom_dev->link, &pd->devs);
 	pd->dev_cnt++;
@@ -497,8 +497,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
 
 	return 0;
 
-out_detach_device:
-	iommu_detach_device(pd->domain, dev);
 out_free_dev:
 	kfree(uiom_dev);
 	return err;
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 4c2f0bd062856a..05ea5800febc37 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -22,6 +22,7 @@
 #include <linux/vdpa.h>
 #include <linux/nospec.h>
 #include <linux/vhost.h>
+#include <linux/dma-map-ops.h>
 
 #include "vhost.h"
 
@@ -929,7 +930,7 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
 	if (!bus)
 		return -EFAULT;
 
-	if (!iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
+	if (!dev_is_dma_coherent(dma_dev))
 		return -ENOTSUPP;
 
 	v->domain = iommu_domain_alloc(bus);
-- 
2.35.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-05 16:16   ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
way to do this is via dev_is_dma_coherent() like the DMA API does. If
IOMMU_CACHE is not supported then these drivers won't work as they don't
call any coherency-restoring routines around their DMAs.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/usnic/usnic_uiom.c | 16 +++++++---------
 drivers/vhost/vdpa.c                     |  3 ++-
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 760b254ba42d6b..24d118198ac756 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -42,6 +42,7 @@
 #include <linux/list.h>
 #include <linux/pci.h>
 #include <rdma/ib_verbs.h>
+#include <linux/dma-map-ops.h>
 
 #include "usnic_log.h"
 #include "usnic_uiom.h"
@@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
 	struct usnic_uiom_dev *uiom_dev;
 	int err;
 
+	if (!dev_is_dma_coherent(dev)) {
+		usnic_err("IOMMU of %s does not support cache coherency\n",
+				dev_name(dev));
+		return -EINVAL;
+	}
+
 	uiom_dev = kzalloc(sizeof(*uiom_dev), GFP_ATOMIC);
 	if (!uiom_dev)
 		return -ENOMEM;
@@ -483,13 +490,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
 	if (err)
 		goto out_free_dev;
 
-	if (!iommu_capable(dev->bus, IOMMU_CAP_CACHE_COHERENCY)) {
-		usnic_err("IOMMU of %s does not support cache coherency\n",
-				dev_name(dev));
-		err = -EINVAL;
-		goto out_detach_device;
-	}
-
 	spin_lock(&pd->lock);
 	list_add_tail(&uiom_dev->link, &pd->devs);
 	pd->dev_cnt++;
@@ -497,8 +497,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
 
 	return 0;
 
-out_detach_device:
-	iommu_detach_device(pd->domain, dev);
 out_free_dev:
 	kfree(uiom_dev);
 	return err;
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 4c2f0bd062856a..05ea5800febc37 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -22,6 +22,7 @@
 #include <linux/vdpa.h>
 #include <linux/nospec.h>
 #include <linux/vhost.h>
+#include <linux/dma-map-ops.h>
 
 #include "vhost.h"
 
@@ -929,7 +930,7 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
 	if (!bus)
 		return -EFAULT;
 
-	if (!iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
+	if (!dev_is_dma_coherent(dma_dev))
 		return -ENOTSUPP;
 
 	v->domain = iommu_domain_alloc(bus);
-- 
2.35.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-05 16:16   ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
way to do this is via dev_is_dma_coherent() like the DMA API does. If
IOMMU_CACHE is not supported then these drivers won't work as they don't
call any coherency-restoring routines around their DMAs.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/usnic/usnic_uiom.c | 16 +++++++---------
 drivers/vhost/vdpa.c                     |  3 ++-
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 760b254ba42d6b..24d118198ac756 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -42,6 +42,7 @@
 #include <linux/list.h>
 #include <linux/pci.h>
 #include <rdma/ib_verbs.h>
+#include <linux/dma-map-ops.h>
 
 #include "usnic_log.h"
 #include "usnic_uiom.h"
@@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
 	struct usnic_uiom_dev *uiom_dev;
 	int err;
 
+	if (!dev_is_dma_coherent(dev)) {
+		usnic_err("IOMMU of %s does not support cache coherency\n",
+				dev_name(dev));
+		return -EINVAL;
+	}
+
 	uiom_dev = kzalloc(sizeof(*uiom_dev), GFP_ATOMIC);
 	if (!uiom_dev)
 		return -ENOMEM;
@@ -483,13 +490,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
 	if (err)
 		goto out_free_dev;
 
-	if (!iommu_capable(dev->bus, IOMMU_CAP_CACHE_COHERENCY)) {
-		usnic_err("IOMMU of %s does not support cache coherency\n",
-				dev_name(dev));
-		err = -EINVAL;
-		goto out_detach_device;
-	}
-
 	spin_lock(&pd->lock);
 	list_add_tail(&uiom_dev->link, &pd->devs);
 	pd->dev_cnt++;
@@ -497,8 +497,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
 
 	return 0;
 
-out_detach_device:
-	iommu_detach_device(pd->domain, dev);
 out_free_dev:
 	kfree(uiom_dev);
 	return err;
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 4c2f0bd062856a..05ea5800febc37 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -22,6 +22,7 @@
 #include <linux/vdpa.h>
 #include <linux/nospec.h>
 #include <linux/vhost.h>
+#include <linux/dma-map-ops.h>
 
 #include "vhost.h"
 
@@ -929,7 +930,7 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
 	if (!bus)
 		return -EFAULT;
 
-	if (!iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
+	if (!dev_is_dma_coherent(dma_dev))
 		return -ENOTSUPP;
 
 	v->domain = iommu_domain_alloc(bus);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 2/5] vfio: Require that devices support DMA cache coherence
  2022-04-05 16:15 ` Jason Gunthorpe
  (?)
@ 2022-04-05 16:16   ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Tian, Kevin, Christoph Hellwig

dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can be
supported.

IOMMU_CACHE means that normal DMAs do not require any additional coherency
mechanism and is the basic uAPI that VFIO exposes to userspace. For
instance VFIO applications like DPDK will not work if additional coherency
operations are required.

Therefore check dev_is_dma_coherent() before allowing a device to join a
domain. This will block device/platform/iommu combinations from using VFIO
that do not support cache coherent DMA.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/vfio.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index a4555014bd1e72..2a3aa3e742d943 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -32,6 +32,7 @@
 #include <linux/vfio.h>
 #include <linux/wait.h>
 #include <linux/sched/signal.h>
+#include <linux/dma-map-ops.h>
 #include "vfio.h"
 
 #define DRIVER_VERSION	"0.3"
@@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	if (IS_ERR(device))
 		return PTR_ERR(device);
 
+	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device->dev)) {
+		ret = -ENODEV;
+		goto err_device_put;
+	}
+
 	if (!try_module_get(device->dev->driver->owner)) {
 		ret = -ENODEV;
 		goto err_device_put;
-- 
2.35.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-05 16:16   ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can be
supported.

IOMMU_CACHE means that normal DMAs do not require any additional coherency
mechanism and is the basic uAPI that VFIO exposes to userspace. For
instance VFIO applications like DPDK will not work if additional coherency
operations are required.

Therefore check dev_is_dma_coherent() before allowing a device to join a
domain. This will block device/platform/iommu combinations from using VFIO
that do not support cache coherent DMA.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/vfio.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index a4555014bd1e72..2a3aa3e742d943 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -32,6 +32,7 @@
 #include <linux/vfio.h>
 #include <linux/wait.h>
 #include <linux/sched/signal.h>
+#include <linux/dma-map-ops.h>
 #include "vfio.h"
 
 #define DRIVER_VERSION	"0.3"
@@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	if (IS_ERR(device))
 		return PTR_ERR(device);
 
+	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device->dev)) {
+		ret = -ENODEV;
+		goto err_device_put;
+	}
+
 	if (!try_module_get(device->dev->driver->owner)) {
 		ret = -ENODEV;
 		goto err_device_put;
-- 
2.35.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-05 16:16   ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can be
supported.

IOMMU_CACHE means that normal DMAs do not require any additional coherency
mechanism and is the basic uAPI that VFIO exposes to userspace. For
instance VFIO applications like DPDK will not work if additional coherency
operations are required.

Therefore check dev_is_dma_coherent() before allowing a device to join a
domain. This will block device/platform/iommu combinations from using VFIO
that do not support cache coherent DMA.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/vfio.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index a4555014bd1e72..2a3aa3e742d943 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -32,6 +32,7 @@
 #include <linux/vfio.h>
 #include <linux/wait.h>
 #include <linux/sched/signal.h>
+#include <linux/dma-map-ops.h>
 #include "vfio.h"
 
 #define DRIVER_VERSION	"0.3"
@@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	if (IS_ERR(device))
 		return PTR_ERR(device);
 
+	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device->dev)) {
+		ret = -ENODEV;
+		goto err_device_put;
+	}
+
 	if (!try_module_get(device->dev->driver->owner)) {
 		ret = -ENODEV;
 		goto err_device_put;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
  2022-04-05 16:15 ` Jason Gunthorpe
  (?)
@ 2022-04-05 16:16   ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Tian, Kevin, Christoph Hellwig

This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.

Currently only Intel and AMD IOMMUs are known to support this
feature. They both implement it as an IOPTE bit, that when set, will cause
PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
the no-snoop bit was clear.

The new API is triggered by calling enforce_cache_coherency() before
mapping any IOVA to the domain which globally switches on no-snoop
blocking. This allows other implementations that might block no-snoop
globally and outside the IOPTE - AMD also documents such an HW capability.

Leave AMD out of sync with Intel and have it block no-snoop even for
in-kernel users. This can be trivially resolved in a follow up patch.

Only VFIO will call this new API.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c   |  7 +++++++
 drivers/iommu/intel/iommu.c | 14 +++++++++++++-
 include/linux/intel-iommu.h |  1 +
 include/linux/iommu.h       |  4 ++++
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a1ada7bff44e61..e500b487eb3429 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2271,6 +2271,12 @@ static int amd_iommu_def_domain_type(struct device *dev)
 	return 0;
 }
 
+static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
+{
+	/* IOMMU_PTE_FC is always set */
+	return true;
+}
+
 const struct iommu_ops amd_iommu_ops = {
 	.capable = amd_iommu_capable,
 	.domain_alloc = amd_iommu_domain_alloc,
@@ -2293,6 +2299,7 @@ const struct iommu_ops amd_iommu_ops = {
 		.flush_iotlb_all = amd_iommu_flush_iotlb_all,
 		.iotlb_sync	= amd_iommu_iotlb_sync,
 		.free		= amd_iommu_domain_free,
+		.enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
 	}
 };
 
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index df5c62ecf942b8..f08611a6cc4799 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4422,7 +4422,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
 		prot |= DMA_PTE_READ;
 	if (iommu_prot & IOMMU_WRITE)
 		prot |= DMA_PTE_WRITE;
-	if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
+	if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
+	    dmar_domain->enforce_no_snoop)
 		prot |= DMA_PTE_SNP;
 
 	max_addr = iova + size;
@@ -4545,6 +4546,16 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
 	return phys;
 }
 
+static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+
+	if (!dmar_domain->iommu_snooping)
+		return false;
+	dmar_domain->enforce_no_snoop = true;
+	return true;
+}
+
 static bool intel_iommu_capable(enum iommu_cap cap)
 {
 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
@@ -4898,6 +4909,7 @@ const struct iommu_ops intel_iommu_ops = {
 		.iotlb_sync		= intel_iommu_tlb_sync,
 		.iova_to_phys		= intel_iommu_iova_to_phys,
 		.free			= intel_iommu_domain_free,
+		.enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
 	}
 };
 
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 2f9891cb3d0014..1f930c0c225d94 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -540,6 +540,7 @@ struct dmar_domain {
 	u8 has_iotlb_device: 1;
 	u8 iommu_coherency: 1;		/* indicate coherency of iommu access */
 	u8 iommu_snooping: 1;		/* indicate snooping control feature */
+	u8 enforce_no_snoop : 1;        /* Create IOPTEs with snoop control */
 
 	struct list_head devices;	/* all devices' list */
 	struct iova_domain iovad;	/* iova's that belong to this domain */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 9208eca4b0d1ac..fe4f24c469c373 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -272,6 +272,9 @@ struct iommu_ops {
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *            queue
  * @iova_to_phys: translate iova to physical address
+ * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
+ *                           including no-snoop TLPs on PCIe or other platform
+ *                           specific mechanisms.
  * @enable_nesting: Enable nesting
  * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
  * @free: Release the domain after use.
@@ -300,6 +303,7 @@ struct iommu_domain_ops {
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
 				    dma_addr_t iova);
 
+	bool (*enforce_cache_coherency)(struct iommu_domain *domain);
 	int (*enable_nesting)(struct iommu_domain *domain);
 	int (*set_pgtable_quirks)(struct iommu_domain *domain,
 				  unsigned long quirks);
-- 
2.35.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-05 16:16   ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.

Currently only Intel and AMD IOMMUs are known to support this
feature. They both implement it as an IOPTE bit, that when set, will cause
PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
the no-snoop bit was clear.

The new API is triggered by calling enforce_cache_coherency() before
mapping any IOVA to the domain which globally switches on no-snoop
blocking. This allows other implementations that might block no-snoop
globally and outside the IOPTE - AMD also documents such an HW capability.

Leave AMD out of sync with Intel and have it block no-snoop even for
in-kernel users. This can be trivially resolved in a follow up patch.

Only VFIO will call this new API.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c   |  7 +++++++
 drivers/iommu/intel/iommu.c | 14 +++++++++++++-
 include/linux/intel-iommu.h |  1 +
 include/linux/iommu.h       |  4 ++++
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a1ada7bff44e61..e500b487eb3429 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2271,6 +2271,12 @@ static int amd_iommu_def_domain_type(struct device *dev)
 	return 0;
 }
 
+static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
+{
+	/* IOMMU_PTE_FC is always set */
+	return true;
+}
+
 const struct iommu_ops amd_iommu_ops = {
 	.capable = amd_iommu_capable,
 	.domain_alloc = amd_iommu_domain_alloc,
@@ -2293,6 +2299,7 @@ const struct iommu_ops amd_iommu_ops = {
 		.flush_iotlb_all = amd_iommu_flush_iotlb_all,
 		.iotlb_sync	= amd_iommu_iotlb_sync,
 		.free		= amd_iommu_domain_free,
+		.enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
 	}
 };
 
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index df5c62ecf942b8..f08611a6cc4799 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4422,7 +4422,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
 		prot |= DMA_PTE_READ;
 	if (iommu_prot & IOMMU_WRITE)
 		prot |= DMA_PTE_WRITE;
-	if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
+	if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
+	    dmar_domain->enforce_no_snoop)
 		prot |= DMA_PTE_SNP;
 
 	max_addr = iova + size;
@@ -4545,6 +4546,16 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
 	return phys;
 }
 
+static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+
+	if (!dmar_domain->iommu_snooping)
+		return false;
+	dmar_domain->enforce_no_snoop = true;
+	return true;
+}
+
 static bool intel_iommu_capable(enum iommu_cap cap)
 {
 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
@@ -4898,6 +4909,7 @@ const struct iommu_ops intel_iommu_ops = {
 		.iotlb_sync		= intel_iommu_tlb_sync,
 		.iova_to_phys		= intel_iommu_iova_to_phys,
 		.free			= intel_iommu_domain_free,
+		.enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
 	}
 };
 
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 2f9891cb3d0014..1f930c0c225d94 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -540,6 +540,7 @@ struct dmar_domain {
 	u8 has_iotlb_device: 1;
 	u8 iommu_coherency: 1;		/* indicate coherency of iommu access */
 	u8 iommu_snooping: 1;		/* indicate snooping control feature */
+	u8 enforce_no_snoop : 1;        /* Create IOPTEs with snoop control */
 
 	struct list_head devices;	/* all devices' list */
 	struct iova_domain iovad;	/* iova's that belong to this domain */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 9208eca4b0d1ac..fe4f24c469c373 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -272,6 +272,9 @@ struct iommu_ops {
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *            queue
  * @iova_to_phys: translate iova to physical address
+ * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
+ *                           including no-snoop TLPs on PCIe or other platform
+ *                           specific mechanisms.
  * @enable_nesting: Enable nesting
  * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
  * @free: Release the domain after use.
@@ -300,6 +303,7 @@ struct iommu_domain_ops {
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
 				    dma_addr_t iova);
 
+	bool (*enforce_cache_coherency)(struct iommu_domain *domain);
 	int (*enable_nesting)(struct iommu_domain *domain);
 	int (*set_pgtable_quirks)(struct iommu_domain *domain,
 				  unsigned long quirks);
-- 
2.35.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-05 16:16   ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.

Currently only Intel and AMD IOMMUs are known to support this
feature. They both implement it as an IOPTE bit, that when set, will cause
PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
the no-snoop bit was clear.

The new API is triggered by calling enforce_cache_coherency() before
mapping any IOVA to the domain which globally switches on no-snoop
blocking. This allows other implementations that might block no-snoop
globally and outside the IOPTE - AMD also documents such an HW capability.

Leave AMD out of sync with Intel and have it block no-snoop even for
in-kernel users. This can be trivially resolved in a follow up patch.

Only VFIO will call this new API.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c   |  7 +++++++
 drivers/iommu/intel/iommu.c | 14 +++++++++++++-
 include/linux/intel-iommu.h |  1 +
 include/linux/iommu.h       |  4 ++++
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a1ada7bff44e61..e500b487eb3429 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2271,6 +2271,12 @@ static int amd_iommu_def_domain_type(struct device *dev)
 	return 0;
 }
 
+static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
+{
+	/* IOMMU_PTE_FC is always set */
+	return true;
+}
+
 const struct iommu_ops amd_iommu_ops = {
 	.capable = amd_iommu_capable,
 	.domain_alloc = amd_iommu_domain_alloc,
@@ -2293,6 +2299,7 @@ const struct iommu_ops amd_iommu_ops = {
 		.flush_iotlb_all = amd_iommu_flush_iotlb_all,
 		.iotlb_sync	= amd_iommu_iotlb_sync,
 		.free		= amd_iommu_domain_free,
+		.enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
 	}
 };
 
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index df5c62ecf942b8..f08611a6cc4799 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4422,7 +4422,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
 		prot |= DMA_PTE_READ;
 	if (iommu_prot & IOMMU_WRITE)
 		prot |= DMA_PTE_WRITE;
-	if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
+	if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
+	    dmar_domain->enforce_no_snoop)
 		prot |= DMA_PTE_SNP;
 
 	max_addr = iova + size;
@@ -4545,6 +4546,16 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
 	return phys;
 }
 
+static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+
+	if (!dmar_domain->iommu_snooping)
+		return false;
+	dmar_domain->enforce_no_snoop = true;
+	return true;
+}
+
 static bool intel_iommu_capable(enum iommu_cap cap)
 {
 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
@@ -4898,6 +4909,7 @@ const struct iommu_ops intel_iommu_ops = {
 		.iotlb_sync		= intel_iommu_tlb_sync,
 		.iova_to_phys		= intel_iommu_iova_to_phys,
 		.free			= intel_iommu_domain_free,
+		.enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
 	}
 };
 
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 2f9891cb3d0014..1f930c0c225d94 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -540,6 +540,7 @@ struct dmar_domain {
 	u8 has_iotlb_device: 1;
 	u8 iommu_coherency: 1;		/* indicate coherency of iommu access */
 	u8 iommu_snooping: 1;		/* indicate snooping control feature */
+	u8 enforce_no_snoop : 1;        /* Create IOPTEs with snoop control */
 
 	struct list_head devices;	/* all devices' list */
 	struct iova_domain iovad;	/* iova's that belong to this domain */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 9208eca4b0d1ac..fe4f24c469c373 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -272,6 +272,9 @@ struct iommu_ops {
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *            queue
  * @iova_to_phys: translate iova to physical address
+ * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
+ *                           including no-snoop TLPs on PCIe or other platform
+ *                           specific mechanisms.
  * @enable_nesting: Enable nesting
  * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
  * @free: Release the domain after use.
@@ -300,6 +303,7 @@ struct iommu_domain_ops {
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
 				    dma_addr_t iova);
 
+	bool (*enforce_cache_coherency)(struct iommu_domain *domain);
 	int (*enable_nesting)(struct iommu_domain *domain);
 	int (*set_pgtable_quirks)(struct iommu_domain *domain,
 				  unsigned long quirks);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 4/5] vfio: Move the Intel no-snoop control off of IOMMU_CACHE
  2022-04-05 16:15 ` Jason Gunthorpe
  (?)
@ 2022-04-05 16:16   ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Tian, Kevin, Christoph Hellwig

IOMMU_CACHE means "normal DMA to this iommu_domain's IOVA should be cache
coherent" and is used by the DMA API. The definition allows for special
non-coherent DMA to exist - ie processing of the no-snoop flag in PCIe
TLPs - so long as this behavior is opt-in by the device driver.

The flag is used by the DMA API and is always available if
dev_is_dma_coherent() is set.

For Intel IOMMU IOMMU_CACHE was redefined to mean 'force all DMA to be
cache coherent' which has the practical effect of causing the IOMMU to
ignore the no-snoop bit in a PCIe TLP.

x86 platforms are always IOMMU_CACHE, so Intel should ignore this flag.

Instead use the new domain op enforce_cache_coherency() which causes every
IOPTE created in the domain to have the no-snoop blocking behavior.

Reconfigure VFIO to always use IOMMU_CACHE and call
enforce_cache_coherency() to operate the special Intel behavior.

Remove the IOMMU_CACHE test from Intel IOMMU.

Ultimately VFIO plumbs the result of enforce_cache_coherency() back into
the x86 platform code through kvm_arch_register_noncoherent_dma() which
controls if the WBINVD instruction is available in the guest. No other
arch implements kvm_arch_register_noncoherent_dma().

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/intel/iommu.c     |  3 +--
 drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++++++++++-----------
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f08611a6cc4799..0ca43671d934e9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4422,8 +4422,7 @@ static int intel_iommu_map(struct iommu_domain *domain,
 		prot |= DMA_PTE_READ;
 	if (iommu_prot & IOMMU_WRITE)
 		prot |= DMA_PTE_WRITE;
-	if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
-	    dmar_domain->enforce_no_snoop)
+	if (dmar_domain->enforce_no_snoop)
 		prot |= DMA_PTE_SNP;
 
 	max_addr = iova + size;
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 9394aa9444c10c..c13b9290e35759 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -84,8 +84,8 @@ struct vfio_domain {
 	struct iommu_domain	*domain;
 	struct list_head	next;
 	struct list_head	group_list;
-	int			prot;		/* IOMMU_CACHE */
-	bool			fgsp;		/* Fine-grained super pages */
+	bool			fgsp : 1;	/* Fine-grained super pages */
+	bool			enforce_cache_coherency : 1;
 };
 
 struct vfio_dma {
@@ -1461,7 +1461,7 @@ static int vfio_iommu_map(struct vfio_iommu *iommu, dma_addr_t iova,
 
 	list_for_each_entry(d, &iommu->domain_list, next) {
 		ret = iommu_map(d->domain, iova, (phys_addr_t)pfn << PAGE_SHIFT,
-				npage << PAGE_SHIFT, prot | d->prot);
+				npage << PAGE_SHIFT, prot | IOMMU_CACHE);
 		if (ret)
 			goto unwind;
 
@@ -1771,7 +1771,7 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
 			}
 
 			ret = iommu_map(domain->domain, iova, phys,
-					size, dma->prot | domain->prot);
+					size, dma->prot | IOMMU_CACHE);
 			if (ret) {
 				if (!dma->iommu_mapped) {
 					vfio_unpin_pages_remote(dma, iova,
@@ -1859,7 +1859,7 @@ static void vfio_test_domain_fgsp(struct vfio_domain *domain)
 		return;
 
 	ret = iommu_map(domain->domain, 0, page_to_phys(pages), PAGE_SIZE * 2,
-			IOMMU_READ | IOMMU_WRITE | domain->prot);
+			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
 	if (!ret) {
 		size_t unmapped = iommu_unmap(domain->domain, 0, PAGE_SIZE);
 
@@ -2267,8 +2267,15 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_detach;
 	}
 
-	if (iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
-		domain->prot |= IOMMU_CACHE;
+	/*
+	 * If the IOMMU can block non-coherent operations (ie PCIe TLPs with
+	 * no-snoop set) then VFIO always turns this feature on because on Intel
+	 * platforms it optimizes KVM to disable wbinvd emulation.
+	 */
+	if (domain->domain->ops->enforce_cache_coherency)
+		domain->enforce_cache_coherency =
+			domain->domain->ops->enforce_cache_coherency(
+				domain->domain);
 
 	/*
 	 * Try to match an existing compatible domain.  We don't want to
@@ -2279,7 +2286,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	 */
 	list_for_each_entry(d, &iommu->domain_list, next) {
 		if (d->domain->ops == domain->domain->ops &&
-		    d->prot == domain->prot) {
+		    d->enforce_cache_coherency ==
+			    domain->enforce_cache_coherency) {
 			iommu_detach_group(domain->domain, group->iommu_group);
 			if (!iommu_attach_group(d->domain,
 						group->iommu_group)) {
@@ -2611,14 +2619,14 @@ static void vfio_iommu_type1_release(void *iommu_data)
 	kfree(iommu);
 }
 
-static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
+static int vfio_domains_have_enforce_cache_coherency(struct vfio_iommu *iommu)
 {
 	struct vfio_domain *domain;
 	int ret = 1;
 
 	mutex_lock(&iommu->lock);
 	list_for_each_entry(domain, &iommu->domain_list, next) {
-		if (!(domain->prot & IOMMU_CACHE)) {
+		if (!(domain->enforce_cache_coherency)) {
 			ret = 0;
 			break;
 		}
@@ -2641,7 +2649,7 @@ static int vfio_iommu_type1_check_extension(struct vfio_iommu *iommu,
 	case VFIO_DMA_CC_IOMMU:
 		if (!iommu)
 			return 0;
-		return vfio_domains_have_iommu_cache(iommu);
+		return vfio_domains_have_enforce_cache_coherency(iommu);
 	default:
 		return 0;
 	}
-- 
2.35.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 4/5] vfio: Move the Intel no-snoop control off of IOMMU_CACHE
@ 2022-04-05 16:16   ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

IOMMU_CACHE means "normal DMA to this iommu_domain's IOVA should be cache
coherent" and is used by the DMA API. The definition allows for special
non-coherent DMA to exist - ie processing of the no-snoop flag in PCIe
TLPs - so long as this behavior is opt-in by the device driver.

The flag is used by the DMA API and is always available if
dev_is_dma_coherent() is set.

For Intel IOMMU IOMMU_CACHE was redefined to mean 'force all DMA to be
cache coherent' which has the practical effect of causing the IOMMU to
ignore the no-snoop bit in a PCIe TLP.

x86 platforms are always IOMMU_CACHE, so Intel should ignore this flag.

Instead use the new domain op enforce_cache_coherency() which causes every
IOPTE created in the domain to have the no-snoop blocking behavior.

Reconfigure VFIO to always use IOMMU_CACHE and call
enforce_cache_coherency() to operate the special Intel behavior.

Remove the IOMMU_CACHE test from Intel IOMMU.

Ultimately VFIO plumbs the result of enforce_cache_coherency() back into
the x86 platform code through kvm_arch_register_noncoherent_dma() which
controls if the WBINVD instruction is available in the guest. No other
arch implements kvm_arch_register_noncoherent_dma().

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/intel/iommu.c     |  3 +--
 drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++++++++++-----------
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f08611a6cc4799..0ca43671d934e9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4422,8 +4422,7 @@ static int intel_iommu_map(struct iommu_domain *domain,
 		prot |= DMA_PTE_READ;
 	if (iommu_prot & IOMMU_WRITE)
 		prot |= DMA_PTE_WRITE;
-	if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
-	    dmar_domain->enforce_no_snoop)
+	if (dmar_domain->enforce_no_snoop)
 		prot |= DMA_PTE_SNP;
 
 	max_addr = iova + size;
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 9394aa9444c10c..c13b9290e35759 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -84,8 +84,8 @@ struct vfio_domain {
 	struct iommu_domain	*domain;
 	struct list_head	next;
 	struct list_head	group_list;
-	int			prot;		/* IOMMU_CACHE */
-	bool			fgsp;		/* Fine-grained super pages */
+	bool			fgsp : 1;	/* Fine-grained super pages */
+	bool			enforce_cache_coherency : 1;
 };
 
 struct vfio_dma {
@@ -1461,7 +1461,7 @@ static int vfio_iommu_map(struct vfio_iommu *iommu, dma_addr_t iova,
 
 	list_for_each_entry(d, &iommu->domain_list, next) {
 		ret = iommu_map(d->domain, iova, (phys_addr_t)pfn << PAGE_SHIFT,
-				npage << PAGE_SHIFT, prot | d->prot);
+				npage << PAGE_SHIFT, prot | IOMMU_CACHE);
 		if (ret)
 			goto unwind;
 
@@ -1771,7 +1771,7 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
 			}
 
 			ret = iommu_map(domain->domain, iova, phys,
-					size, dma->prot | domain->prot);
+					size, dma->prot | IOMMU_CACHE);
 			if (ret) {
 				if (!dma->iommu_mapped) {
 					vfio_unpin_pages_remote(dma, iova,
@@ -1859,7 +1859,7 @@ static void vfio_test_domain_fgsp(struct vfio_domain *domain)
 		return;
 
 	ret = iommu_map(domain->domain, 0, page_to_phys(pages), PAGE_SIZE * 2,
-			IOMMU_READ | IOMMU_WRITE | domain->prot);
+			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
 	if (!ret) {
 		size_t unmapped = iommu_unmap(domain->domain, 0, PAGE_SIZE);
 
@@ -2267,8 +2267,15 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_detach;
 	}
 
-	if (iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
-		domain->prot |= IOMMU_CACHE;
+	/*
+	 * If the IOMMU can block non-coherent operations (ie PCIe TLPs with
+	 * no-snoop set) then VFIO always turns this feature on because on Intel
+	 * platforms it optimizes KVM to disable wbinvd emulation.
+	 */
+	if (domain->domain->ops->enforce_cache_coherency)
+		domain->enforce_cache_coherency =
+			domain->domain->ops->enforce_cache_coherency(
+				domain->domain);
 
 	/*
 	 * Try to match an existing compatible domain.  We don't want to
@@ -2279,7 +2286,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	 */
 	list_for_each_entry(d, &iommu->domain_list, next) {
 		if (d->domain->ops == domain->domain->ops &&
-		    d->prot == domain->prot) {
+		    d->enforce_cache_coherency ==
+			    domain->enforce_cache_coherency) {
 			iommu_detach_group(domain->domain, group->iommu_group);
 			if (!iommu_attach_group(d->domain,
 						group->iommu_group)) {
@@ -2611,14 +2619,14 @@ static void vfio_iommu_type1_release(void *iommu_data)
 	kfree(iommu);
 }
 
-static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
+static int vfio_domains_have_enforce_cache_coherency(struct vfio_iommu *iommu)
 {
 	struct vfio_domain *domain;
 	int ret = 1;
 
 	mutex_lock(&iommu->lock);
 	list_for_each_entry(domain, &iommu->domain_list, next) {
-		if (!(domain->prot & IOMMU_CACHE)) {
+		if (!(domain->enforce_cache_coherency)) {
 			ret = 0;
 			break;
 		}
@@ -2641,7 +2649,7 @@ static int vfio_iommu_type1_check_extension(struct vfio_iommu *iommu,
 	case VFIO_DMA_CC_IOMMU:
 		if (!iommu)
 			return 0;
-		return vfio_domains_have_iommu_cache(iommu);
+		return vfio_domains_have_enforce_cache_coherency(iommu);
 	default:
 		return 0;
 	}
-- 
2.35.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 4/5] vfio: Move the Intel no-snoop control off of IOMMU_CACHE
@ 2022-04-05 16:16   ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

IOMMU_CACHE means "normal DMA to this iommu_domain's IOVA should be cache
coherent" and is used by the DMA API. The definition allows for special
non-coherent DMA to exist - ie processing of the no-snoop flag in PCIe
TLPs - so long as this behavior is opt-in by the device driver.

The flag is used by the DMA API and is always available if
dev_is_dma_coherent() is set.

For Intel IOMMU IOMMU_CACHE was redefined to mean 'force all DMA to be
cache coherent' which has the practical effect of causing the IOMMU to
ignore the no-snoop bit in a PCIe TLP.

x86 platforms are always IOMMU_CACHE, so Intel should ignore this flag.

Instead use the new domain op enforce_cache_coherency() which causes every
IOPTE created in the domain to have the no-snoop blocking behavior.

Reconfigure VFIO to always use IOMMU_CACHE and call
enforce_cache_coherency() to operate the special Intel behavior.

Remove the IOMMU_CACHE test from Intel IOMMU.

Ultimately VFIO plumbs the result of enforce_cache_coherency() back into
the x86 platform code through kvm_arch_register_noncoherent_dma() which
controls if the WBINVD instruction is available in the guest. No other
arch implements kvm_arch_register_noncoherent_dma().

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/intel/iommu.c     |  3 +--
 drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++++++++++-----------
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f08611a6cc4799..0ca43671d934e9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4422,8 +4422,7 @@ static int intel_iommu_map(struct iommu_domain *domain,
 		prot |= DMA_PTE_READ;
 	if (iommu_prot & IOMMU_WRITE)
 		prot |= DMA_PTE_WRITE;
-	if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
-	    dmar_domain->enforce_no_snoop)
+	if (dmar_domain->enforce_no_snoop)
 		prot |= DMA_PTE_SNP;
 
 	max_addr = iova + size;
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 9394aa9444c10c..c13b9290e35759 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -84,8 +84,8 @@ struct vfio_domain {
 	struct iommu_domain	*domain;
 	struct list_head	next;
 	struct list_head	group_list;
-	int			prot;		/* IOMMU_CACHE */
-	bool			fgsp;		/* Fine-grained super pages */
+	bool			fgsp : 1;	/* Fine-grained super pages */
+	bool			enforce_cache_coherency : 1;
 };
 
 struct vfio_dma {
@@ -1461,7 +1461,7 @@ static int vfio_iommu_map(struct vfio_iommu *iommu, dma_addr_t iova,
 
 	list_for_each_entry(d, &iommu->domain_list, next) {
 		ret = iommu_map(d->domain, iova, (phys_addr_t)pfn << PAGE_SHIFT,
-				npage << PAGE_SHIFT, prot | d->prot);
+				npage << PAGE_SHIFT, prot | IOMMU_CACHE);
 		if (ret)
 			goto unwind;
 
@@ -1771,7 +1771,7 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
 			}
 
 			ret = iommu_map(domain->domain, iova, phys,
-					size, dma->prot | domain->prot);
+					size, dma->prot | IOMMU_CACHE);
 			if (ret) {
 				if (!dma->iommu_mapped) {
 					vfio_unpin_pages_remote(dma, iova,
@@ -1859,7 +1859,7 @@ static void vfio_test_domain_fgsp(struct vfio_domain *domain)
 		return;
 
 	ret = iommu_map(domain->domain, 0, page_to_phys(pages), PAGE_SIZE * 2,
-			IOMMU_READ | IOMMU_WRITE | domain->prot);
+			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
 	if (!ret) {
 		size_t unmapped = iommu_unmap(domain->domain, 0, PAGE_SIZE);
 
@@ -2267,8 +2267,15 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		goto out_detach;
 	}
 
-	if (iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
-		domain->prot |= IOMMU_CACHE;
+	/*
+	 * If the IOMMU can block non-coherent operations (ie PCIe TLPs with
+	 * no-snoop set) then VFIO always turns this feature on because on Intel
+	 * platforms it optimizes KVM to disable wbinvd emulation.
+	 */
+	if (domain->domain->ops->enforce_cache_coherency)
+		domain->enforce_cache_coherency =
+			domain->domain->ops->enforce_cache_coherency(
+				domain->domain);
 
 	/*
 	 * Try to match an existing compatible domain.  We don't want to
@@ -2279,7 +2286,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	 */
 	list_for_each_entry(d, &iommu->domain_list, next) {
 		if (d->domain->ops == domain->domain->ops &&
-		    d->prot == domain->prot) {
+		    d->enforce_cache_coherency ==
+			    domain->enforce_cache_coherency) {
 			iommu_detach_group(domain->domain, group->iommu_group);
 			if (!iommu_attach_group(d->domain,
 						group->iommu_group)) {
@@ -2611,14 +2619,14 @@ static void vfio_iommu_type1_release(void *iommu_data)
 	kfree(iommu);
 }
 
-static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
+static int vfio_domains_have_enforce_cache_coherency(struct vfio_iommu *iommu)
 {
 	struct vfio_domain *domain;
 	int ret = 1;
 
 	mutex_lock(&iommu->lock);
 	list_for_each_entry(domain, &iommu->domain_list, next) {
-		if (!(domain->prot & IOMMU_CACHE)) {
+		if (!(domain->enforce_cache_coherency)) {
 			ret = 0;
 			break;
 		}
@@ -2641,7 +2649,7 @@ static int vfio_iommu_type1_check_extension(struct vfio_iommu *iommu,
 	case VFIO_DMA_CC_IOMMU:
 		if (!iommu)
 			return 0;
-		return vfio_domains_have_iommu_cache(iommu);
+		return vfio_domains_have_enforce_cache_coherency(iommu);
 	default:
 		return 0;
 	}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 5/5] iommu: Delete IOMMU_CAP_CACHE_COHERENCY
  2022-04-05 16:15 ` Jason Gunthorpe
  (?)
@ 2022-04-05 16:16   ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Tian, Kevin, Christoph Hellwig

Nothing reads this value anymore, remove it.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c                   | 2 --
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 --
 drivers/iommu/arm/arm-smmu/arm-smmu.c       | 6 ------
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     | 6 ------
 drivers/iommu/fsl_pamu_domain.c             | 6 ------
 drivers/iommu/intel/iommu.c                 | 2 --
 drivers/iommu/s390-iommu.c                  | 2 --
 include/linux/iommu.h                       | 2 --
 8 files changed, 28 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index e500b487eb3429..f144eb9fea8e31 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2156,8 +2156,6 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 static bool amd_iommu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		return true;
 	case IOMMU_CAP_INTR_REMAP:
 		return (irq_remapping_enabled == 1);
 	case IOMMU_CAP_NOEXEC:
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 627a3ed5ee8fd1..c4115015a633a7 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1986,8 +1986,6 @@ static const struct iommu_flush_ops arm_smmu_flush_ops = {
 static bool arm_smmu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		return true;
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 568cce590ccc13..166bf45ac3d444 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1333,12 +1333,6 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 static bool arm_smmu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		/*
-		 * Return true here as the SMMU can always send out coherent
-		 * requests.
-		 */
-		return true;
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 4c077c38fbd64f..fc72465fc63d50 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -496,12 +496,6 @@ static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
 static bool qcom_iommu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		/*
-		 * Return true here as the SMMU can always send out coherent
-		 * requests.
-		 */
-		return true;
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index 69a4a62dc3b9c7..9f80285dab2951 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -177,11 +177,6 @@ static phys_addr_t fsl_pamu_iova_to_phys(struct iommu_domain *domain,
 	return iova;
 }
 
-static bool fsl_pamu_capable(enum iommu_cap cap)
-{
-	return cap == IOMMU_CAP_CACHE_COHERENCY;
-}
-
 static void fsl_pamu_domain_free(struct iommu_domain *domain)
 {
 	struct fsl_dma_domain *dma_domain = to_fsl_dma_domain(domain);
@@ -451,7 +446,6 @@ static void fsl_pamu_release_device(struct device *dev)
 }
 
 static const struct iommu_ops fsl_pamu_ops = {
-	.capable	= fsl_pamu_capable,
 	.domain_alloc	= fsl_pamu_domain_alloc,
 	.probe_device	= fsl_pamu_probe_device,
 	.release_device	= fsl_pamu_release_device,
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 0ca43671d934e9..e5062461ab0640 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4557,8 +4557,6 @@ static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
 
 static bool intel_iommu_capable(enum iommu_cap cap)
 {
-	if (cap == IOMMU_CAP_CACHE_COHERENCY)
-		return domain_update_iommu_snooping(NULL);
 	if (cap == IOMMU_CAP_INTR_REMAP)
 		return irq_remapping_enabled == 1;
 
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index 3833e86c6e7b8a..3dbf9663246552 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -42,8 +42,6 @@ static struct s390_domain *to_s390_domain(struct iommu_domain *dom)
 static bool s390_iommu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		return true;
 	case IOMMU_CAP_INTR_REMAP:
 		return true;
 	default:
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fe4f24c469c373..aebb73bda19797 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -103,8 +103,6 @@ static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
 }
 
 enum iommu_cap {
-	IOMMU_CAP_CACHE_COHERENCY,	/* IOMMU can enforce cache coherent DMA
-					   transactions */
 	IOMMU_CAP_INTR_REMAP,		/* IOMMU supports interrupt isolation */
 	IOMMU_CAP_NOEXEC,		/* IOMMU_NOEXEC flag */
 };
-- 
2.35.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 5/5] iommu: Delete IOMMU_CAP_CACHE_COHERENCY
@ 2022-04-05 16:16   ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

Nothing reads this value anymore, remove it.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c                   | 2 --
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 --
 drivers/iommu/arm/arm-smmu/arm-smmu.c       | 6 ------
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     | 6 ------
 drivers/iommu/fsl_pamu_domain.c             | 6 ------
 drivers/iommu/intel/iommu.c                 | 2 --
 drivers/iommu/s390-iommu.c                  | 2 --
 include/linux/iommu.h                       | 2 --
 8 files changed, 28 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index e500b487eb3429..f144eb9fea8e31 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2156,8 +2156,6 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 static bool amd_iommu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		return true;
 	case IOMMU_CAP_INTR_REMAP:
 		return (irq_remapping_enabled == 1);
 	case IOMMU_CAP_NOEXEC:
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 627a3ed5ee8fd1..c4115015a633a7 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1986,8 +1986,6 @@ static const struct iommu_flush_ops arm_smmu_flush_ops = {
 static bool arm_smmu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		return true;
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 568cce590ccc13..166bf45ac3d444 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1333,12 +1333,6 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 static bool arm_smmu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		/*
-		 * Return true here as the SMMU can always send out coherent
-		 * requests.
-		 */
-		return true;
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 4c077c38fbd64f..fc72465fc63d50 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -496,12 +496,6 @@ static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
 static bool qcom_iommu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		/*
-		 * Return true here as the SMMU can always send out coherent
-		 * requests.
-		 */
-		return true;
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index 69a4a62dc3b9c7..9f80285dab2951 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -177,11 +177,6 @@ static phys_addr_t fsl_pamu_iova_to_phys(struct iommu_domain *domain,
 	return iova;
 }
 
-static bool fsl_pamu_capable(enum iommu_cap cap)
-{
-	return cap == IOMMU_CAP_CACHE_COHERENCY;
-}
-
 static void fsl_pamu_domain_free(struct iommu_domain *domain)
 {
 	struct fsl_dma_domain *dma_domain = to_fsl_dma_domain(domain);
@@ -451,7 +446,6 @@ static void fsl_pamu_release_device(struct device *dev)
 }
 
 static const struct iommu_ops fsl_pamu_ops = {
-	.capable	= fsl_pamu_capable,
 	.domain_alloc	= fsl_pamu_domain_alloc,
 	.probe_device	= fsl_pamu_probe_device,
 	.release_device	= fsl_pamu_release_device,
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 0ca43671d934e9..e5062461ab0640 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4557,8 +4557,6 @@ static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
 
 static bool intel_iommu_capable(enum iommu_cap cap)
 {
-	if (cap == IOMMU_CAP_CACHE_COHERENCY)
-		return domain_update_iommu_snooping(NULL);
 	if (cap == IOMMU_CAP_INTR_REMAP)
 		return irq_remapping_enabled == 1;
 
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index 3833e86c6e7b8a..3dbf9663246552 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -42,8 +42,6 @@ static struct s390_domain *to_s390_domain(struct iommu_domain *dom)
 static bool s390_iommu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		return true;
 	case IOMMU_CAP_INTR_REMAP:
 		return true;
 	default:
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fe4f24c469c373..aebb73bda19797 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -103,8 +103,6 @@ static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
 }
 
 enum iommu_cap {
-	IOMMU_CAP_CACHE_COHERENCY,	/* IOMMU can enforce cache coherent DMA
-					   transactions */
 	IOMMU_CAP_INTR_REMAP,		/* IOMMU supports interrupt isolation */
 	IOMMU_CAP_NOEXEC,		/* IOMMU_NOEXEC flag */
 };
-- 
2.35.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* [PATCH 5/5] iommu: Delete IOMMU_CAP_CACHE_COHERENCY
@ 2022-04-05 16:16   ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 16:16 UTC (permalink / raw)
  To: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

Nothing reads this value anymore, remove it.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c                   | 2 --
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 --
 drivers/iommu/arm/arm-smmu/arm-smmu.c       | 6 ------
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     | 6 ------
 drivers/iommu/fsl_pamu_domain.c             | 6 ------
 drivers/iommu/intel/iommu.c                 | 2 --
 drivers/iommu/s390-iommu.c                  | 2 --
 include/linux/iommu.h                       | 2 --
 8 files changed, 28 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index e500b487eb3429..f144eb9fea8e31 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2156,8 +2156,6 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 static bool amd_iommu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		return true;
 	case IOMMU_CAP_INTR_REMAP:
 		return (irq_remapping_enabled == 1);
 	case IOMMU_CAP_NOEXEC:
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 627a3ed5ee8fd1..c4115015a633a7 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1986,8 +1986,6 @@ static const struct iommu_flush_ops arm_smmu_flush_ops = {
 static bool arm_smmu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		return true;
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 568cce590ccc13..166bf45ac3d444 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1333,12 +1333,6 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 static bool arm_smmu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		/*
-		 * Return true here as the SMMU can always send out coherent
-		 * requests.
-		 */
-		return true;
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 4c077c38fbd64f..fc72465fc63d50 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -496,12 +496,6 @@ static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
 static bool qcom_iommu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		/*
-		 * Return true here as the SMMU can always send out coherent
-		 * requests.
-		 */
-		return true;
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index 69a4a62dc3b9c7..9f80285dab2951 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -177,11 +177,6 @@ static phys_addr_t fsl_pamu_iova_to_phys(struct iommu_domain *domain,
 	return iova;
 }
 
-static bool fsl_pamu_capable(enum iommu_cap cap)
-{
-	return cap == IOMMU_CAP_CACHE_COHERENCY;
-}
-
 static void fsl_pamu_domain_free(struct iommu_domain *domain)
 {
 	struct fsl_dma_domain *dma_domain = to_fsl_dma_domain(domain);
@@ -451,7 +446,6 @@ static void fsl_pamu_release_device(struct device *dev)
 }
 
 static const struct iommu_ops fsl_pamu_ops = {
-	.capable	= fsl_pamu_capable,
 	.domain_alloc	= fsl_pamu_domain_alloc,
 	.probe_device	= fsl_pamu_probe_device,
 	.release_device	= fsl_pamu_release_device,
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 0ca43671d934e9..e5062461ab0640 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4557,8 +4557,6 @@ static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
 
 static bool intel_iommu_capable(enum iommu_cap cap)
 {
-	if (cap == IOMMU_CAP_CACHE_COHERENCY)
-		return domain_update_iommu_snooping(NULL);
 	if (cap == IOMMU_CAP_INTR_REMAP)
 		return irq_remapping_enabled == 1;
 
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index 3833e86c6e7b8a..3dbf9663246552 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -42,8 +42,6 @@ static struct s390_domain *to_s390_domain(struct iommu_domain *dom)
 static bool s390_iommu_capable(enum iommu_cap cap)
 {
 	switch (cap) {
-	case IOMMU_CAP_CACHE_COHERENCY:
-		return true;
 	case IOMMU_CAP_INTR_REMAP:
 		return true;
 	default:
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fe4f24c469c373..aebb73bda19797 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -103,8 +103,6 @@ static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
 }
 
 enum iommu_cap {
-	IOMMU_CAP_CACHE_COHERENCY,	/* IOMMU can enforce cache coherent DMA
-					   transactions */
 	IOMMU_CAP_INTR_REMAP,		/* IOMMU supports interrupt isolation */
 	IOMMU_CAP_NOEXEC,		/* IOMMU_NOEXEC flag */
 };
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 135+ messages in thread

* Re: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
  2022-04-05 16:16   ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-05 19:10     ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-05 19:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	iommu, Gerald Schaefer, David Woodhouse, Tian, Kevin,
	linux-arm-msm, linux-arm-kernel, netdev, Cornelia Huck,
	Christian Benvenuti, Robin Murphy

On Tue,  5 Apr 2022 13:16:01 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can be
> supported.
> 
> IOMMU_CACHE means that normal DMAs do not require any additional coherency
> mechanism and is the basic uAPI that VFIO exposes to userspace. For
> instance VFIO applications like DPDK will not work if additional coherency
> operations are required.
> 
> Therefore check dev_is_dma_coherent() before allowing a device to join a
> domain. This will block device/platform/iommu combinations from using VFIO
> that do not support cache coherent DMA.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/vfio.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index a4555014bd1e72..2a3aa3e742d943 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -32,6 +32,7 @@
>  #include <linux/vfio.h>
>  #include <linux/wait.h>
>  #include <linux/sched/signal.h>
> +#include <linux/dma-map-ops.h>
>  #include "vfio.h"
>  
>  #define DRIVER_VERSION	"0.3"
> @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
>  	if (IS_ERR(device))
>  		return PTR_ERR(device);
>  
> +	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device->dev)) {
> +		ret = -ENODEV;
> +		goto err_device_put;
> +	}
> +

Failing at the point where the user is trying to gain access to the
device seems a little late in the process and opaque, wouldn't we
rather have vfio bus drivers fail to probe such devices?  I'd expect
this to occur in the vfio_register_group_dev() path.  Thanks,

Alex

>  	if (!try_module_get(device->dev->driver->owner)) {
>  		ret = -ENODEV;
>  		goto err_device_put;

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-05 19:10     ` Alex Williamson
  0 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-05 19:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, iommu, Gerald Schaefer,
	David Woodhouse, linux-arm-msm, linux-arm-kernel, netdev,
	Cornelia Huck, Rob Clark, Suravee Suthikulpanit,
	Christian Benvenuti, Robin Murphy, Lu Baolu

On Tue,  5 Apr 2022 13:16:01 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can be
> supported.
> 
> IOMMU_CACHE means that normal DMAs do not require any additional coherency
> mechanism and is the basic uAPI that VFIO exposes to userspace. For
> instance VFIO applications like DPDK will not work if additional coherency
> operations are required.
> 
> Therefore check dev_is_dma_coherent() before allowing a device to join a
> domain. This will block device/platform/iommu combinations from using VFIO
> that do not support cache coherent DMA.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/vfio.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index a4555014bd1e72..2a3aa3e742d943 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -32,6 +32,7 @@
>  #include <linux/vfio.h>
>  #include <linux/wait.h>
>  #include <linux/sched/signal.h>
> +#include <linux/dma-map-ops.h>
>  #include "vfio.h"
>  
>  #define DRIVER_VERSION	"0.3"
> @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
>  	if (IS_ERR(device))
>  		return PTR_ERR(device);
>  
> +	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device->dev)) {
> +		ret = -ENODEV;
> +		goto err_device_put;
> +	}
> +

Failing at the point where the user is trying to gain access to the
device seems a little late in the process and opaque, wouldn't we
rather have vfio bus drivers fail to probe such devices?  I'd expect
this to occur in the vfio_register_group_dev() path.  Thanks,

Alex

>  	if (!try_module_get(device->dev->driver->owner)) {
>  		ret = -ENODEV;
>  		goto err_device_put;

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-05 19:10     ` Alex Williamson
  0 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-05 19:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lu Baolu, Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig, Tian, Kevin

On Tue,  5 Apr 2022 13:16:01 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can be
> supported.
> 
> IOMMU_CACHE means that normal DMAs do not require any additional coherency
> mechanism and is the basic uAPI that VFIO exposes to userspace. For
> instance VFIO applications like DPDK will not work if additional coherency
> operations are required.
> 
> Therefore check dev_is_dma_coherent() before allowing a device to join a
> domain. This will block device/platform/iommu combinations from using VFIO
> that do not support cache coherent DMA.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/vfio.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index a4555014bd1e72..2a3aa3e742d943 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -32,6 +32,7 @@
>  #include <linux/vfio.h>
>  #include <linux/wait.h>
>  #include <linux/sched/signal.h>
> +#include <linux/dma-map-ops.h>
>  #include "vfio.h"
>  
>  #define DRIVER_VERSION	"0.3"
> @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
>  	if (IS_ERR(device))
>  		return PTR_ERR(device);
>  
> +	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device->dev)) {
> +		ret = -ENODEV;
> +		goto err_device_put;
> +	}
> +

Failing at the point where the user is trying to gain access to the
device seems a little late in the process and opaque, wouldn't we
rather have vfio bus drivers fail to probe such devices?  I'd expect
this to occur in the vfio_register_group_dev() path.  Thanks,

Alex

>  	if (!try_module_get(device->dev->driver->owner)) {
>  		ret = -ENODEV;
>  		goto err_device_put;


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-05 19:10     ` Alex Williamson
  0 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-05 19:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lu Baolu, Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig, Tian, Kevin

On Tue,  5 Apr 2022 13:16:01 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can be
> supported.
> 
> IOMMU_CACHE means that normal DMAs do not require any additional coherency
> mechanism and is the basic uAPI that VFIO exposes to userspace. For
> instance VFIO applications like DPDK will not work if additional coherency
> operations are required.
> 
> Therefore check dev_is_dma_coherent() before allowing a device to join a
> domain. This will block device/platform/iommu combinations from using VFIO
> that do not support cache coherent DMA.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/vfio.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index a4555014bd1e72..2a3aa3e742d943 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -32,6 +32,7 @@
>  #include <linux/vfio.h>
>  #include <linux/wait.h>
>  #include <linux/sched/signal.h>
> +#include <linux/dma-map-ops.h>
>  #include "vfio.h"
>  
>  #define DRIVER_VERSION	"0.3"
> @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
>  	if (IS_ERR(device))
>  		return PTR_ERR(device);
>  
> +	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device->dev)) {
> +		ret = -ENODEV;
> +		goto err_device_put;
> +	}
> +

Failing at the point where the user is trying to gain access to the
device seems a little late in the process and opaque, wouldn't we
rather have vfio bus drivers fail to probe such devices?  I'd expect
this to occur in the vfio_register_group_dev() path.  Thanks,

Alex

>  	if (!try_module_get(device->dev->driver->owner)) {
>  		ret = -ENODEV;
>  		goto err_device_put;


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
  2022-04-05 19:10     ` Alex Williamson
  (?)
@ 2022-04-05 19:29       ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-05 19:29 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	iommu, Gerald Schaefer, David Woodhouse, Tian, Kevin,
	linux-arm-msm, linux-arm-kernel, netdev, Cornelia Huck,
	Christian Benvenuti, Robin Murphy

On Tue, Apr 05, 2022 at 01:10:44PM -0600, Alex Williamson wrote:
> On Tue,  5 Apr 2022 13:16:01 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can be
> > supported.
> > 
> > IOMMU_CACHE means that normal DMAs do not require any additional coherency
> > mechanism and is the basic uAPI that VFIO exposes to userspace. For
> > instance VFIO applications like DPDK will not work if additional coherency
> > operations are required.
> > 
> > Therefore check dev_is_dma_coherent() before allowing a device to join a
> > domain. This will block device/platform/iommu combinations from using VFIO
> > that do not support cache coherent DMA.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> >  drivers/vfio/vfio.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > index a4555014bd1e72..2a3aa3e742d943 100644
> > +++ b/drivers/vfio/vfio.c
> > @@ -32,6 +32,7 @@
> >  #include <linux/vfio.h>
> >  #include <linux/wait.h>
> >  #include <linux/sched/signal.h>
> > +#include <linux/dma-map-ops.h>
> >  #include "vfio.h"
> >  
> >  #define DRIVER_VERSION	"0.3"
> > @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
> >  	if (IS_ERR(device))
> >  		return PTR_ERR(device);
> >  
> > +	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device->dev)) {
> > +		ret = -ENODEV;
> > +		goto err_device_put;
> > +	}
> > +
> 
> Failing at the point where the user is trying to gain access to the
> device seems a little late in the process and opaque, wouldn't we
> rather have vfio bus drivers fail to probe such devices?  I'd expect
> this to occur in the vfio_register_group_dev() path.  Thanks,

Yes, that is a good point.

So like this:

 int vfio_register_group_dev(struct vfio_device *device)
 {
+       if (!dev_is_dma_coherent(device->dev))
+               return -EINVAL;
+
        return __vfio_register_dev(device,
                vfio_group_find_or_alloc(device->dev));
 }

I fixed it up.

Thanks,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-05 19:29       ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 19:29 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Lu Baolu, Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig, Tian, Kevin

On Tue, Apr 05, 2022 at 01:10:44PM -0600, Alex Williamson wrote:
> On Tue,  5 Apr 2022 13:16:01 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can be
> > supported.
> > 
> > IOMMU_CACHE means that normal DMAs do not require any additional coherency
> > mechanism and is the basic uAPI that VFIO exposes to userspace. For
> > instance VFIO applications like DPDK will not work if additional coherency
> > operations are required.
> > 
> > Therefore check dev_is_dma_coherent() before allowing a device to join a
> > domain. This will block device/platform/iommu combinations from using VFIO
> > that do not support cache coherent DMA.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> >  drivers/vfio/vfio.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > index a4555014bd1e72..2a3aa3e742d943 100644
> > +++ b/drivers/vfio/vfio.c
> > @@ -32,6 +32,7 @@
> >  #include <linux/vfio.h>
> >  #include <linux/wait.h>
> >  #include <linux/sched/signal.h>
> > +#include <linux/dma-map-ops.h>
> >  #include "vfio.h"
> >  
> >  #define DRIVER_VERSION	"0.3"
> > @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
> >  	if (IS_ERR(device))
> >  		return PTR_ERR(device);
> >  
> > +	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device->dev)) {
> > +		ret = -ENODEV;
> > +		goto err_device_put;
> > +	}
> > +
> 
> Failing at the point where the user is trying to gain access to the
> device seems a little late in the process and opaque, wouldn't we
> rather have vfio bus drivers fail to probe such devices?  I'd expect
> this to occur in the vfio_register_group_dev() path.  Thanks,

Yes, that is a good point.

So like this:

 int vfio_register_group_dev(struct vfio_device *device)
 {
+       if (!dev_is_dma_coherent(device->dev))
+               return -EINVAL;
+
        return __vfio_register_dev(device,
                vfio_group_find_or_alloc(device->dev));
 }

I fixed it up.

Thanks,
Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-05 19:29       ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 19:29 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Lu Baolu, Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig, Tian, Kevin

On Tue, Apr 05, 2022 at 01:10:44PM -0600, Alex Williamson wrote:
> On Tue,  5 Apr 2022 13:16:01 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can be
> > supported.
> > 
> > IOMMU_CACHE means that normal DMAs do not require any additional coherency
> > mechanism and is the basic uAPI that VFIO exposes to userspace. For
> > instance VFIO applications like DPDK will not work if additional coherency
> > operations are required.
> > 
> > Therefore check dev_is_dma_coherent() before allowing a device to join a
> > domain. This will block device/platform/iommu combinations from using VFIO
> > that do not support cache coherent DMA.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> >  drivers/vfio/vfio.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > index a4555014bd1e72..2a3aa3e742d943 100644
> > +++ b/drivers/vfio/vfio.c
> > @@ -32,6 +32,7 @@
> >  #include <linux/vfio.h>
> >  #include <linux/wait.h>
> >  #include <linux/sched/signal.h>
> > +#include <linux/dma-map-ops.h>
> >  #include "vfio.h"
> >  
> >  #define DRIVER_VERSION	"0.3"
> > @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
> >  	if (IS_ERR(device))
> >  		return PTR_ERR(device);
> >  
> > +	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device->dev)) {
> > +		ret = -ENODEV;
> > +		goto err_device_put;
> > +	}
> > +
> 
> Failing at the point where the user is trying to gain access to the
> device seems a little late in the process and opaque, wouldn't we
> rather have vfio bus drivers fail to probe such devices?  I'd expect
> this to occur in the vfio_register_group_dev() path.  Thanks,

Yes, that is a good point.

So like this:

 int vfio_register_group_dev(struct vfio_device *device)
 {
+       if (!dev_is_dma_coherent(device->dev))
+               return -EINVAL;
+
        return __vfio_register_dev(device,
                vfio_group_find_or_alloc(device->dev));
 }

I fixed it up.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
  2022-04-05 16:16   ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-05 19:50     ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-05 19:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	iommu, Gerald Schaefer, David Woodhouse, Tian, Kevin,
	linux-arm-msm, linux-arm-kernel, netdev, Cornelia Huck,
	Christian Benvenuti, Robin Murphy

On Tue,  5 Apr 2022 13:16:02 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> 
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will cause
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> the no-snoop bit was clear.
> 
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW capability.
> 
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
> 
> Only VFIO will call this new API.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/amd/iommu.c   |  7 +++++++
>  drivers/iommu/intel/iommu.c | 14 +++++++++++++-
>  include/linux/intel-iommu.h |  1 +
>  include/linux/iommu.h       |  4 ++++
>  4 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a1ada7bff44e61..e500b487eb3429 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2271,6 +2271,12 @@ static int amd_iommu_def_domain_type(struct device *dev)
>  	return 0;
>  }
>  
> +static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> +	/* IOMMU_PTE_FC is always set */
> +	return true;
> +}
> +
>  const struct iommu_ops amd_iommu_ops = {
>  	.capable = amd_iommu_capable,
>  	.domain_alloc = amd_iommu_domain_alloc,
> @@ -2293,6 +2299,7 @@ const struct iommu_ops amd_iommu_ops = {
>  		.flush_iotlb_all = amd_iommu_flush_iotlb_all,
>  		.iotlb_sync	= amd_iommu_iotlb_sync,
>  		.free		= amd_iommu_domain_free,
> +		.enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
>  	}
>  };
>  
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index df5c62ecf942b8..f08611a6cc4799 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4422,7 +4422,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
>  		prot |= DMA_PTE_READ;
>  	if (iommu_prot & IOMMU_WRITE)
>  		prot |= DMA_PTE_WRITE;
> -	if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
> +	if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
> +	    dmar_domain->enforce_no_snoop)
>  		prot |= DMA_PTE_SNP;
>  
>  	max_addr = iova + size;
> @@ -4545,6 +4546,16 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
>  	return phys;
>  }
>  
> +static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +
> +	if (!dmar_domain->iommu_snooping)
> +		return false;
> +	dmar_domain->enforce_no_snoop = true;
> +	return true;
> +}

Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
support it, ie. reserved register bit set in pte faults?  It seems
really inconsistent here that I could make a domain that supports
iommu_snooping, set enforce_no_snoop = true, then add another DMAR to
the domain that may not support iommu_snooping, I'd get false on the
subsequent enforcement test, but the dmar_domain is still trying to use
DMA_PTE_SNP.

There's also a disconnect, maybe just in the naming or documentation,
but if I call enforce_cache_coherency for a domain, that seems like the
domain should retain those semantics regardless of how it's modified,
ie. "enforced".  For example, if I tried to perform the above operation,
I should get a failure attaching the device that brings in the less
capable DMAR because the domain has been set to enforce this feature.

If the API is that I need to re-enforce_cache_coherency on every
modification of the domain, shouldn't dmar_domain->enforce_no_snoop
also return to a default value on domain changes?

Maybe this should be something like set_no_snoop_squashing with the
above semantics, it needs to be re-applied whenever the domain:device
composition changes?  Thanks,

Alex

> +
>  static bool intel_iommu_capable(enum iommu_cap cap)
>  {
>  	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> @@ -4898,6 +4909,7 @@ const struct iommu_ops intel_iommu_ops = {
>  		.iotlb_sync		= intel_iommu_tlb_sync,
>  		.iova_to_phys		= intel_iommu_iova_to_phys,
>  		.free			= intel_iommu_domain_free,
> +		.enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
>  	}
>  };
>  
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 2f9891cb3d0014..1f930c0c225d94 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -540,6 +540,7 @@ struct dmar_domain {
>  	u8 has_iotlb_device: 1;
>  	u8 iommu_coherency: 1;		/* indicate coherency of iommu access */
>  	u8 iommu_snooping: 1;		/* indicate snooping control feature */
> +	u8 enforce_no_snoop : 1;        /* Create IOPTEs with snoop control */
>  
>  	struct list_head devices;	/* all devices' list */
>  	struct iova_domain iovad;	/* iova's that belong to this domain */
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 9208eca4b0d1ac..fe4f24c469c373 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -272,6 +272,9 @@ struct iommu_ops {
>   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
>   *            queue
>   * @iova_to_phys: translate iova to physical address
> + * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> + *                           including no-snoop TLPs on PCIe or other platform
> + *                           specific mechanisms.
>   * @enable_nesting: Enable nesting
>   * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
>   * @free: Release the domain after use.
> @@ -300,6 +303,7 @@ struct iommu_domain_ops {
>  	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
>  				    dma_addr_t iova);
>  
> +	bool (*enforce_cache_coherency)(struct iommu_domain *domain);
>  	int (*enable_nesting)(struct iommu_domain *domain);
>  	int (*set_pgtable_quirks)(struct iommu_domain *domain,
>  				  unsigned long quirks);

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-05 19:50     ` Alex Williamson
  0 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-05 19:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, iommu, Gerald Schaefer,
	David Woodhouse, linux-arm-msm, linux-arm-kernel, netdev,
	Cornelia Huck, Rob Clark, Suravee Suthikulpanit,
	Christian Benvenuti, Robin Murphy, Lu Baolu

On Tue,  5 Apr 2022 13:16:02 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> 
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will cause
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> the no-snoop bit was clear.
> 
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW capability.
> 
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
> 
> Only VFIO will call this new API.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/amd/iommu.c   |  7 +++++++
>  drivers/iommu/intel/iommu.c | 14 +++++++++++++-
>  include/linux/intel-iommu.h |  1 +
>  include/linux/iommu.h       |  4 ++++
>  4 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a1ada7bff44e61..e500b487eb3429 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2271,6 +2271,12 @@ static int amd_iommu_def_domain_type(struct device *dev)
>  	return 0;
>  }
>  
> +static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> +	/* IOMMU_PTE_FC is always set */
> +	return true;
> +}
> +
>  const struct iommu_ops amd_iommu_ops = {
>  	.capable = amd_iommu_capable,
>  	.domain_alloc = amd_iommu_domain_alloc,
> @@ -2293,6 +2299,7 @@ const struct iommu_ops amd_iommu_ops = {
>  		.flush_iotlb_all = amd_iommu_flush_iotlb_all,
>  		.iotlb_sync	= amd_iommu_iotlb_sync,
>  		.free		= amd_iommu_domain_free,
> +		.enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
>  	}
>  };
>  
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index df5c62ecf942b8..f08611a6cc4799 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4422,7 +4422,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
>  		prot |= DMA_PTE_READ;
>  	if (iommu_prot & IOMMU_WRITE)
>  		prot |= DMA_PTE_WRITE;
> -	if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
> +	if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
> +	    dmar_domain->enforce_no_snoop)
>  		prot |= DMA_PTE_SNP;
>  
>  	max_addr = iova + size;
> @@ -4545,6 +4546,16 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
>  	return phys;
>  }
>  
> +static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +
> +	if (!dmar_domain->iommu_snooping)
> +		return false;
> +	dmar_domain->enforce_no_snoop = true;
> +	return true;
> +}

Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
support it, ie. reserved register bit set in pte faults?  It seems
really inconsistent here that I could make a domain that supports
iommu_snooping, set enforce_no_snoop = true, then add another DMAR to
the domain that may not support iommu_snooping, I'd get false on the
subsequent enforcement test, but the dmar_domain is still trying to use
DMA_PTE_SNP.

There's also a disconnect, maybe just in the naming or documentation,
but if I call enforce_cache_coherency for a domain, that seems like the
domain should retain those semantics regardless of how it's modified,
ie. "enforced".  For example, if I tried to perform the above operation,
I should get a failure attaching the device that brings in the less
capable DMAR because the domain has been set to enforce this feature.

If the API is that I need to re-enforce_cache_coherency on every
modification of the domain, shouldn't dmar_domain->enforce_no_snoop
also return to a default value on domain changes?

Maybe this should be something like set_no_snoop_squashing with the
above semantics, it needs to be re-applied whenever the domain:device
composition changes?  Thanks,

Alex

> +
>  static bool intel_iommu_capable(enum iommu_cap cap)
>  {
>  	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> @@ -4898,6 +4909,7 @@ const struct iommu_ops intel_iommu_ops = {
>  		.iotlb_sync		= intel_iommu_tlb_sync,
>  		.iova_to_phys		= intel_iommu_iova_to_phys,
>  		.free			= intel_iommu_domain_free,
> +		.enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
>  	}
>  };
>  
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 2f9891cb3d0014..1f930c0c225d94 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -540,6 +540,7 @@ struct dmar_domain {
>  	u8 has_iotlb_device: 1;
>  	u8 iommu_coherency: 1;		/* indicate coherency of iommu access */
>  	u8 iommu_snooping: 1;		/* indicate snooping control feature */
> +	u8 enforce_no_snoop : 1;        /* Create IOPTEs with snoop control */
>  
>  	struct list_head devices;	/* all devices' list */
>  	struct iova_domain iovad;	/* iova's that belong to this domain */
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 9208eca4b0d1ac..fe4f24c469c373 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -272,6 +272,9 @@ struct iommu_ops {
>   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
>   *            queue
>   * @iova_to_phys: translate iova to physical address
> + * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> + *                           including no-snoop TLPs on PCIe or other platform
> + *                           specific mechanisms.
>   * @enable_nesting: Enable nesting
>   * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
>   * @free: Release the domain after use.
> @@ -300,6 +303,7 @@ struct iommu_domain_ops {
>  	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
>  				    dma_addr_t iova);
>  
> +	bool (*enforce_cache_coherency)(struct iommu_domain *domain);
>  	int (*enable_nesting)(struct iommu_domain *domain);
>  	int (*set_pgtable_quirks)(struct iommu_domain *domain,
>  				  unsigned long quirks);

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-05 19:50     ` Alex Williamson
  0 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-05 19:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lu Baolu, Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig, Tian, Kevin

On Tue,  5 Apr 2022 13:16:02 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> 
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will cause
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> the no-snoop bit was clear.
> 
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW capability.
> 
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
> 
> Only VFIO will call this new API.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/amd/iommu.c   |  7 +++++++
>  drivers/iommu/intel/iommu.c | 14 +++++++++++++-
>  include/linux/intel-iommu.h |  1 +
>  include/linux/iommu.h       |  4 ++++
>  4 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a1ada7bff44e61..e500b487eb3429 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2271,6 +2271,12 @@ static int amd_iommu_def_domain_type(struct device *dev)
>  	return 0;
>  }
>  
> +static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> +	/* IOMMU_PTE_FC is always set */
> +	return true;
> +}
> +
>  const struct iommu_ops amd_iommu_ops = {
>  	.capable = amd_iommu_capable,
>  	.domain_alloc = amd_iommu_domain_alloc,
> @@ -2293,6 +2299,7 @@ const struct iommu_ops amd_iommu_ops = {
>  		.flush_iotlb_all = amd_iommu_flush_iotlb_all,
>  		.iotlb_sync	= amd_iommu_iotlb_sync,
>  		.free		= amd_iommu_domain_free,
> +		.enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
>  	}
>  };
>  
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index df5c62ecf942b8..f08611a6cc4799 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4422,7 +4422,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
>  		prot |= DMA_PTE_READ;
>  	if (iommu_prot & IOMMU_WRITE)
>  		prot |= DMA_PTE_WRITE;
> -	if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
> +	if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
> +	    dmar_domain->enforce_no_snoop)
>  		prot |= DMA_PTE_SNP;
>  
>  	max_addr = iova + size;
> @@ -4545,6 +4546,16 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
>  	return phys;
>  }
>  
> +static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +
> +	if (!dmar_domain->iommu_snooping)
> +		return false;
> +	dmar_domain->enforce_no_snoop = true;
> +	return true;
> +}

Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
support it, ie. reserved register bit set in pte faults?  It seems
really inconsistent here that I could make a domain that supports
iommu_snooping, set enforce_no_snoop = true, then add another DMAR to
the domain that may not support iommu_snooping, I'd get false on the
subsequent enforcement test, but the dmar_domain is still trying to use
DMA_PTE_SNP.

There's also a disconnect, maybe just in the naming or documentation,
but if I call enforce_cache_coherency for a domain, that seems like the
domain should retain those semantics regardless of how it's modified,
ie. "enforced".  For example, if I tried to perform the above operation,
I should get a failure attaching the device that brings in the less
capable DMAR because the domain has been set to enforce this feature.

If the API is that I need to re-enforce_cache_coherency on every
modification of the domain, shouldn't dmar_domain->enforce_no_snoop
also return to a default value on domain changes?

Maybe this should be something like set_no_snoop_squashing with the
above semantics, it needs to be re-applied whenever the domain:device
composition changes?  Thanks,

Alex

> +
>  static bool intel_iommu_capable(enum iommu_cap cap)
>  {
>  	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> @@ -4898,6 +4909,7 @@ const struct iommu_ops intel_iommu_ops = {
>  		.iotlb_sync		= intel_iommu_tlb_sync,
>  		.iova_to_phys		= intel_iommu_iova_to_phys,
>  		.free			= intel_iommu_domain_free,
> +		.enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
>  	}
>  };
>  
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 2f9891cb3d0014..1f930c0c225d94 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -540,6 +540,7 @@ struct dmar_domain {
>  	u8 has_iotlb_device: 1;
>  	u8 iommu_coherency: 1;		/* indicate coherency of iommu access */
>  	u8 iommu_snooping: 1;		/* indicate snooping control feature */
> +	u8 enforce_no_snoop : 1;        /* Create IOPTEs with snoop control */
>  
>  	struct list_head devices;	/* all devices' list */
>  	struct iova_domain iovad;	/* iova's that belong to this domain */
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 9208eca4b0d1ac..fe4f24c469c373 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -272,6 +272,9 @@ struct iommu_ops {
>   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
>   *            queue
>   * @iova_to_phys: translate iova to physical address
> + * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> + *                           including no-snoop TLPs on PCIe or other platform
> + *                           specific mechanisms.
>   * @enable_nesting: Enable nesting
>   * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
>   * @free: Release the domain after use.
> @@ -300,6 +303,7 @@ struct iommu_domain_ops {
>  	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
>  				    dma_addr_t iova);
>  
> +	bool (*enforce_cache_coherency)(struct iommu_domain *domain);
>  	int (*enable_nesting)(struct iommu_domain *domain);
>  	int (*set_pgtable_quirks)(struct iommu_domain *domain,
>  				  unsigned long quirks);


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-05 19:50     ` Alex Williamson
  0 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-05 19:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lu Baolu, Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig, Tian, Kevin

On Tue,  5 Apr 2022 13:16:02 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> 
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will cause
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> the no-snoop bit was clear.
> 
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW capability.
> 
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
> 
> Only VFIO will call this new API.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/amd/iommu.c   |  7 +++++++
>  drivers/iommu/intel/iommu.c | 14 +++++++++++++-
>  include/linux/intel-iommu.h |  1 +
>  include/linux/iommu.h       |  4 ++++
>  4 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a1ada7bff44e61..e500b487eb3429 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2271,6 +2271,12 @@ static int amd_iommu_def_domain_type(struct device *dev)
>  	return 0;
>  }
>  
> +static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> +	/* IOMMU_PTE_FC is always set */
> +	return true;
> +}
> +
>  const struct iommu_ops amd_iommu_ops = {
>  	.capable = amd_iommu_capable,
>  	.domain_alloc = amd_iommu_domain_alloc,
> @@ -2293,6 +2299,7 @@ const struct iommu_ops amd_iommu_ops = {
>  		.flush_iotlb_all = amd_iommu_flush_iotlb_all,
>  		.iotlb_sync	= amd_iommu_iotlb_sync,
>  		.free		= amd_iommu_domain_free,
> +		.enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
>  	}
>  };
>  
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index df5c62ecf942b8..f08611a6cc4799 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4422,7 +4422,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
>  		prot |= DMA_PTE_READ;
>  	if (iommu_prot & IOMMU_WRITE)
>  		prot |= DMA_PTE_WRITE;
> -	if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
> +	if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
> +	    dmar_domain->enforce_no_snoop)
>  		prot |= DMA_PTE_SNP;
>  
>  	max_addr = iova + size;
> @@ -4545,6 +4546,16 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
>  	return phys;
>  }
>  
> +static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +
> +	if (!dmar_domain->iommu_snooping)
> +		return false;
> +	dmar_domain->enforce_no_snoop = true;
> +	return true;
> +}

Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
support it, ie. reserved register bit set in pte faults?  It seems
really inconsistent here that I could make a domain that supports
iommu_snooping, set enforce_no_snoop = true, then add another DMAR to
the domain that may not support iommu_snooping, I'd get false on the
subsequent enforcement test, but the dmar_domain is still trying to use
DMA_PTE_SNP.

There's also a disconnect, maybe just in the naming or documentation,
but if I call enforce_cache_coherency for a domain, that seems like the
domain should retain those semantics regardless of how it's modified,
ie. "enforced".  For example, if I tried to perform the above operation,
I should get a failure attaching the device that brings in the less
capable DMAR because the domain has been set to enforce this feature.

If the API is that I need to re-enforce_cache_coherency on every
modification of the domain, shouldn't dmar_domain->enforce_no_snoop
also return to a default value on domain changes?

Maybe this should be something like set_no_snoop_squashing with the
above semantics, it needs to be re-applied whenever the domain:device
composition changes?  Thanks,

Alex

> +
>  static bool intel_iommu_capable(enum iommu_cap cap)
>  {
>  	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> @@ -4898,6 +4909,7 @@ const struct iommu_ops intel_iommu_ops = {
>  		.iotlb_sync		= intel_iommu_tlb_sync,
>  		.iova_to_phys		= intel_iommu_iova_to_phys,
>  		.free			= intel_iommu_domain_free,
> +		.enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
>  	}
>  };
>  
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 2f9891cb3d0014..1f930c0c225d94 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -540,6 +540,7 @@ struct dmar_domain {
>  	u8 has_iotlb_device: 1;
>  	u8 iommu_coherency: 1;		/* indicate coherency of iommu access */
>  	u8 iommu_snooping: 1;		/* indicate snooping control feature */
> +	u8 enforce_no_snoop : 1;        /* Create IOPTEs with snoop control */
>  
>  	struct list_head devices;	/* all devices' list */
>  	struct iova_domain iovad;	/* iova's that belong to this domain */
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 9208eca4b0d1ac..fe4f24c469c373 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -272,6 +272,9 @@ struct iommu_ops {
>   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
>   *            queue
>   * @iova_to_phys: translate iova to physical address
> + * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> + *                           including no-snoop TLPs on PCIe or other platform
> + *                           specific mechanisms.
>   * @enable_nesting: Enable nesting
>   * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
>   * @free: Release the domain after use.
> @@ -300,6 +303,7 @@ struct iommu_domain_ops {
>  	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
>  				    dma_addr_t iova);
>  
> +	bool (*enforce_cache_coherency)(struct iommu_domain *domain);
>  	int (*enable_nesting)(struct iommu_domain *domain);
>  	int (*set_pgtable_quirks)(struct iommu_domain *domain,
>  				  unsigned long quirks);


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
  2022-04-05 19:50     ` Alex Williamson
  (?)
@ 2022-04-05 22:57       ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-05 22:57 UTC (permalink / raw)
  To: Alex Williamson, Tian, Kevin, Lu Baolu
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	iommu, Gerald Schaefer, David Woodhouse, linux-arm-msm,
	linux-arm-kernel, netdev, Cornelia Huck, Robin Murphy,
	Christian Benvenuti

On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> >  
> > +static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> > +{
> > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > +
> > +	if (!dmar_domain->iommu_snooping)
> > +		return false;
> > +	dmar_domain->enforce_no_snoop = true;
> > +	return true;
> > +}
> 
> Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> support it, ie. reserved register bit set in pte faults?  

The way the Intel driver is setup that is not possible. Currently it
does:

 static bool intel_iommu_capable(enum iommu_cap cap)
 {
	if (cap == IOMMU_CAP_CACHE_COHERENCY)
		return domain_update_iommu_snooping(NULL);

Which is a global property unrelated to any device.

Thus either all devices and all domains support iommu_snooping, or
none do.

It is unclear because for some reason the driver recalculates this
almost constant value on every device attach..

> There's also a disconnect, maybe just in the naming or documentation,
> but if I call enforce_cache_coherency for a domain, that seems like the
> domain should retain those semantics regardless of how it's
> modified,

Right, this is how I would expect it to work.

> ie. "enforced".  For example, if I tried to perform the above operation,
> I should get a failure attaching the device that brings in the less
> capable DMAR because the domain has been set to enforce this
> feature.

We don't have any code causing a failure like this because no driver
needs it.

> Maybe this should be something like set_no_snoop_squashing with the
> above semantics, it needs to be re-applied whenever the domain:device
> composition changes?  Thanks,

If we get a real driver that needs non-uniformity here we can revisit
what to do. There are a couple of good options depending on exactly
what the HW behavior is.

Is it more clear if I fold in the below? It helps show that the
decision to use DMA_PTE_SNP is a global choice based on
domain_update_iommu_snooping():

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index e5062461ab0640..fc789a9d955645 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -641,7 +641,6 @@ static unsigned long domain_super_pgsize_bitmap(struct dmar_domain *domain)
 static void domain_update_iommu_cap(struct dmar_domain *domain)
 {
 	domain_update_iommu_coherency(domain);
-	domain->iommu_snooping = domain_update_iommu_snooping(NULL);
 	domain->iommu_superpage = domain_update_iommu_superpage(domain, NULL);
 
 	/*
@@ -4283,7 +4282,6 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
 	domain->agaw = width_to_agaw(adjust_width);
 
 	domain->iommu_coherency = false;
-	domain->iommu_snooping = false;
 	domain->iommu_superpage = 0;
 	domain->max_addr = 0;
 
@@ -4549,7 +4547,7 @@ static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
 {
 	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
 
-	if (!dmar_domain->iommu_snooping)
+	if (!domain_update_iommu_snooping(NULL))
 		return false;
 	dmar_domain->enforce_no_snoop = true;
 	return true;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 1f930c0c225d94..bc39f633efdf03 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -539,7 +539,6 @@ struct dmar_domain {
 
 	u8 has_iotlb_device: 1;
 	u8 iommu_coherency: 1;		/* indicate coherency of iommu access */
-	u8 iommu_snooping: 1;		/* indicate snooping control feature */
 	u8 enforce_no_snoop : 1;        /* Create IOPTEs with snoop control */
 
 	struct list_head devices;	/* all devices' list */
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* Re: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-05 22:57       ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 22:57 UTC (permalink / raw)
  To: Alex Williamson, Tian, Kevin, Lu Baolu
  Cc: Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig

On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> >  
> > +static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> > +{
> > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > +
> > +	if (!dmar_domain->iommu_snooping)
> > +		return false;
> > +	dmar_domain->enforce_no_snoop = true;
> > +	return true;
> > +}
> 
> Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> support it, ie. reserved register bit set in pte faults?  

The way the Intel driver is setup that is not possible. Currently it
does:

 static bool intel_iommu_capable(enum iommu_cap cap)
 {
	if (cap == IOMMU_CAP_CACHE_COHERENCY)
		return domain_update_iommu_snooping(NULL);

Which is a global property unrelated to any device.

Thus either all devices and all domains support iommu_snooping, or
none do.

It is unclear because for some reason the driver recalculates this
almost constant value on every device attach..

> There's also a disconnect, maybe just in the naming or documentation,
> but if I call enforce_cache_coherency for a domain, that seems like the
> domain should retain those semantics regardless of how it's
> modified,

Right, this is how I would expect it to work.

> ie. "enforced".  For example, if I tried to perform the above operation,
> I should get a failure attaching the device that brings in the less
> capable DMAR because the domain has been set to enforce this
> feature.

We don't have any code causing a failure like this because no driver
needs it.

> Maybe this should be something like set_no_snoop_squashing with the
> above semantics, it needs to be re-applied whenever the domain:device
> composition changes?  Thanks,

If we get a real driver that needs non-uniformity here we can revisit
what to do. There are a couple of good options depending on exactly
what the HW behavior is.

Is it more clear if I fold in the below? It helps show that the
decision to use DMA_PTE_SNP is a global choice based on
domain_update_iommu_snooping():

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index e5062461ab0640..fc789a9d955645 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -641,7 +641,6 @@ static unsigned long domain_super_pgsize_bitmap(struct dmar_domain *domain)
 static void domain_update_iommu_cap(struct dmar_domain *domain)
 {
 	domain_update_iommu_coherency(domain);
-	domain->iommu_snooping = domain_update_iommu_snooping(NULL);
 	domain->iommu_superpage = domain_update_iommu_superpage(domain, NULL);
 
 	/*
@@ -4283,7 +4282,6 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
 	domain->agaw = width_to_agaw(adjust_width);
 
 	domain->iommu_coherency = false;
-	domain->iommu_snooping = false;
 	domain->iommu_superpage = 0;
 	domain->max_addr = 0;
 
@@ -4549,7 +4547,7 @@ static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
 {
 	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
 
-	if (!dmar_domain->iommu_snooping)
+	if (!domain_update_iommu_snooping(NULL))
 		return false;
 	dmar_domain->enforce_no_snoop = true;
 	return true;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 1f930c0c225d94..bc39f633efdf03 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -539,7 +539,6 @@ struct dmar_domain {
 
 	u8 has_iotlb_device: 1;
 	u8 iommu_coherency: 1;		/* indicate coherency of iommu access */
-	u8 iommu_snooping: 1;		/* indicate snooping control feature */
 	u8 enforce_no_snoop : 1;        /* Create IOPTEs with snoop control */
 
 	struct list_head devices;	/* all devices' list */

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* Re: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-05 22:57       ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-05 22:57 UTC (permalink / raw)
  To: Alex Williamson, Tian, Kevin, Lu Baolu
  Cc: Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig

On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> >  
> > +static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> > +{
> > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > +
> > +	if (!dmar_domain->iommu_snooping)
> > +		return false;
> > +	dmar_domain->enforce_no_snoop = true;
> > +	return true;
> > +}
> 
> Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> support it, ie. reserved register bit set in pte faults?  

The way the Intel driver is setup that is not possible. Currently it
does:

 static bool intel_iommu_capable(enum iommu_cap cap)
 {
	if (cap == IOMMU_CAP_CACHE_COHERENCY)
		return domain_update_iommu_snooping(NULL);

Which is a global property unrelated to any device.

Thus either all devices and all domains support iommu_snooping, or
none do.

It is unclear because for some reason the driver recalculates this
almost constant value on every device attach..

> There's also a disconnect, maybe just in the naming or documentation,
> but if I call enforce_cache_coherency for a domain, that seems like the
> domain should retain those semantics regardless of how it's
> modified,

Right, this is how I would expect it to work.

> ie. "enforced".  For example, if I tried to perform the above operation,
> I should get a failure attaching the device that brings in the less
> capable DMAR because the domain has been set to enforce this
> feature.

We don't have any code causing a failure like this because no driver
needs it.

> Maybe this should be something like set_no_snoop_squashing with the
> above semantics, it needs to be re-applied whenever the domain:device
> composition changes?  Thanks,

If we get a real driver that needs non-uniformity here we can revisit
what to do. There are a couple of good options depending on exactly
what the HW behavior is.

Is it more clear if I fold in the below? It helps show that the
decision to use DMA_PTE_SNP is a global choice based on
domain_update_iommu_snooping():

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index e5062461ab0640..fc789a9d955645 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -641,7 +641,6 @@ static unsigned long domain_super_pgsize_bitmap(struct dmar_domain *domain)
 static void domain_update_iommu_cap(struct dmar_domain *domain)
 {
 	domain_update_iommu_coherency(domain);
-	domain->iommu_snooping = domain_update_iommu_snooping(NULL);
 	domain->iommu_superpage = domain_update_iommu_superpage(domain, NULL);
 
 	/*
@@ -4283,7 +4282,6 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
 	domain->agaw = width_to_agaw(adjust_width);
 
 	domain->iommu_coherency = false;
-	domain->iommu_snooping = false;
 	domain->iommu_superpage = 0;
 	domain->max_addr = 0;
 
@@ -4549,7 +4547,7 @@ static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
 {
 	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
 
-	if (!dmar_domain->iommu_snooping)
+	if (!domain_update_iommu_snooping(NULL))
 		return false;
 	dmar_domain->enforce_no_snoop = true;
 	return true;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 1f930c0c225d94..bc39f633efdf03 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -539,7 +539,6 @@ struct dmar_domain {
 
 	u8 has_iotlb_device: 1;
 	u8 iommu_coherency: 1;		/* indicate coherency of iommu access */
-	u8 iommu_snooping: 1;		/* indicate snooping control feature */
 	u8 enforce_no_snoop : 1;        /* Create IOPTEs with snoop control */
 
 	struct list_head devices;	/* all devices' list */

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
  2022-04-05 22:57       ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-05 23:31         ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-05 23:31 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	iommu, Gerald Schaefer, David Woodhouse, linux-arm-msm,
	linux-arm-kernel, netdev, Cornelia Huck, Robin Murphy,
	Christian Benvenuti

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 6:58 AM
> 
> On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> > >
> > > +static bool intel_iommu_enforce_cache_coherency(struct
> iommu_domain *domain)
> > > +{
> > > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > > +
> > > +	if (!dmar_domain->iommu_snooping)
> > > +		return false;
> > > +	dmar_domain->enforce_no_snoop = true;
> > > +	return true;
> > > +}
> >
> > Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> > support it, ie. reserved register bit set in pte faults?
> 
> The way the Intel driver is setup that is not possible. Currently it
> does:
> 
>  static bool intel_iommu_capable(enum iommu_cap cap)
>  {
> 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> 		return domain_update_iommu_snooping(NULL);
> 
> Which is a global property unrelated to any device.
> 
> Thus either all devices and all domains support iommu_snooping, or
> none do.
> 
> It is unclear because for some reason the driver recalculates this
> almost constant value on every device attach..

The reason is simply because iommu capability is a global flag 😉

when the cap is removed by this series I don't think we need keep that
global recalculation thing.

Thanks
Kevin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-05 23:31         ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-05 23:31 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, iommu, Gerald Schaefer,
	David Woodhouse, linux-arm-msm, linux-arm-kernel, netdev,
	Cornelia Huck, Rob Clark, Suravee Suthikulpanit, Robin Murphy,
	Christian Benvenuti

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 6:58 AM
> 
> On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> > >
> > > +static bool intel_iommu_enforce_cache_coherency(struct
> iommu_domain *domain)
> > > +{
> > > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > > +
> > > +	if (!dmar_domain->iommu_snooping)
> > > +		return false;
> > > +	dmar_domain->enforce_no_snoop = true;
> > > +	return true;
> > > +}
> >
> > Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> > support it, ie. reserved register bit set in pte faults?
> 
> The way the Intel driver is setup that is not possible. Currently it
> does:
> 
>  static bool intel_iommu_capable(enum iommu_cap cap)
>  {
> 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> 		return domain_update_iommu_snooping(NULL);
> 
> Which is a global property unrelated to any device.
> 
> Thus either all devices and all domains support iommu_snooping, or
> none do.
> 
> It is unclear because for some reason the driver recalculates this
> almost constant value on every device attach..

The reason is simply because iommu capability is a global flag 😉

when the cap is removed by this series I don't think we need keep that
global recalculation thing.

Thanks
Kevin
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-05 23:31         ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-05 23:31 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu
  Cc: Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 6:58 AM
> 
> On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> > >
> > > +static bool intel_iommu_enforce_cache_coherency(struct
> iommu_domain *domain)
> > > +{
> > > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > > +
> > > +	if (!dmar_domain->iommu_snooping)
> > > +		return false;
> > > +	dmar_domain->enforce_no_snoop = true;
> > > +	return true;
> > > +}
> >
> > Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> > support it, ie. reserved register bit set in pte faults?
> 
> The way the Intel driver is setup that is not possible. Currently it
> does:
> 
>  static bool intel_iommu_capable(enum iommu_cap cap)
>  {
> 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> 		return domain_update_iommu_snooping(NULL);
> 
> Which is a global property unrelated to any device.
> 
> Thus either all devices and all domains support iommu_snooping, or
> none do.
> 
> It is unclear because for some reason the driver recalculates this
> almost constant value on every device attach..

The reason is simply because iommu capability is a global flag 😉

when the cap is removed by this series I don't think we need keep that
global recalculation thing.

Thanks
Kevin
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-05 23:31         ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-05 23:31 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu
  Cc: Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 6:58 AM
> 
> On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> > >
> > > +static bool intel_iommu_enforce_cache_coherency(struct
> iommu_domain *domain)
> > > +{
> > > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > > +
> > > +	if (!dmar_domain->iommu_snooping)
> > > +		return false;
> > > +	dmar_domain->enforce_no_snoop = true;
> > > +	return true;
> > > +}
> >
> > Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> > support it, ie. reserved register bit set in pte faults?
> 
> The way the Intel driver is setup that is not possible. Currently it
> does:
> 
>  static bool intel_iommu_capable(enum iommu_cap cap)
>  {
> 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> 		return domain_update_iommu_snooping(NULL);
> 
> Which is a global property unrelated to any device.
> 
> Thus either all devices and all domains support iommu_snooping, or
> none do.
> 
> It is unclear because for some reason the driver recalculates this
> almost constant value on every device attach..

The reason is simply because iommu capability is a global flag 😉

when the cap is removed by this series I don't think we need keep that
global recalculation thing.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
  2022-04-05 22:57       ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-06  0:08         ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  0:08 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	iommu, Gerald Schaefer, David Woodhouse, linux-arm-msm,
	linux-arm-kernel, netdev, Cornelia Huck, Robin Murphy,
	Christian Benvenuti

> From: Tian, Kevin
> Sent: Wednesday, April 6, 2022 7:32 AM
> 
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 6, 2022 6:58 AM
> >
> > On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> > > >
> > > > +static bool intel_iommu_enforce_cache_coherency(struct
> > iommu_domain *domain)
> > > > +{
> > > > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > > > +
> > > > +	if (!dmar_domain->iommu_snooping)
> > > > +		return false;
> > > > +	dmar_domain->enforce_no_snoop = true;
> > > > +	return true;
> > > > +}
> > >
> > > Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> > > support it, ie. reserved register bit set in pte faults?
> >
> > The way the Intel driver is setup that is not possible. Currently it
> > does:
> >
> >  static bool intel_iommu_capable(enum iommu_cap cap)
> >  {
> > 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> > 		return domain_update_iommu_snooping(NULL);
> >
> > Which is a global property unrelated to any device.
> >
> > Thus either all devices and all domains support iommu_snooping, or
> > none do.
> >
> > It is unclear because for some reason the driver recalculates this
> > almost constant value on every device attach..
> 
> The reason is simply because iommu capability is a global flag 😉

...and intel iommu driver supports hotplug-ed iommu. But in reality this
recalculation is almost a no-op because the only iommu that doesn't
support force snoop is for Intel GPU and built-in in the platform.

> 
> when the cap is removed by this series I don't think we need keep that
> global recalculation thing.
> 
> Thanks
> Kevin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-06  0:08         ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  0:08 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, iommu, Gerald Schaefer,
	David Woodhouse, linux-arm-msm, linux-arm-kernel, netdev,
	Cornelia Huck, Rob Clark, Suravee Suthikulpanit, Robin Murphy,
	Christian Benvenuti

> From: Tian, Kevin
> Sent: Wednesday, April 6, 2022 7:32 AM
> 
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 6, 2022 6:58 AM
> >
> > On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> > > >
> > > > +static bool intel_iommu_enforce_cache_coherency(struct
> > iommu_domain *domain)
> > > > +{
> > > > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > > > +
> > > > +	if (!dmar_domain->iommu_snooping)
> > > > +		return false;
> > > > +	dmar_domain->enforce_no_snoop = true;
> > > > +	return true;
> > > > +}
> > >
> > > Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> > > support it, ie. reserved register bit set in pte faults?
> >
> > The way the Intel driver is setup that is not possible. Currently it
> > does:
> >
> >  static bool intel_iommu_capable(enum iommu_cap cap)
> >  {
> > 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> > 		return domain_update_iommu_snooping(NULL);
> >
> > Which is a global property unrelated to any device.
> >
> > Thus either all devices and all domains support iommu_snooping, or
> > none do.
> >
> > It is unclear because for some reason the driver recalculates this
> > almost constant value on every device attach..
> 
> The reason is simply because iommu capability is a global flag 😉

...and intel iommu driver supports hotplug-ed iommu. But in reality this
recalculation is almost a no-op because the only iommu that doesn't
support force snoop is for Intel GPU and built-in in the platform.

> 
> when the cap is removed by this series I don't think we need keep that
> global recalculation thing.
> 
> Thanks
> Kevin
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-06  0:08         ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  0:08 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu
  Cc: Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig

> From: Tian, Kevin
> Sent: Wednesday, April 6, 2022 7:32 AM
> 
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 6, 2022 6:58 AM
> >
> > On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> > > >
> > > > +static bool intel_iommu_enforce_cache_coherency(struct
> > iommu_domain *domain)
> > > > +{
> > > > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > > > +
> > > > +	if (!dmar_domain->iommu_snooping)
> > > > +		return false;
> > > > +	dmar_domain->enforce_no_snoop = true;
> > > > +	return true;
> > > > +}
> > >
> > > Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> > > support it, ie. reserved register bit set in pte faults?
> >
> > The way the Intel driver is setup that is not possible. Currently it
> > does:
> >
> >  static bool intel_iommu_capable(enum iommu_cap cap)
> >  {
> > 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> > 		return domain_update_iommu_snooping(NULL);
> >
> > Which is a global property unrelated to any device.
> >
> > Thus either all devices and all domains support iommu_snooping, or
> > none do.
> >
> > It is unclear because for some reason the driver recalculates this
> > almost constant value on every device attach..
> 
> The reason is simply because iommu capability is a global flag 😉

...and intel iommu driver supports hotplug-ed iommu. But in reality this
recalculation is almost a no-op because the only iommu that doesn't
support force snoop is for Intel GPU and built-in in the platform.

> 
> when the cap is removed by this series I don't think we need keep that
> global recalculation thing.
> 
> Thanks
> Kevin
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-06  0:08         ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  0:08 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu
  Cc: Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig

> From: Tian, Kevin
> Sent: Wednesday, April 6, 2022 7:32 AM
> 
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 6, 2022 6:58 AM
> >
> > On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> > > >
> > > > +static bool intel_iommu_enforce_cache_coherency(struct
> > iommu_domain *domain)
> > > > +{
> > > > +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > > > +
> > > > +	if (!dmar_domain->iommu_snooping)
> > > > +		return false;
> > > > +	dmar_domain->enforce_no_snoop = true;
> > > > +	return true;
> > > > +}
> > >
> > > Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> > > support it, ie. reserved register bit set in pte faults?
> >
> > The way the Intel driver is setup that is not possible. Currently it
> > does:
> >
> >  static bool intel_iommu_capable(enum iommu_cap cap)
> >  {
> > 	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> > 		return domain_update_iommu_snooping(NULL);
> >
> > Which is a global property unrelated to any device.
> >
> > Thus either all devices and all domains support iommu_snooping, or
> > none do.
> >
> > It is unclear because for some reason the driver recalculates this
> > almost constant value on every device attach..
> 
> The reason is simply because iommu capability is a global flag 😉

...and intel iommu driver supports hotplug-ed iommu. But in reality this
recalculation is almost a no-op because the only iommu that doesn't
support force snoop is for Intel GPU and built-in in the platform.

> 
> when the cap is removed by this series I don't think we need keep that
> global recalculation thing.
> 
> Thanks
> Kevin

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-05 16:16   ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-06  5:30     ` Christoph Hellwig
  -1 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06  5:30 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, Rob Clark, Gerald Schaefer,
	David Woodhouse, linux-arm-msm, linux-arm-kernel, netdev,
	Cornelia Huck, iommu, Suravee Suthikulpanit, Christian Benvenuti,
	Robin Murphy, Lu Baolu

On Tue, Apr 05, 2022 at 01:16:00PM -0300, Jason Gunthorpe wrote:
> diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> index 760b254ba42d6b..24d118198ac756 100644
> --- a/drivers/infiniband/hw/usnic/usnic_uiom.c
> +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> @@ -42,6 +42,7 @@
>  #include <linux/list.h>
>  #include <linux/pci.h>
>  #include <rdma/ib_verbs.h>
> +#include <linux/dma-map-ops.h>
>  
>  #include "usnic_log.h"
>  #include "usnic_uiom.h"
> @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>  	struct usnic_uiom_dev *uiom_dev;
>  	int err;
>  
> +	if (!dev_is_dma_coherent(dev)) {

Which part of the comment at the top of dma-map-ops.h is not clear
enough to you?
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06  5:30     ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06  5:30 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, David Woodhouse, Tian, Kevin, linux-arm-msm,
	Alex Williamson, linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Tue, Apr 05, 2022 at 01:16:00PM -0300, Jason Gunthorpe wrote:
> diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> index 760b254ba42d6b..24d118198ac756 100644
> --- a/drivers/infiniband/hw/usnic/usnic_uiom.c
> +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> @@ -42,6 +42,7 @@
>  #include <linux/list.h>
>  #include <linux/pci.h>
>  #include <rdma/ib_verbs.h>
> +#include <linux/dma-map-ops.h>
>  
>  #include "usnic_log.h"
>  #include "usnic_uiom.h"
> @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>  	struct usnic_uiom_dev *uiom_dev;
>  	int err;
>  
> +	if (!dev_is_dma_coherent(dev)) {

Which part of the comment at the top of dma-map-ops.h is not clear
enough to you?
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06  5:30     ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06  5:30 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Christoph Hellwig, Tian, Kevin

On Tue, Apr 05, 2022 at 01:16:00PM -0300, Jason Gunthorpe wrote:
> diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> index 760b254ba42d6b..24d118198ac756 100644
> --- a/drivers/infiniband/hw/usnic/usnic_uiom.c
> +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> @@ -42,6 +42,7 @@
>  #include <linux/list.h>
>  #include <linux/pci.h>
>  #include <rdma/ib_verbs.h>
> +#include <linux/dma-map-ops.h>
>  
>  #include "usnic_log.h"
>  #include "usnic_uiom.h"
> @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>  	struct usnic_uiom_dev *uiom_dev;
>  	int err;
>  
> +	if (!dev_is_dma_coherent(dev)) {

Which part of the comment at the top of dma-map-ops.h is not clear
enough to you?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06  5:30     ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06  5:30 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Christoph Hellwig, Tian, Kevin

On Tue, Apr 05, 2022 at 01:16:00PM -0300, Jason Gunthorpe wrote:
> diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> index 760b254ba42d6b..24d118198ac756 100644
> --- a/drivers/infiniband/hw/usnic/usnic_uiom.c
> +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> @@ -42,6 +42,7 @@
>  #include <linux/list.h>
>  #include <linux/pci.h>
>  #include <rdma/ib_verbs.h>
> +#include <linux/dma-map-ops.h>
>  
>  #include "usnic_log.h"
>  #include "usnic_uiom.h"
> @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>  	struct usnic_uiom_dev *uiom_dev;
>  	int err;
>  
> +	if (!dev_is_dma_coherent(dev)) {

Which part of the comment at the top of dma-map-ops.h is not clear
enough to you?

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 0/5] Make the iommu driver no-snoop block feature consistent
  2022-04-05 16:15 ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-06  6:52   ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  6:52 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Robin Murphy,
	Suravee Suthikulpanit, virtualization, Will Deacon
  Cc: Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 12:16 AM
> 
> PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented
> by a platform as bypassing elements in the DMA coherent CPU cache
> hierarchy. A driver can command a device to set this bit on some of its
> transactions as a micro-optimization.
> 
> However, the driver is now responsible to synchronize the CPU cache with
> the DMA that bypassed it. On x86 this is done through the wbinvd
> instruction, and the i915 GPU driver is the only Linux DMA driver that
> calls it.

More accurately x86 supports both unprivileged clflush instructions
to invalidate one cacheline and a privileged wbinvd instruction to
invalidate the entire cache. Replacing 'this is done' with 'this may
be done' is clearer.

> 
> The problem comes that KVM on x86 will normally disable the wbinvd
> instruction in the guest and render it a NOP. As the driver running in the
> guest is not aware the wbinvd doesn't work it may still cause the device
> to set the no-snoop bit and the platform will bypass the CPU cache.
> Without a working wbinvd there is no way to re-synchronize the CPU cache
> and the driver in the VM has data corruption.
> 
> Thus, we see a general direction on x86 that the IOMMU HW is able to block
> the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to
> to NOP the wbinvd without causing any data corruption.
> 
> This control for Intel IOMMU was exposed by using IOMMU_CACHE and
> IOMMU_CAP_CACHE_COHERENCY, however these two values now have
> multiple
> meanings and usages beyond blocking no-snoop and the whole thing has
> become confused.

Also point out your finding about AMD IOMMU?

> 
> Change it so that:
>  - IOMMU_CACHE is only about the DMA coherence of normal DMAs from a
>    device. It is used by the DMA API and set when the DMA API will not be
>    doing manual cache coherency operations.
> 
>  - dev_is_dma_coherent() indicates if IOMMU_CACHE can be used with the
>    device
> 
>  - The new optional domain op enforce_cache_coherency() will cause the
>    entire domain to block no-snoop requests - ie there is no way for any
>    device attached to the domain to opt out of the IOMMU_CACHE behavior.
> 
> An iommu driver should implement enforce_cache_coherency() so that by
> default domains allow the no-snoop optimization. This leaves it available
> to kernel drivers like i915. VFIO will call enforce_cache_coherency()
> before establishing any mappings and the domain should then permanently
> block no-snoop.
> 
> If enforce_cache_coherency() fails VFIO will communicate back through to
> KVM into the arch code via kvm_arch_register_noncoherent_dma()
> (only implemented by x86) which triggers a working wbinvd to be made
> available to the VM.
> 
> While other arches are certainly welcome to implement
> enforce_cache_coherency(), it is not clear there is any benefit in doing
> so.
> 
> After this series there are only two calls left to iommu_capable() with a
> bus argument which should help Robin's work here.
> 
> This is on github:
> https://github.com/jgunthorpe/linux/commits/intel_no_snoop
> 
> Cc: "Tian, Kevin" <kevin.tian@intel.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> Jason Gunthorpe (5):
>   iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with
>     dev_is_dma_coherent()
>   vfio: Require that devices support DMA cache coherence
>   iommu: Introduce the domain op enforce_cache_coherency()
>   vfio: Move the Intel no-snoop control off of IOMMU_CACHE
>   iommu: Delete IOMMU_CAP_CACHE_COHERENCY
> 
>  drivers/infiniband/hw/usnic/usnic_uiom.c    | 16 +++++------
>  drivers/iommu/amd/iommu.c                   |  9 +++++--
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  2 --
>  drivers/iommu/arm/arm-smmu/arm-smmu.c       |  6 -----
>  drivers/iommu/arm/arm-smmu/qcom_iommu.c     |  6 -----
>  drivers/iommu/fsl_pamu_domain.c             |  6 -----
>  drivers/iommu/intel/iommu.c                 | 15 ++++++++---
>  drivers/iommu/s390-iommu.c                  |  2 --
>  drivers/vfio/vfio.c                         |  6 +++++
>  drivers/vfio/vfio_iommu_type1.c             | 30 +++++++++++++--------
>  drivers/vhost/vdpa.c                        |  3 ++-
>  include/linux/intel-iommu.h                 |  1 +
>  include/linux/iommu.h                       |  6 +++--
>  13 files changed, 58 insertions(+), 50 deletions(-)
> 
> 
> base-commit: 3123109284176b1532874591f7c81f3837bbdc17
> --
> 2.35.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 0/5] Make the iommu driver no-snoop block feature consistent
@ 2022-04-06  6:52   ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  6:52 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Robin Murphy,
	Suravee Suthikulpanit, virtualization, Will Deacon
  Cc: Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 12:16 AM
> 
> PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented
> by a platform as bypassing elements in the DMA coherent CPU cache
> hierarchy. A driver can command a device to set this bit on some of its
> transactions as a micro-optimization.
> 
> However, the driver is now responsible to synchronize the CPU cache with
> the DMA that bypassed it. On x86 this is done through the wbinvd
> instruction, and the i915 GPU driver is the only Linux DMA driver that
> calls it.

More accurately x86 supports both unprivileged clflush instructions
to invalidate one cacheline and a privileged wbinvd instruction to
invalidate the entire cache. Replacing 'this is done' with 'this may
be done' is clearer.

> 
> The problem comes that KVM on x86 will normally disable the wbinvd
> instruction in the guest and render it a NOP. As the driver running in the
> guest is not aware the wbinvd doesn't work it may still cause the device
> to set the no-snoop bit and the platform will bypass the CPU cache.
> Without a working wbinvd there is no way to re-synchronize the CPU cache
> and the driver in the VM has data corruption.
> 
> Thus, we see a general direction on x86 that the IOMMU HW is able to block
> the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to
> to NOP the wbinvd without causing any data corruption.
> 
> This control for Intel IOMMU was exposed by using IOMMU_CACHE and
> IOMMU_CAP_CACHE_COHERENCY, however these two values now have
> multiple
> meanings and usages beyond blocking no-snoop and the whole thing has
> become confused.

Also point out your finding about AMD IOMMU?

> 
> Change it so that:
>  - IOMMU_CACHE is only about the DMA coherence of normal DMAs from a
>    device. It is used by the DMA API and set when the DMA API will not be
>    doing manual cache coherency operations.
> 
>  - dev_is_dma_coherent() indicates if IOMMU_CACHE can be used with the
>    device
> 
>  - The new optional domain op enforce_cache_coherency() will cause the
>    entire domain to block no-snoop requests - ie there is no way for any
>    device attached to the domain to opt out of the IOMMU_CACHE behavior.
> 
> An iommu driver should implement enforce_cache_coherency() so that by
> default domains allow the no-snoop optimization. This leaves it available
> to kernel drivers like i915. VFIO will call enforce_cache_coherency()
> before establishing any mappings and the domain should then permanently
> block no-snoop.
> 
> If enforce_cache_coherency() fails VFIO will communicate back through to
> KVM into the arch code via kvm_arch_register_noncoherent_dma()
> (only implemented by x86) which triggers a working wbinvd to be made
> available to the VM.
> 
> While other arches are certainly welcome to implement
> enforce_cache_coherency(), it is not clear there is any benefit in doing
> so.
> 
> After this series there are only two calls left to iommu_capable() with a
> bus argument which should help Robin's work here.
> 
> This is on github:
> https://github.com/jgunthorpe/linux/commits/intel_no_snoop
> 
> Cc: "Tian, Kevin" <kevin.tian@intel.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> Jason Gunthorpe (5):
>   iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with
>     dev_is_dma_coherent()
>   vfio: Require that devices support DMA cache coherence
>   iommu: Introduce the domain op enforce_cache_coherency()
>   vfio: Move the Intel no-snoop control off of IOMMU_CACHE
>   iommu: Delete IOMMU_CAP_CACHE_COHERENCY
> 
>  drivers/infiniband/hw/usnic/usnic_uiom.c    | 16 +++++------
>  drivers/iommu/amd/iommu.c                   |  9 +++++--
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  2 --
>  drivers/iommu/arm/arm-smmu/arm-smmu.c       |  6 -----
>  drivers/iommu/arm/arm-smmu/qcom_iommu.c     |  6 -----
>  drivers/iommu/fsl_pamu_domain.c             |  6 -----
>  drivers/iommu/intel/iommu.c                 | 15 ++++++++---
>  drivers/iommu/s390-iommu.c                  |  2 --
>  drivers/vfio/vfio.c                         |  6 +++++
>  drivers/vfio/vfio_iommu_type1.c             | 30 +++++++++++++--------
>  drivers/vhost/vdpa.c                        |  3 ++-
>  include/linux/intel-iommu.h                 |  1 +
>  include/linux/iommu.h                       |  6 +++--
>  13 files changed, 58 insertions(+), 50 deletions(-)
> 
> 
> base-commit: 3123109284176b1532874591f7c81f3837bbdc17
> --
> 2.35.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 0/5] Make the iommu driver no-snoop block feature consistent
@ 2022-04-06  6:52   ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  6:52 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Robin Murphy,
	Suravee Suthikulpanit, virtualization, Will Deacon
  Cc: Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 12:16 AM
> 
> PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented
> by a platform as bypassing elements in the DMA coherent CPU cache
> hierarchy. A driver can command a device to set this bit on some of its
> transactions as a micro-optimization.
> 
> However, the driver is now responsible to synchronize the CPU cache with
> the DMA that bypassed it. On x86 this is done through the wbinvd
> instruction, and the i915 GPU driver is the only Linux DMA driver that
> calls it.

More accurately x86 supports both unprivileged clflush instructions
to invalidate one cacheline and a privileged wbinvd instruction to
invalidate the entire cache. Replacing 'this is done' with 'this may
be done' is clearer.

> 
> The problem comes that KVM on x86 will normally disable the wbinvd
> instruction in the guest and render it a NOP. As the driver running in the
> guest is not aware the wbinvd doesn't work it may still cause the device
> to set the no-snoop bit and the platform will bypass the CPU cache.
> Without a working wbinvd there is no way to re-synchronize the CPU cache
> and the driver in the VM has data corruption.
> 
> Thus, we see a general direction on x86 that the IOMMU HW is able to block
> the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to
> to NOP the wbinvd without causing any data corruption.
> 
> This control for Intel IOMMU was exposed by using IOMMU_CACHE and
> IOMMU_CAP_CACHE_COHERENCY, however these two values now have
> multiple
> meanings and usages beyond blocking no-snoop and the whole thing has
> become confused.

Also point out your finding about AMD IOMMU?

> 
> Change it so that:
>  - IOMMU_CACHE is only about the DMA coherence of normal DMAs from a
>    device. It is used by the DMA API and set when the DMA API will not be
>    doing manual cache coherency operations.
> 
>  - dev_is_dma_coherent() indicates if IOMMU_CACHE can be used with the
>    device
> 
>  - The new optional domain op enforce_cache_coherency() will cause the
>    entire domain to block no-snoop requests - ie there is no way for any
>    device attached to the domain to opt out of the IOMMU_CACHE behavior.
> 
> An iommu driver should implement enforce_cache_coherency() so that by
> default domains allow the no-snoop optimization. This leaves it available
> to kernel drivers like i915. VFIO will call enforce_cache_coherency()
> before establishing any mappings and the domain should then permanently
> block no-snoop.
> 
> If enforce_cache_coherency() fails VFIO will communicate back through to
> KVM into the arch code via kvm_arch_register_noncoherent_dma()
> (only implemented by x86) which triggers a working wbinvd to be made
> available to the VM.
> 
> While other arches are certainly welcome to implement
> enforce_cache_coherency(), it is not clear there is any benefit in doing
> so.
> 
> After this series there are only two calls left to iommu_capable() with a
> bus argument which should help Robin's work here.
> 
> This is on github:
> https://github.com/jgunthorpe/linux/commits/intel_no_snoop
> 
> Cc: "Tian, Kevin" <kevin.tian@intel.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> Jason Gunthorpe (5):
>   iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with
>     dev_is_dma_coherent()
>   vfio: Require that devices support DMA cache coherence
>   iommu: Introduce the domain op enforce_cache_coherency()
>   vfio: Move the Intel no-snoop control off of IOMMU_CACHE
>   iommu: Delete IOMMU_CAP_CACHE_COHERENCY
> 
>  drivers/infiniband/hw/usnic/usnic_uiom.c    | 16 +++++------
>  drivers/iommu/amd/iommu.c                   |  9 +++++--
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  2 --
>  drivers/iommu/arm/arm-smmu/arm-smmu.c       |  6 -----
>  drivers/iommu/arm/arm-smmu/qcom_iommu.c     |  6 -----
>  drivers/iommu/fsl_pamu_domain.c             |  6 -----
>  drivers/iommu/intel/iommu.c                 | 15 ++++++++---
>  drivers/iommu/s390-iommu.c                  |  2 --
>  drivers/vfio/vfio.c                         |  6 +++++
>  drivers/vfio/vfio_iommu_type1.c             | 30 +++++++++++++--------
>  drivers/vhost/vdpa.c                        |  3 ++-
>  include/linux/intel-iommu.h                 |  1 +
>  include/linux/iommu.h                       |  6 +++--
>  13 files changed, 58 insertions(+), 50 deletions(-)
> 
> 
> base-commit: 3123109284176b1532874591f7c81f3837bbdc17
> --
> 2.35.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 0/5] Make the iommu driver no-snoop block feature consistent
@ 2022-04-06  6:52   ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  6:52 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Robin Murphy,
	Suravee Suthikulpanit, virtualization, Will Deacon
  Cc: Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 12:16 AM
> 
> PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented
> by a platform as bypassing elements in the DMA coherent CPU cache
> hierarchy. A driver can command a device to set this bit on some of its
> transactions as a micro-optimization.
> 
> However, the driver is now responsible to synchronize the CPU cache with
> the DMA that bypassed it. On x86 this is done through the wbinvd
> instruction, and the i915 GPU driver is the only Linux DMA driver that
> calls it.

More accurately x86 supports both unprivileged clflush instructions
to invalidate one cacheline and a privileged wbinvd instruction to
invalidate the entire cache. Replacing 'this is done' with 'this may
be done' is clearer.

> 
> The problem comes that KVM on x86 will normally disable the wbinvd
> instruction in the guest and render it a NOP. As the driver running in the
> guest is not aware the wbinvd doesn't work it may still cause the device
> to set the no-snoop bit and the platform will bypass the CPU cache.
> Without a working wbinvd there is no way to re-synchronize the CPU cache
> and the driver in the VM has data corruption.
> 
> Thus, we see a general direction on x86 that the IOMMU HW is able to block
> the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to
> to NOP the wbinvd without causing any data corruption.
> 
> This control for Intel IOMMU was exposed by using IOMMU_CACHE and
> IOMMU_CAP_CACHE_COHERENCY, however these two values now have
> multiple
> meanings and usages beyond blocking no-snoop and the whole thing has
> become confused.

Also point out your finding about AMD IOMMU?

> 
> Change it so that:
>  - IOMMU_CACHE is only about the DMA coherence of normal DMAs from a
>    device. It is used by the DMA API and set when the DMA API will not be
>    doing manual cache coherency operations.
> 
>  - dev_is_dma_coherent() indicates if IOMMU_CACHE can be used with the
>    device
> 
>  - The new optional domain op enforce_cache_coherency() will cause the
>    entire domain to block no-snoop requests - ie there is no way for any
>    device attached to the domain to opt out of the IOMMU_CACHE behavior.
> 
> An iommu driver should implement enforce_cache_coherency() so that by
> default domains allow the no-snoop optimization. This leaves it available
> to kernel drivers like i915. VFIO will call enforce_cache_coherency()
> before establishing any mappings and the domain should then permanently
> block no-snoop.
> 
> If enforce_cache_coherency() fails VFIO will communicate back through to
> KVM into the arch code via kvm_arch_register_noncoherent_dma()
> (only implemented by x86) which triggers a working wbinvd to be made
> available to the VM.
> 
> While other arches are certainly welcome to implement
> enforce_cache_coherency(), it is not clear there is any benefit in doing
> so.
> 
> After this series there are only two calls left to iommu_capable() with a
> bus argument which should help Robin's work here.
> 
> This is on github:
> https://github.com/jgunthorpe/linux/commits/intel_no_snoop
> 
> Cc: "Tian, Kevin" <kevin.tian@intel.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> Jason Gunthorpe (5):
>   iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with
>     dev_is_dma_coherent()
>   vfio: Require that devices support DMA cache coherence
>   iommu: Introduce the domain op enforce_cache_coherency()
>   vfio: Move the Intel no-snoop control off of IOMMU_CACHE
>   iommu: Delete IOMMU_CAP_CACHE_COHERENCY
> 
>  drivers/infiniband/hw/usnic/usnic_uiom.c    | 16 +++++------
>  drivers/iommu/amd/iommu.c                   |  9 +++++--
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  2 --
>  drivers/iommu/arm/arm-smmu/arm-smmu.c       |  6 -----
>  drivers/iommu/arm/arm-smmu/qcom_iommu.c     |  6 -----
>  drivers/iommu/fsl_pamu_domain.c             |  6 -----
>  drivers/iommu/intel/iommu.c                 | 15 ++++++++---
>  drivers/iommu/s390-iommu.c                  |  2 --
>  drivers/vfio/vfio.c                         |  6 +++++
>  drivers/vfio/vfio_iommu_type1.c             | 30 +++++++++++++--------
>  drivers/vhost/vdpa.c                        |  3 ++-
>  include/linux/intel-iommu.h                 |  1 +
>  include/linux/iommu.h                       |  6 +++--
>  13 files changed, 58 insertions(+), 50 deletions(-)
> 
> 
> base-commit: 3123109284176b1532874591f7c81f3837bbdc17
> --
> 2.35.1


^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
  2022-04-05 19:29       ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-06  7:02         ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  7:02 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	iommu, Gerald Schaefer, David Woodhouse, linux-arm-msm,
	linux-arm-kernel, netdev, Cornelia Huck, Christian Benvenuti,
	Robin Murphy

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 3:29 AM
> 
> On Tue, Apr 05, 2022 at 01:10:44PM -0600, Alex Williamson wrote:
> > On Tue,  5 Apr 2022 13:16:01 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > > dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can
> be
> > > supported.
> > >
> > > IOMMU_CACHE means that normal DMAs do not require any additional
> coherency
> > > mechanism and is the basic uAPI that VFIO exposes to userspace. For
> > > instance VFIO applications like DPDK will not work if additional coherency
> > > operations are required.
> > >
> > > Therefore check dev_is_dma_coherent() before allowing a device to join
> a
> > > domain. This will block device/platform/iommu combinations from using
> VFIO
> > > that do not support cache coherent DMA.
> > >
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > >  drivers/vfio/vfio.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > > index a4555014bd1e72..2a3aa3e742d943 100644
> > > +++ b/drivers/vfio/vfio.c
> > > @@ -32,6 +32,7 @@
> > >  #include <linux/vfio.h>
> > >  #include <linux/wait.h>
> > >  #include <linux/sched/signal.h>
> > > +#include <linux/dma-map-ops.h>
> > >  #include "vfio.h"
> > >
> > >  #define DRIVER_VERSION	"0.3"
> > > @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct
> vfio_group *group, char *buf)
> > >  	if (IS_ERR(device))
> > >  		return PTR_ERR(device);
> > >
> > > +	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device-
> >dev)) {
> > > +		ret = -ENODEV;
> > > +		goto err_device_put;
> > > +	}
> > > +
> >
> > Failing at the point where the user is trying to gain access to the
> > device seems a little late in the process and opaque, wouldn't we
> > rather have vfio bus drivers fail to probe such devices?  I'd expect
> > this to occur in the vfio_register_group_dev() path.  Thanks,
> 
> Yes, that is a good point.
> 
> So like this:
> 
>  int vfio_register_group_dev(struct vfio_device *device)
>  {
> +       if (!dev_is_dma_coherent(device->dev))
> +               return -EINVAL;
> +
>         return __vfio_register_dev(device,
>                 vfio_group_find_or_alloc(device->dev));
>  }
> 
> I fixed it up.
> 

if that is the case should it also apply to usnic and vdpa in the first
patch (i.e. fail the probe)?
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-06  7:02         ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  7:02 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, iommu, Gerald Schaefer,
	David Woodhouse, linux-arm-msm, linux-arm-kernel, netdev,
	Cornelia Huck, Rob Clark, Suravee Suthikulpanit,
	Christian Benvenuti, Robin Murphy, Lu Baolu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 3:29 AM
> 
> On Tue, Apr 05, 2022 at 01:10:44PM -0600, Alex Williamson wrote:
> > On Tue,  5 Apr 2022 13:16:01 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > > dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can
> be
> > > supported.
> > >
> > > IOMMU_CACHE means that normal DMAs do not require any additional
> coherency
> > > mechanism and is the basic uAPI that VFIO exposes to userspace. For
> > > instance VFIO applications like DPDK will not work if additional coherency
> > > operations are required.
> > >
> > > Therefore check dev_is_dma_coherent() before allowing a device to join
> a
> > > domain. This will block device/platform/iommu combinations from using
> VFIO
> > > that do not support cache coherent DMA.
> > >
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > >  drivers/vfio/vfio.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > > index a4555014bd1e72..2a3aa3e742d943 100644
> > > +++ b/drivers/vfio/vfio.c
> > > @@ -32,6 +32,7 @@
> > >  #include <linux/vfio.h>
> > >  #include <linux/wait.h>
> > >  #include <linux/sched/signal.h>
> > > +#include <linux/dma-map-ops.h>
> > >  #include "vfio.h"
> > >
> > >  #define DRIVER_VERSION	"0.3"
> > > @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct
> vfio_group *group, char *buf)
> > >  	if (IS_ERR(device))
> > >  		return PTR_ERR(device);
> > >
> > > +	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device-
> >dev)) {
> > > +		ret = -ENODEV;
> > > +		goto err_device_put;
> > > +	}
> > > +
> >
> > Failing at the point where the user is trying to gain access to the
> > device seems a little late in the process and opaque, wouldn't we
> > rather have vfio bus drivers fail to probe such devices?  I'd expect
> > this to occur in the vfio_register_group_dev() path.  Thanks,
> 
> Yes, that is a good point.
> 
> So like this:
> 
>  int vfio_register_group_dev(struct vfio_device *device)
>  {
> +       if (!dev_is_dma_coherent(device->dev))
> +               return -EINVAL;
> +
>         return __vfio_register_dev(device,
>                 vfio_group_find_or_alloc(device->dev));
>  }
> 
> I fixed it up.
> 

if that is the case should it also apply to usnic and vdpa in the first
patch (i.e. fail the probe)?
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-06  7:02         ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  7:02 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: Lu Baolu, Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 3:29 AM
> 
> On Tue, Apr 05, 2022 at 01:10:44PM -0600, Alex Williamson wrote:
> > On Tue,  5 Apr 2022 13:16:01 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > > dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can
> be
> > > supported.
> > >
> > > IOMMU_CACHE means that normal DMAs do not require any additional
> coherency
> > > mechanism and is the basic uAPI that VFIO exposes to userspace. For
> > > instance VFIO applications like DPDK will not work if additional coherency
> > > operations are required.
> > >
> > > Therefore check dev_is_dma_coherent() before allowing a device to join
> a
> > > domain. This will block device/platform/iommu combinations from using
> VFIO
> > > that do not support cache coherent DMA.
> > >
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > >  drivers/vfio/vfio.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > > index a4555014bd1e72..2a3aa3e742d943 100644
> > > +++ b/drivers/vfio/vfio.c
> > > @@ -32,6 +32,7 @@
> > >  #include <linux/vfio.h>
> > >  #include <linux/wait.h>
> > >  #include <linux/sched/signal.h>
> > > +#include <linux/dma-map-ops.h>
> > >  #include "vfio.h"
> > >
> > >  #define DRIVER_VERSION	"0.3"
> > > @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct
> vfio_group *group, char *buf)
> > >  	if (IS_ERR(device))
> > >  		return PTR_ERR(device);
> > >
> > > +	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device-
> >dev)) {
> > > +		ret = -ENODEV;
> > > +		goto err_device_put;
> > > +	}
> > > +
> >
> > Failing at the point where the user is trying to gain access to the
> > device seems a little late in the process and opaque, wouldn't we
> > rather have vfio bus drivers fail to probe such devices?  I'd expect
> > this to occur in the vfio_register_group_dev() path.  Thanks,
> 
> Yes, that is a good point.
> 
> So like this:
> 
>  int vfio_register_group_dev(struct vfio_device *device)
>  {
> +       if (!dev_is_dma_coherent(device->dev))
> +               return -EINVAL;
> +
>         return __vfio_register_dev(device,
>                 vfio_group_find_or_alloc(device->dev));
>  }
> 
> I fixed it up.
> 

if that is the case should it also apply to usnic and vdpa in the first
patch (i.e. fail the probe)?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-06  7:02         ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  7:02 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: Lu Baolu, Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 3:29 AM
> 
> On Tue, Apr 05, 2022 at 01:10:44PM -0600, Alex Williamson wrote:
> > On Tue,  5 Apr 2022 13:16:01 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > > dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can
> be
> > > supported.
> > >
> > > IOMMU_CACHE means that normal DMAs do not require any additional
> coherency
> > > mechanism and is the basic uAPI that VFIO exposes to userspace. For
> > > instance VFIO applications like DPDK will not work if additional coherency
> > > operations are required.
> > >
> > > Therefore check dev_is_dma_coherent() before allowing a device to join
> a
> > > domain. This will block device/platform/iommu combinations from using
> VFIO
> > > that do not support cache coherent DMA.
> > >
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > >  drivers/vfio/vfio.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > > index a4555014bd1e72..2a3aa3e742d943 100644
> > > +++ b/drivers/vfio/vfio.c
> > > @@ -32,6 +32,7 @@
> > >  #include <linux/vfio.h>
> > >  #include <linux/wait.h>
> > >  #include <linux/sched/signal.h>
> > > +#include <linux/dma-map-ops.h>
> > >  #include "vfio.h"
> > >
> > >  #define DRIVER_VERSION	"0.3"
> > > @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct
> vfio_group *group, char *buf)
> > >  	if (IS_ERR(device))
> > >  		return PTR_ERR(device);
> > >
> > > +	if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device-
> >dev)) {
> > > +		ret = -ENODEV;
> > > +		goto err_device_put;
> > > +	}
> > > +
> >
> > Failing at the point where the user is trying to gain access to the
> > device seems a little late in the process and opaque, wouldn't we
> > rather have vfio bus drivers fail to probe such devices?  I'd expect
> > this to occur in the vfio_register_group_dev() path.  Thanks,
> 
> Yes, that is a good point.
> 
> So like this:
> 
>  int vfio_register_group_dev(struct vfio_device *device)
>  {
> +       if (!dev_is_dma_coherent(device->dev))
> +               return -EINVAL;
> +
>         return __vfio_register_dev(device,
>                 vfio_group_find_or_alloc(device->dev));
>  }
> 
> I fixed it up.
> 

if that is the case should it also apply to usnic and vdpa in the first
patch (i.e. fail the probe)?

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
  2022-04-05 16:16   ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-06  7:09     ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  7:09 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Robin Murphy,
	Suravee Suthikulpanit, virtualization, Will Deacon
  Cc: Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 12:16 AM
> 
> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY
> and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> 
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will cause
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> the no-snoop bit was clear.
> 
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW capability.
> 
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
> 
> Only VFIO will call this new API.

Is it too restrictive? In theory vdpa may also implement a contract with
KVM and then wants to call this new API too?

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-06  7:09     ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  7:09 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Robin Murphy,
	Suravee Suthikulpanit, virtualization, Will Deacon
  Cc: Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 12:16 AM
> 
> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY
> and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> 
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will cause
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> the no-snoop bit was clear.
> 
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW capability.
> 
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
> 
> Only VFIO will call this new API.

Is it too restrictive? In theory vdpa may also implement a contract with
KVM and then wants to call this new API too?

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-06  7:09     ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  7:09 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Robin Murphy,
	Suravee Suthikulpanit, virtualization, Will Deacon
  Cc: Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 12:16 AM
> 
> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY
> and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> 
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will cause
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> the no-snoop bit was clear.
> 
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW capability.
> 
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
> 
> Only VFIO will call this new API.

Is it too restrictive? In theory vdpa may also implement a contract with
KVM and then wants to call this new API too?


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-06  7:09     ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-06  7:09 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Robin Murphy,
	Suravee Suthikulpanit, virtualization, Will Deacon
  Cc: Christoph Hellwig

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 6, 2022 12:16 AM
> 
> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY
> and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> 
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will cause
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> the no-snoop bit was clear.
> 
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW capability.
> 
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
> 
> Only VFIO will call this new API.

Is it too restrictive? In theory vdpa may also implement a contract with
KVM and then wants to call this new API too?


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06  5:30     ` Christoph Hellwig
  (?)
@ 2022-04-06 12:07       ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-06 12:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, linux-s390, kvm, linux-rdma, Gerald Schaefer,
	David Woodhouse, Tian, Kevin, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, Apr 06, 2022 at 07:30:39AM +0200, Christoph Hellwig wrote:
> On Tue, Apr 05, 2022 at 01:16:00PM -0300, Jason Gunthorpe wrote:
> > diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> > index 760b254ba42d6b..24d118198ac756 100644
> > +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> > @@ -42,6 +42,7 @@
> >  #include <linux/list.h>
> >  #include <linux/pci.h>
> >  #include <rdma/ib_verbs.h>
> > +#include <linux/dma-map-ops.h>
> >  
> >  #include "usnic_log.h"
> >  #include "usnic_uiom.h"
> > @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
> >  	struct usnic_uiom_dev *uiom_dev;
> >  	int err;
> >  
> > +	if (!dev_is_dma_coherent(dev)) {
> 
> Which part of the comment at the top of dma-map-ops.h is not clear
> enough to you?

Didn't see it

I'll move dev_is_dma_coherent to device.h along with
device_iommu_mapped() and others then

Thanks,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 12:07       ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 12:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 07:30:39AM +0200, Christoph Hellwig wrote:
> On Tue, Apr 05, 2022 at 01:16:00PM -0300, Jason Gunthorpe wrote:
> > diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> > index 760b254ba42d6b..24d118198ac756 100644
> > +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> > @@ -42,6 +42,7 @@
> >  #include <linux/list.h>
> >  #include <linux/pci.h>
> >  #include <rdma/ib_verbs.h>
> > +#include <linux/dma-map-ops.h>
> >  
> >  #include "usnic_log.h"
> >  #include "usnic_uiom.h"
> > @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
> >  	struct usnic_uiom_dev *uiom_dev;
> >  	int err;
> >  
> > +	if (!dev_is_dma_coherent(dev)) {
> 
> Which part of the comment at the top of dma-map-ops.h is not clear
> enough to you?

Didn't see it

I'll move dev_is_dma_coherent to device.h along with
device_iommu_mapped() and others then

Thanks,
Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 12:07       ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 12:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 07:30:39AM +0200, Christoph Hellwig wrote:
> On Tue, Apr 05, 2022 at 01:16:00PM -0300, Jason Gunthorpe wrote:
> > diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> > index 760b254ba42d6b..24d118198ac756 100644
> > +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> > @@ -42,6 +42,7 @@
> >  #include <linux/list.h>
> >  #include <linux/pci.h>
> >  #include <rdma/ib_verbs.h>
> > +#include <linux/dma-map-ops.h>
> >  
> >  #include "usnic_log.h"
> >  #include "usnic_uiom.h"
> > @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
> >  	struct usnic_uiom_dev *uiom_dev;
> >  	int err;
> >  
> > +	if (!dev_is_dma_coherent(dev)) {
> 
> Which part of the comment at the top of dma-map-ops.h is not clear
> enough to you?

Didn't see it

I'll move dev_is_dma_coherent to device.h along with
device_iommu_mapped() and others then

Thanks,
Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
  2022-04-06  7:09     ` Tian, Kevin
  (?)
@ 2022-04-06 12:27       ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-06 12:27 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, David Woodhouse, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, Apr 06, 2022 at 07:09:49AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 6, 2022 12:16 AM
> > 
> > This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY
> > and
> > IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> > 
> > Currently only Intel and AMD IOMMUs are known to support this
> > feature. They both implement it as an IOPTE bit, that when set, will cause
> > PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> > the no-snoop bit was clear.
> > 
> > The new API is triggered by calling enforce_cache_coherency() before
> > mapping any IOVA to the domain which globally switches on no-snoop
> > blocking. This allows other implementations that might block no-snoop
> > globally and outside the IOPTE - AMD also documents such an HW capability.
> > 
> > Leave AMD out of sync with Intel and have it block no-snoop even for
> > in-kernel users. This can be trivially resolved in a follow up patch.
> > 
> > Only VFIO will call this new API.
> 
> Is it too restrictive? In theory vdpa may also implement a contract with
> KVM and then wants to call this new API too?

I expect iommufd to handle this for all kernel users eventually.

Certainly vdpa should not be calling kvm functions.

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-06 12:27       ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 12:27 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Christoph Hellwig

On Wed, Apr 06, 2022 at 07:09:49AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 6, 2022 12:16 AM
> > 
> > This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY
> > and
> > IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> > 
> > Currently only Intel and AMD IOMMUs are known to support this
> > feature. They both implement it as an IOPTE bit, that when set, will cause
> > PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> > the no-snoop bit was clear.
> > 
> > The new API is triggered by calling enforce_cache_coherency() before
> > mapping any IOVA to the domain which globally switches on no-snoop
> > blocking. This allows other implementations that might block no-snoop
> > globally and outside the IOPTE - AMD also documents such an HW capability.
> > 
> > Leave AMD out of sync with Intel and have it block no-snoop even for
> > in-kernel users. This can be trivially resolved in a follow up patch.
> > 
> > Only VFIO will call this new API.
> 
> Is it too restrictive? In theory vdpa may also implement a contract with
> KVM and then wants to call this new API too?

I expect iommufd to handle this for all kernel users eventually.

Certainly vdpa should not be calling kvm functions.

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()
@ 2022-04-06 12:27       ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 12:27 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Christoph Hellwig

On Wed, Apr 06, 2022 at 07:09:49AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 6, 2022 12:16 AM
> > 
> > This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY
> > and
> > IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> > 
> > Currently only Intel and AMD IOMMUs are known to support this
> > feature. They both implement it as an IOPTE bit, that when set, will cause
> > PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> > the no-snoop bit was clear.
> > 
> > The new API is triggered by calling enforce_cache_coherency() before
> > mapping any IOVA to the domain which globally switches on no-snoop
> > blocking. This allows other implementations that might block no-snoop
> > globally and outside the IOPTE - AMD also documents such an HW capability.
> > 
> > Leave AMD out of sync with Intel and have it block no-snoop even for
> > in-kernel users. This can be trivially resolved in a follow up patch.
> > 
> > Only VFIO will call this new API.
> 
> Is it too restrictive? In theory vdpa may also implement a contract with
> KVM and then wants to call this new API too?

I expect iommufd to handle this for all kernel users eventually.

Certainly vdpa should not be calling kvm functions.

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 12:07       ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-06 13:51         ` Christoph Hellwig
  -1 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 13:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, David Woodhouse, Tian, Kevin, linux-arm-msm,
	Alex Williamson, linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, Apr 06, 2022 at 09:07:30AM -0300, Jason Gunthorpe wrote:
> Didn't see it
> 
> I'll move dev_is_dma_coherent to device.h along with
> device_iommu_mapped() and others then

No.  It it is internal for a reason.  It also doesn't actually work
outside of the dma core.  E.g. for non-swiotlb ARM configs it will
not actually work.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 13:51         ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 13:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, Rob Clark, Gerald Schaefer,
	David Woodhouse, linux-arm-msm, linux-arm-kernel, netdev,
	Cornelia Huck, iommu, Suravee Suthikulpanit, Christian Benvenuti,
	Robin Murphy, Lu Baolu

On Wed, Apr 06, 2022 at 09:07:30AM -0300, Jason Gunthorpe wrote:
> Didn't see it
> 
> I'll move dev_is_dma_coherent to device.h along with
> device_iommu_mapped() and others then

No.  It it is internal for a reason.  It also doesn't actually work
outside of the dma core.  E.g. for non-swiotlb ARM configs it will
not actually work.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 13:51         ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 13:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 09:07:30AM -0300, Jason Gunthorpe wrote:
> Didn't see it
> 
> I'll move dev_is_dma_coherent to device.h along with
> device_iommu_mapped() and others then

No.  It it is internal for a reason.  It also doesn't actually work
outside of the dma core.  E.g. for non-swiotlb ARM configs it will
not actually work.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 13:51         ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 13:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 09:07:30AM -0300, Jason Gunthorpe wrote:
> Didn't see it
> 
> I'll move dev_is_dma_coherent to device.h along with
> device_iommu_mapped() and others then

No.  It it is internal for a reason.  It also doesn't actually work
outside of the dma core.  E.g. for non-swiotlb ARM configs it will
not actually work.

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-05 16:16   ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-06 13:56     ` Robin Murphy
  -1 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-06 13:56 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Tian, Kevin, Christoph Hellwig

On 2022-04-05 17:16, Jason Gunthorpe wrote:
> vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
> way to do this is via dev_is_dma_coherent()

Not necessarily...

Disregarding the complete disaster of PCIe No Snoop on Arm-Based 
systems, there's the more interesting effectively-opposite scenario 
where an SMMU bridges non-coherent devices to a coherent interconnect. 
It's not something we take advantage of yet in Linux, and it can only be 
properly described in ACPI, but there do exist situations where 
IOMMU_CACHE is capable of making the device's traffic snoop, but 
dev_is_dma_coherent() - and device_get_dma_attr() for external users - 
would still say non-coherent because they can't assume that the SMMU is 
enabled and programmed in just the right way.

I've also not thought too much about how things might look with S2FWB 
thrown into the mix in future...

Robin.

> like the DMA API does. If
> IOMMU_CACHE is not supported then these drivers won't work as they don't
> call any coherency-restoring routines around their DMAs.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/infiniband/hw/usnic/usnic_uiom.c | 16 +++++++---------
>   drivers/vhost/vdpa.c                     |  3 ++-
>   2 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> index 760b254ba42d6b..24d118198ac756 100644
> --- a/drivers/infiniband/hw/usnic/usnic_uiom.c
> +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> @@ -42,6 +42,7 @@
>   #include <linux/list.h>
>   #include <linux/pci.h>
>   #include <rdma/ib_verbs.h>
> +#include <linux/dma-map-ops.h>
>   
>   #include "usnic_log.h"
>   #include "usnic_uiom.h"
> @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   	struct usnic_uiom_dev *uiom_dev;
>   	int err;
>   
> +	if (!dev_is_dma_coherent(dev)) {
> +		usnic_err("IOMMU of %s does not support cache coherency\n",
> +				dev_name(dev));
> +		return -EINVAL;
> +	}
> +
>   	uiom_dev = kzalloc(sizeof(*uiom_dev), GFP_ATOMIC);
>   	if (!uiom_dev)
>   		return -ENOMEM;
> @@ -483,13 +490,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   	if (err)
>   		goto out_free_dev;
>   
> -	if (!iommu_capable(dev->bus, IOMMU_CAP_CACHE_COHERENCY)) {
> -		usnic_err("IOMMU of %s does not support cache coherency\n",
> -				dev_name(dev));
> -		err = -EINVAL;
> -		goto out_detach_device;
> -	}
> -
>   	spin_lock(&pd->lock);
>   	list_add_tail(&uiom_dev->link, &pd->devs);
>   	pd->dev_cnt++;
> @@ -497,8 +497,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   
>   	return 0;
>   
> -out_detach_device:
> -	iommu_detach_device(pd->domain, dev);
>   out_free_dev:
>   	kfree(uiom_dev);
>   	return err;
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 4c2f0bd062856a..05ea5800febc37 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -22,6 +22,7 @@
>   #include <linux/vdpa.h>
>   #include <linux/nospec.h>
>   #include <linux/vhost.h>
> +#include <linux/dma-map-ops.h>
>   
>   #include "vhost.h"
>   
> @@ -929,7 +930,7 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
>   	if (!bus)
>   		return -EFAULT;
>   
> -	if (!iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
> +	if (!dev_is_dma_coherent(dma_dev))
>   		return -ENOTSUPP;
>   
>   	v->domain = iommu_domain_alloc(bus);
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 13:56     ` Robin Murphy
  0 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-06 13:56 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig

On 2022-04-05 17:16, Jason Gunthorpe wrote:
> vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
> way to do this is via dev_is_dma_coherent()

Not necessarily...

Disregarding the complete disaster of PCIe No Snoop on Arm-Based 
systems, there's the more interesting effectively-opposite scenario 
where an SMMU bridges non-coherent devices to a coherent interconnect. 
It's not something we take advantage of yet in Linux, and it can only be 
properly described in ACPI, but there do exist situations where 
IOMMU_CACHE is capable of making the device's traffic snoop, but 
dev_is_dma_coherent() - and device_get_dma_attr() for external users - 
would still say non-coherent because they can't assume that the SMMU is 
enabled and programmed in just the right way.

I've also not thought too much about how things might look with S2FWB 
thrown into the mix in future...

Robin.

> like the DMA API does. If
> IOMMU_CACHE is not supported then these drivers won't work as they don't
> call any coherency-restoring routines around their DMAs.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/infiniband/hw/usnic/usnic_uiom.c | 16 +++++++---------
>   drivers/vhost/vdpa.c                     |  3 ++-
>   2 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> index 760b254ba42d6b..24d118198ac756 100644
> --- a/drivers/infiniband/hw/usnic/usnic_uiom.c
> +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> @@ -42,6 +42,7 @@
>   #include <linux/list.h>
>   #include <linux/pci.h>
>   #include <rdma/ib_verbs.h>
> +#include <linux/dma-map-ops.h>
>   
>   #include "usnic_log.h"
>   #include "usnic_uiom.h"
> @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   	struct usnic_uiom_dev *uiom_dev;
>   	int err;
>   
> +	if (!dev_is_dma_coherent(dev)) {
> +		usnic_err("IOMMU of %s does not support cache coherency\n",
> +				dev_name(dev));
> +		return -EINVAL;
> +	}
> +
>   	uiom_dev = kzalloc(sizeof(*uiom_dev), GFP_ATOMIC);
>   	if (!uiom_dev)
>   		return -ENOMEM;
> @@ -483,13 +490,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   	if (err)
>   		goto out_free_dev;
>   
> -	if (!iommu_capable(dev->bus, IOMMU_CAP_CACHE_COHERENCY)) {
> -		usnic_err("IOMMU of %s does not support cache coherency\n",
> -				dev_name(dev));
> -		err = -EINVAL;
> -		goto out_detach_device;
> -	}
> -
>   	spin_lock(&pd->lock);
>   	list_add_tail(&uiom_dev->link, &pd->devs);
>   	pd->dev_cnt++;
> @@ -497,8 +497,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   
>   	return 0;
>   
> -out_detach_device:
> -	iommu_detach_device(pd->domain, dev);
>   out_free_dev:
>   	kfree(uiom_dev);
>   	return err;
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 4c2f0bd062856a..05ea5800febc37 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -22,6 +22,7 @@
>   #include <linux/vdpa.h>
>   #include <linux/nospec.h>
>   #include <linux/vhost.h>
> +#include <linux/dma-map-ops.h>
>   
>   #include "vhost.h"
>   
> @@ -929,7 +930,7 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
>   	if (!bus)
>   		return -EFAULT;
>   
> -	if (!iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
> +	if (!dev_is_dma_coherent(dma_dev))
>   		return -ENOTSUPP;
>   
>   	v->domain = iommu_domain_alloc(bus);
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 13:56     ` Robin Murphy
  0 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-06 13:56 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

On 2022-04-05 17:16, Jason Gunthorpe wrote:
> vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
> way to do this is via dev_is_dma_coherent()

Not necessarily...

Disregarding the complete disaster of PCIe No Snoop on Arm-Based 
systems, there's the more interesting effectively-opposite scenario 
where an SMMU bridges non-coherent devices to a coherent interconnect. 
It's not something we take advantage of yet in Linux, and it can only be 
properly described in ACPI, but there do exist situations where 
IOMMU_CACHE is capable of making the device's traffic snoop, but 
dev_is_dma_coherent() - and device_get_dma_attr() for external users - 
would still say non-coherent because they can't assume that the SMMU is 
enabled and programmed in just the right way.

I've also not thought too much about how things might look with S2FWB 
thrown into the mix in future...

Robin.

> like the DMA API does. If
> IOMMU_CACHE is not supported then these drivers won't work as they don't
> call any coherency-restoring routines around their DMAs.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/infiniband/hw/usnic/usnic_uiom.c | 16 +++++++---------
>   drivers/vhost/vdpa.c                     |  3 ++-
>   2 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> index 760b254ba42d6b..24d118198ac756 100644
> --- a/drivers/infiniband/hw/usnic/usnic_uiom.c
> +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> @@ -42,6 +42,7 @@
>   #include <linux/list.h>
>   #include <linux/pci.h>
>   #include <rdma/ib_verbs.h>
> +#include <linux/dma-map-ops.h>
>   
>   #include "usnic_log.h"
>   #include "usnic_uiom.h"
> @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   	struct usnic_uiom_dev *uiom_dev;
>   	int err;
>   
> +	if (!dev_is_dma_coherent(dev)) {
> +		usnic_err("IOMMU of %s does not support cache coherency\n",
> +				dev_name(dev));
> +		return -EINVAL;
> +	}
> +
>   	uiom_dev = kzalloc(sizeof(*uiom_dev), GFP_ATOMIC);
>   	if (!uiom_dev)
>   		return -ENOMEM;
> @@ -483,13 +490,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   	if (err)
>   		goto out_free_dev;
>   
> -	if (!iommu_capable(dev->bus, IOMMU_CAP_CACHE_COHERENCY)) {
> -		usnic_err("IOMMU of %s does not support cache coherency\n",
> -				dev_name(dev));
> -		err = -EINVAL;
> -		goto out_detach_device;
> -	}
> -
>   	spin_lock(&pd->lock);
>   	list_add_tail(&uiom_dev->link, &pd->devs);
>   	pd->dev_cnt++;
> @@ -497,8 +497,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   
>   	return 0;
>   
> -out_detach_device:
> -	iommu_detach_device(pd->domain, dev);
>   out_free_dev:
>   	kfree(uiom_dev);
>   	return err;
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 4c2f0bd062856a..05ea5800febc37 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -22,6 +22,7 @@
>   #include <linux/vdpa.h>
>   #include <linux/nospec.h>
>   #include <linux/vhost.h>
> +#include <linux/dma-map-ops.h>
>   
>   #include "vhost.h"
>   
> @@ -929,7 +930,7 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
>   	if (!bus)
>   		return -EFAULT;
>   
> -	if (!iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
> +	if (!dev_is_dma_coherent(dma_dev))
>   		return -ENOTSUPP;
>   
>   	v->domain = iommu_domain_alloc(bus);

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 13:56     ` Robin Murphy
  0 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-06 13:56 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon
  Cc: Christoph Hellwig, Tian, Kevin

On 2022-04-05 17:16, Jason Gunthorpe wrote:
> vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
> way to do this is via dev_is_dma_coherent()

Not necessarily...

Disregarding the complete disaster of PCIe No Snoop on Arm-Based 
systems, there's the more interesting effectively-opposite scenario 
where an SMMU bridges non-coherent devices to a coherent interconnect. 
It's not something we take advantage of yet in Linux, and it can only be 
properly described in ACPI, but there do exist situations where 
IOMMU_CACHE is capable of making the device's traffic snoop, but 
dev_is_dma_coherent() - and device_get_dma_attr() for external users - 
would still say non-coherent because they can't assume that the SMMU is 
enabled and programmed in just the right way.

I've also not thought too much about how things might look with S2FWB 
thrown into the mix in future...

Robin.

> like the DMA API does. If
> IOMMU_CACHE is not supported then these drivers won't work as they don't
> call any coherency-restoring routines around their DMAs.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/infiniband/hw/usnic/usnic_uiom.c | 16 +++++++---------
>   drivers/vhost/vdpa.c                     |  3 ++-
>   2 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> index 760b254ba42d6b..24d118198ac756 100644
> --- a/drivers/infiniband/hw/usnic/usnic_uiom.c
> +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> @@ -42,6 +42,7 @@
>   #include <linux/list.h>
>   #include <linux/pci.h>
>   #include <rdma/ib_verbs.h>
> +#include <linux/dma-map-ops.h>
>   
>   #include "usnic_log.h"
>   #include "usnic_uiom.h"
> @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   	struct usnic_uiom_dev *uiom_dev;
>   	int err;
>   
> +	if (!dev_is_dma_coherent(dev)) {
> +		usnic_err("IOMMU of %s does not support cache coherency\n",
> +				dev_name(dev));
> +		return -EINVAL;
> +	}
> +
>   	uiom_dev = kzalloc(sizeof(*uiom_dev), GFP_ATOMIC);
>   	if (!uiom_dev)
>   		return -ENOMEM;
> @@ -483,13 +490,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   	if (err)
>   		goto out_free_dev;
>   
> -	if (!iommu_capable(dev->bus, IOMMU_CAP_CACHE_COHERENCY)) {
> -		usnic_err("IOMMU of %s does not support cache coherency\n",
> -				dev_name(dev));
> -		err = -EINVAL;
> -		goto out_detach_device;
> -	}
> -
>   	spin_lock(&pd->lock);
>   	list_add_tail(&uiom_dev->link, &pd->devs);
>   	pd->dev_cnt++;
> @@ -497,8 +497,6 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd *pd, struct device *dev)
>   
>   	return 0;
>   
> -out_detach_device:
> -	iommu_detach_device(pd->domain, dev);
>   out_free_dev:
>   	kfree(uiom_dev);
>   	return err;
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 4c2f0bd062856a..05ea5800febc37 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -22,6 +22,7 @@
>   #include <linux/vdpa.h>
>   #include <linux/nospec.h>
>   #include <linux/vhost.h>
> +#include <linux/dma-map-ops.h>
>   
>   #include "vhost.h"
>   
> @@ -929,7 +930,7 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
>   	if (!bus)
>   		return -EFAULT;
>   
> -	if (!iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
> +	if (!dev_is_dma_coherent(dma_dev))
>   		return -ENOTSUPP;
>   
>   	v->domain = iommu_domain_alloc(bus);

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 13:51         ` Christoph Hellwig
  (?)
@ 2022-04-06 14:14           ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-06 14:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, linux-s390, kvm, linux-rdma, Gerald Schaefer,
	David Woodhouse, Tian, Kevin, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, Apr 06, 2022 at 03:51:50PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 09:07:30AM -0300, Jason Gunthorpe wrote:
> > Didn't see it
> > 
> > I'll move dev_is_dma_coherent to device.h along with
> > device_iommu_mapped() and others then
> 
> No.  It it is internal for a reason.  It also doesn't actually work
> outside of the dma core.  E.g. for non-swiotlb ARM configs it will
> not actually work.

Really? It is the only condition that dma_info_to_prot() tests to
decide of IOMMU_CACHE is used or not, so you are saying that there is
a condition where a device can be attached to an iommu_domain and
dev_is_dma_coherent() returns the wrong information? How does
dma-iommu.c safely use it then?

In any case I still need to do something about the places checking
IOMMU_CAP_CACHE_COHERENCY and thinking that means IOMMU_CACHE
works. Any idea?

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 14:14           ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 14:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 03:51:50PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 09:07:30AM -0300, Jason Gunthorpe wrote:
> > Didn't see it
> > 
> > I'll move dev_is_dma_coherent to device.h along with
> > device_iommu_mapped() and others then
> 
> No.  It it is internal for a reason.  It also doesn't actually work
> outside of the dma core.  E.g. for non-swiotlb ARM configs it will
> not actually work.

Really? It is the only condition that dma_info_to_prot() tests to
decide of IOMMU_CACHE is used or not, so you are saying that there is
a condition where a device can be attached to an iommu_domain and
dev_is_dma_coherent() returns the wrong information? How does
dma-iommu.c safely use it then?

In any case I still need to do something about the places checking
IOMMU_CAP_CACHE_COHERENCY and thinking that means IOMMU_CACHE
works. Any idea?

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 14:14           ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 14:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 03:51:50PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 09:07:30AM -0300, Jason Gunthorpe wrote:
> > Didn't see it
> > 
> > I'll move dev_is_dma_coherent to device.h along with
> > device_iommu_mapped() and others then
> 
> No.  It it is internal for a reason.  It also doesn't actually work
> outside of the dma core.  E.g. for non-swiotlb ARM configs it will
> not actually work.

Really? It is the only condition that dma_info_to_prot() tests to
decide of IOMMU_CACHE is used or not, so you are saying that there is
a condition where a device can be attached to an iommu_domain and
dev_is_dma_coherent() returns the wrong information? How does
dma-iommu.c safely use it then?

In any case I still need to do something about the places checking
IOMMU_CAP_CACHE_COHERENCY and thinking that means IOMMU_CACHE
works. Any idea?

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 13:56     ` Robin Murphy
  (?)
@ 2022-04-06 14:24       ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-06 14:24 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, Tian, Kevin, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, David Woodhouse

On Wed, Apr 06, 2022 at 02:56:56PM +0100, Robin Murphy wrote:
> On 2022-04-05 17:16, Jason Gunthorpe wrote:
> > vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
> > way to do this is via dev_is_dma_coherent()
> 
> Not necessarily...
> 
> Disregarding the complete disaster of PCIe No Snoop on Arm-Based systems,
> there's the more interesting effectively-opposite scenario where an SMMU
> bridges non-coherent devices to a coherent interconnect. It's not something
> we take advantage of yet in Linux, and it can only be properly described in
> ACPI, but there do exist situations where IOMMU_CACHE is capable of making
> the device's traffic snoop, but dev_is_dma_coherent() - and
> device_get_dma_attr() for external users - would still say non-coherent
> because they can't assume that the SMMU is enabled and programmed in just
> the right way.

Oh, I didn't know about device_get_dma_attr()..

Considering your future issue, maybe this:

/*
 * true if the given domain supports IOMMU_CACHE and when dev is attached to
 * that domain it will have coherent DMA and require no cache
 * maintenance when IOMMU_CACHE is used.
 */
bool iommu_domain_supports_coherent_dma(struct iommu_domain *domain, struct device *dev)
{
	return device_get_dma_attr(dev) == DEV_DMA_COHERENT;
}

? In future it could become a domain op and the SMMU driver could
figure out the situation you described?

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 14:24       ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 14:24 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig, Tian, Kevin

On Wed, Apr 06, 2022 at 02:56:56PM +0100, Robin Murphy wrote:
> On 2022-04-05 17:16, Jason Gunthorpe wrote:
> > vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
> > way to do this is via dev_is_dma_coherent()
> 
> Not necessarily...
> 
> Disregarding the complete disaster of PCIe No Snoop on Arm-Based systems,
> there's the more interesting effectively-opposite scenario where an SMMU
> bridges non-coherent devices to a coherent interconnect. It's not something
> we take advantage of yet in Linux, and it can only be properly described in
> ACPI, but there do exist situations where IOMMU_CACHE is capable of making
> the device's traffic snoop, but dev_is_dma_coherent() - and
> device_get_dma_attr() for external users - would still say non-coherent
> because they can't assume that the SMMU is enabled and programmed in just
> the right way.

Oh, I didn't know about device_get_dma_attr()..

Considering your future issue, maybe this:

/*
 * true if the given domain supports IOMMU_CACHE and when dev is attached to
 * that domain it will have coherent DMA and require no cache
 * maintenance when IOMMU_CACHE is used.
 */
bool iommu_domain_supports_coherent_dma(struct iommu_domain *domain, struct device *dev)
{
	return device_get_dma_attr(dev) == DEV_DMA_COHERENT;
}

? In future it could become a domain op and the SMMU driver could
figure out the situation you described?

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 14:24       ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 14:24 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig, Tian, Kevin

On Wed, Apr 06, 2022 at 02:56:56PM +0100, Robin Murphy wrote:
> On 2022-04-05 17:16, Jason Gunthorpe wrote:
> > vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
> > way to do this is via dev_is_dma_coherent()
> 
> Not necessarily...
> 
> Disregarding the complete disaster of PCIe No Snoop on Arm-Based systems,
> there's the more interesting effectively-opposite scenario where an SMMU
> bridges non-coherent devices to a coherent interconnect. It's not something
> we take advantage of yet in Linux, and it can only be properly described in
> ACPI, but there do exist situations where IOMMU_CACHE is capable of making
> the device's traffic snoop, but dev_is_dma_coherent() - and
> device_get_dma_attr() for external users - would still say non-coherent
> because they can't assume that the SMMU is enabled and programmed in just
> the right way.

Oh, I didn't know about device_get_dma_attr()..

Considering your future issue, maybe this:

/*
 * true if the given domain supports IOMMU_CACHE and when dev is attached to
 * that domain it will have coherent DMA and require no cache
 * maintenance when IOMMU_CACHE is used.
 */
bool iommu_domain_supports_coherent_dma(struct iommu_domain *domain, struct device *dev)
{
	return device_get_dma_attr(dev) == DEV_DMA_COHERENT;
}

? In future it could become a domain op and the SMMU driver could
figure out the situation you described?

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 14:24       ` Jason Gunthorpe
  (?)
@ 2022-04-06 15:18         ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-06 15:18 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, Tian, Kevin, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, David Woodhouse

On Wed, Apr 06, 2022 at 11:24:32AM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 02:56:56PM +0100, Robin Murphy wrote:
> > On 2022-04-05 17:16, Jason Gunthorpe wrote:
> > > vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
> > > way to do this is via dev_is_dma_coherent()
> > 
> > Not necessarily...
> > 
> > Disregarding the complete disaster of PCIe No Snoop on Arm-Based systems,
> > there's the more interesting effectively-opposite scenario where an SMMU
> > bridges non-coherent devices to a coherent interconnect. It's not something
> > we take advantage of yet in Linux, and it can only be properly described in
> > ACPI, but there do exist situations where IOMMU_CACHE is capable of making
> > the device's traffic snoop, but dev_is_dma_coherent() - and
> > device_get_dma_attr() for external users - would still say non-coherent
> > because they can't assume that the SMMU is enabled and programmed in just
> > the right way.
> 
> Oh, I didn't know about device_get_dma_attr()..
> 
> Considering your future issue, maybe this:
> 
> /*
>  * true if the given domain supports IOMMU_CACHE and when dev is attached to
>  * that domain it will have coherent DMA and require no cache
>  * maintenance when IOMMU_CACHE is used.
>  */
> bool iommu_domain_supports_coherent_dma(struct iommu_domain *domain, struct device *dev)
> {
> 	return device_get_dma_attr(dev) == DEV_DMA_COHERENT;
> }
> 
> ? In future it could become a domain op and the SMMU driver could
> figure out the situation you described?

I also spent some time looking at something like this:

struct iommu_domain *iommu_domain_alloc_coherent(struct device *device)
{
	if (device_get_dma_attr(device) == DEV_DMA_COHERENT)
		return NULL;
	return __iommu_domain_alloc(device->bus, IOMMU_DOMAIN_UNMANAGED);
}
EXPORT_SYMBOL_GPL(iommu_domain_alloc_coherent);

Which could evolve into to passing the flag down to the iommu driver
and then it could ensure SMMU is "programmed in just the right way"
or fail?

Could also go like this:

#define IOMMU_DOMAIN_FLAG_COHERENT 1
struct iommu_domain *iommu_device_alloc_domain(struct device *device,
                                               unsigned int flags)

A new alloc option is not so easy to fit VFIO into right now though.

Advices?

Thanks,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:18         ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 15:18 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig, Tian, Kevin

On Wed, Apr 06, 2022 at 11:24:32AM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 02:56:56PM +0100, Robin Murphy wrote:
> > On 2022-04-05 17:16, Jason Gunthorpe wrote:
> > > vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
> > > way to do this is via dev_is_dma_coherent()
> > 
> > Not necessarily...
> > 
> > Disregarding the complete disaster of PCIe No Snoop on Arm-Based systems,
> > there's the more interesting effectively-opposite scenario where an SMMU
> > bridges non-coherent devices to a coherent interconnect. It's not something
> > we take advantage of yet in Linux, and it can only be properly described in
> > ACPI, but there do exist situations where IOMMU_CACHE is capable of making
> > the device's traffic snoop, but dev_is_dma_coherent() - and
> > device_get_dma_attr() for external users - would still say non-coherent
> > because they can't assume that the SMMU is enabled and programmed in just
> > the right way.
> 
> Oh, I didn't know about device_get_dma_attr()..
> 
> Considering your future issue, maybe this:
> 
> /*
>  * true if the given domain supports IOMMU_CACHE and when dev is attached to
>  * that domain it will have coherent DMA and require no cache
>  * maintenance when IOMMU_CACHE is used.
>  */
> bool iommu_domain_supports_coherent_dma(struct iommu_domain *domain, struct device *dev)
> {
> 	return device_get_dma_attr(dev) == DEV_DMA_COHERENT;
> }
> 
> ? In future it could become a domain op and the SMMU driver could
> figure out the situation you described?

I also spent some time looking at something like this:

struct iommu_domain *iommu_domain_alloc_coherent(struct device *device)
{
	if (device_get_dma_attr(device) == DEV_DMA_COHERENT)
		return NULL;
	return __iommu_domain_alloc(device->bus, IOMMU_DOMAIN_UNMANAGED);
}
EXPORT_SYMBOL_GPL(iommu_domain_alloc_coherent);

Which could evolve into to passing the flag down to the iommu driver
and then it could ensure SMMU is "programmed in just the right way"
or fail?

Could also go like this:

#define IOMMU_DOMAIN_FLAG_COHERENT 1
struct iommu_domain *iommu_device_alloc_domain(struct device *device,
                                               unsigned int flags)

A new alloc option is not so easy to fit VFIO into right now though.

Advices?

Thanks,
Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:18         ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 15:18 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Suravee Suthikulpanit, virtualization,
	Will Deacon, Christoph Hellwig, Tian, Kevin

On Wed, Apr 06, 2022 at 11:24:32AM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 02:56:56PM +0100, Robin Murphy wrote:
> > On 2022-04-05 17:16, Jason Gunthorpe wrote:
> > > vdpa and usnic are trying to test if IOMMU_CACHE is supported. The correct
> > > way to do this is via dev_is_dma_coherent()
> > 
> > Not necessarily...
> > 
> > Disregarding the complete disaster of PCIe No Snoop on Arm-Based systems,
> > there's the more interesting effectively-opposite scenario where an SMMU
> > bridges non-coherent devices to a coherent interconnect. It's not something
> > we take advantage of yet in Linux, and it can only be properly described in
> > ACPI, but there do exist situations where IOMMU_CACHE is capable of making
> > the device's traffic snoop, but dev_is_dma_coherent() - and
> > device_get_dma_attr() for external users - would still say non-coherent
> > because they can't assume that the SMMU is enabled and programmed in just
> > the right way.
> 
> Oh, I didn't know about device_get_dma_attr()..
> 
> Considering your future issue, maybe this:
> 
> /*
>  * true if the given domain supports IOMMU_CACHE and when dev is attached to
>  * that domain it will have coherent DMA and require no cache
>  * maintenance when IOMMU_CACHE is used.
>  */
> bool iommu_domain_supports_coherent_dma(struct iommu_domain *domain, struct device *dev)
> {
> 	return device_get_dma_attr(dev) == DEV_DMA_COHERENT;
> }
> 
> ? In future it could become a domain op and the SMMU driver could
> figure out the situation you described?

I also spent some time looking at something like this:

struct iommu_domain *iommu_domain_alloc_coherent(struct device *device)
{
	if (device_get_dma_attr(device) == DEV_DMA_COHERENT)
		return NULL;
	return __iommu_domain_alloc(device->bus, IOMMU_DOMAIN_UNMANAGED);
}
EXPORT_SYMBOL_GPL(iommu_domain_alloc_coherent);

Which could evolve into to passing the flag down to the iommu driver
and then it could ensure SMMU is "programmed in just the right way"
or fail?

Could also go like this:

#define IOMMU_DOMAIN_FLAG_COHERENT 1
struct iommu_domain *iommu_device_alloc_domain(struct device *device,
                                               unsigned int flags)

A new alloc option is not so easy to fit VFIO into right now though.

Advices?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 14:14           ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-06 15:47             ` Christoph Hellwig
  -1 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 15:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, David Woodhouse, Tian, Kevin, linux-arm-msm,
	Alex Williamson, linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, Apr 06, 2022 at 11:14:46AM -0300, Jason Gunthorpe wrote:
> Really? It is the only condition that dma_info_to_prot() tests to
> decide of IOMMU_CACHE is used or not, so you are saying that there is
> a condition where a device can be attached to an iommu_domain and
> dev_is_dma_coherent() returns the wrong information? How does
> dma-iommu.c safely use it then?

arm does not use dma-iommu.c
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:47             ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 15:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, Rob Clark, Gerald Schaefer,
	David Woodhouse, linux-arm-msm, linux-arm-kernel, netdev,
	Cornelia Huck, iommu, Suravee Suthikulpanit, Christian Benvenuti,
	Robin Murphy, Lu Baolu

On Wed, Apr 06, 2022 at 11:14:46AM -0300, Jason Gunthorpe wrote:
> Really? It is the only condition that dma_info_to_prot() tests to
> decide of IOMMU_CACHE is used or not, so you are saying that there is
> a condition where a device can be attached to an iommu_domain and
> dev_is_dma_coherent() returns the wrong information? How does
> dma-iommu.c safely use it then?

arm does not use dma-iommu.c
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:47             ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 15:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 11:14:46AM -0300, Jason Gunthorpe wrote:
> Really? It is the only condition that dma_info_to_prot() tests to
> decide of IOMMU_CACHE is used or not, so you are saying that there is
> a condition where a device can be attached to an iommu_domain and
> dev_is_dma_coherent() returns the wrong information? How does
> dma-iommu.c safely use it then?

arm does not use dma-iommu.c

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:47             ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 15:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Robin Murphy, Suravee Suthikulpanit, virtualization,
	Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 11:14:46AM -0300, Jason Gunthorpe wrote:
> Really? It is the only condition that dma_info_to_prot() tests to
> decide of IOMMU_CACHE is used or not, so you are saying that there is
> a condition where a device can be attached to an iommu_domain and
> dev_is_dma_coherent() returns the wrong information? How does
> dma-iommu.c safely use it then?

arm does not use dma-iommu.c

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 14:14           ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-06 15:48             ` Robin Murphy
  -1 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-06 15:48 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, linux-s390, kvm, linux-rdma, Gerald Schaefer, Tian,
	Kevin, linux-arm-msm, Alex Williamson, linux-arm-kernel, netdev,
	Cornelia Huck, iommu, Christian Benvenuti, David Woodhouse

On 2022-04-06 15:14, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 03:51:50PM +0200, Christoph Hellwig wrote:
>> On Wed, Apr 06, 2022 at 09:07:30AM -0300, Jason Gunthorpe wrote:
>>> Didn't see it
>>>
>>> I'll move dev_is_dma_coherent to device.h along with
>>> device_iommu_mapped() and others then
>>
>> No.  It it is internal for a reason.  It also doesn't actually work
>> outside of the dma core.  E.g. for non-swiotlb ARM configs it will
>> not actually work.
> 
> Really? It is the only condition that dma_info_to_prot() tests to
> decide of IOMMU_CACHE is used or not, so you are saying that there is
> a condition where a device can be attached to an iommu_domain and
> dev_is_dma_coherent() returns the wrong information? How does
> dma-iommu.c safely use it then?

The common iommu-dma layer happens to be part of the subset of the DMA 
core which *does* play the dev->dma_coherent game. 32-bit Arm has its 
own IOMMU DMA ops which do not. I don't know if the set of PowerPCs with 
CONFIG_NOT_COHERENT_CACHE intersects the set of PowerPCs that can do 
VFIO, but that would be another example if so.

> In any case I still need to do something about the places checking
> IOMMU_CAP_CACHE_COHERENCY and thinking that means IOMMU_CACHE
> works. Any idea?

Can we improve the IOMMU drivers such that that *can* be the case 
(within a reasonable margin of error)? That's kind of where I was hoping 
to head with device_iommu_capable(), e.g. [1].

Robin.

[1] 
https://gitlab.arm.com/linux-arm/linux-rm/-/commit/53390e9505b3791adedc0974e251e5c7360e402e
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:48             ` Robin Murphy
  0 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-06 15:48 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, linux-s390, kvm, linux-rdma,
	Joerg Roedel, Rob Clark, Gerald Schaefer, linux-arm-msm,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Suravee Suthikulpanit, Christian Benvenuti, David Woodhouse,
	Lu Baolu

On 2022-04-06 15:14, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 03:51:50PM +0200, Christoph Hellwig wrote:
>> On Wed, Apr 06, 2022 at 09:07:30AM -0300, Jason Gunthorpe wrote:
>>> Didn't see it
>>>
>>> I'll move dev_is_dma_coherent to device.h along with
>>> device_iommu_mapped() and others then
>>
>> No.  It it is internal for a reason.  It also doesn't actually work
>> outside of the dma core.  E.g. for non-swiotlb ARM configs it will
>> not actually work.
> 
> Really? It is the only condition that dma_info_to_prot() tests to
> decide of IOMMU_CACHE is used or not, so you are saying that there is
> a condition where a device can be attached to an iommu_domain and
> dev_is_dma_coherent() returns the wrong information? How does
> dma-iommu.c safely use it then?

The common iommu-dma layer happens to be part of the subset of the DMA 
core which *does* play the dev->dma_coherent game. 32-bit Arm has its 
own IOMMU DMA ops which do not. I don't know if the set of PowerPCs with 
CONFIG_NOT_COHERENT_CACHE intersects the set of PowerPCs that can do 
VFIO, but that would be another example if so.

> In any case I still need to do something about the places checking
> IOMMU_CAP_CACHE_COHERENCY and thinking that means IOMMU_CACHE
> works. Any idea?

Can we improve the IOMMU drivers such that that *can* be the case 
(within a reasonable margin of error)? That's kind of where I was hoping 
to head with device_iommu_capable(), e.g. [1].

Robin.

[1] 
https://gitlab.arm.com/linux-arm/linux-rm/-/commit/53390e9505b3791adedc0974e251e5c7360e402e
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:48             ` Robin Murphy
  0 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-06 15:48 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Suravee Suthikulpanit, virtualization,
	Will Deacon, Tian, Kevin

On 2022-04-06 15:14, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 03:51:50PM +0200, Christoph Hellwig wrote:
>> On Wed, Apr 06, 2022 at 09:07:30AM -0300, Jason Gunthorpe wrote:
>>> Didn't see it
>>>
>>> I'll move dev_is_dma_coherent to device.h along with
>>> device_iommu_mapped() and others then
>>
>> No.  It it is internal for a reason.  It also doesn't actually work
>> outside of the dma core.  E.g. for non-swiotlb ARM configs it will
>> not actually work.
> 
> Really? It is the only condition that dma_info_to_prot() tests to
> decide of IOMMU_CACHE is used or not, so you are saying that there is
> a condition where a device can be attached to an iommu_domain and
> dev_is_dma_coherent() returns the wrong information? How does
> dma-iommu.c safely use it then?

The common iommu-dma layer happens to be part of the subset of the DMA 
core which *does* play the dev->dma_coherent game. 32-bit Arm has its 
own IOMMU DMA ops which do not. I don't know if the set of PowerPCs with 
CONFIG_NOT_COHERENT_CACHE intersects the set of PowerPCs that can do 
VFIO, but that would be another example if so.

> In any case I still need to do something about the places checking
> IOMMU_CAP_CACHE_COHERENCY and thinking that means IOMMU_CACHE
> works. Any idea?

Can we improve the IOMMU drivers such that that *can* be the case 
(within a reasonable margin of error)? That's kind of where I was hoping 
to head with device_iommu_capable(), e.g. [1].

Robin.

[1] 
https://gitlab.arm.com/linux-arm/linux-rm/-/commit/53390e9505b3791adedc0974e251e5c7360e402e

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:48             ` Robin Murphy
  0 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-06 15:48 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Suravee Suthikulpanit, virtualization,
	Will Deacon, Tian, Kevin

On 2022-04-06 15:14, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 03:51:50PM +0200, Christoph Hellwig wrote:
>> On Wed, Apr 06, 2022 at 09:07:30AM -0300, Jason Gunthorpe wrote:
>>> Didn't see it
>>>
>>> I'll move dev_is_dma_coherent to device.h along with
>>> device_iommu_mapped() and others then
>>
>> No.  It it is internal for a reason.  It also doesn't actually work
>> outside of the dma core.  E.g. for non-swiotlb ARM configs it will
>> not actually work.
> 
> Really? It is the only condition that dma_info_to_prot() tests to
> decide of IOMMU_CACHE is used or not, so you are saying that there is
> a condition where a device can be attached to an iommu_domain and
> dev_is_dma_coherent() returns the wrong information? How does
> dma-iommu.c safely use it then?

The common iommu-dma layer happens to be part of the subset of the DMA 
core which *does* play the dev->dma_coherent game. 32-bit Arm has its 
own IOMMU DMA ops which do not. I don't know if the set of PowerPCs with 
CONFIG_NOT_COHERENT_CACHE intersects the set of PowerPCs that can do 
VFIO, but that would be another example if so.

> In any case I still need to do something about the places checking
> IOMMU_CAP_CACHE_COHERENCY and thinking that means IOMMU_CACHE
> works. Any idea?

Can we improve the IOMMU drivers such that that *can* be the case 
(within a reasonable margin of error)? That's kind of where I was hoping 
to head with device_iommu_capable(), e.g. [1].

Robin.

[1] 
https://gitlab.arm.com/linux-arm/linux-rm/-/commit/53390e9505b3791adedc0974e251e5c7360e402e

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 15:47             ` Christoph Hellwig
  (?)
@ 2022-04-06 15:48               ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-06 15:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, linux-s390, kvm, linux-rdma, Gerald Schaefer,
	David Woodhouse, Tian, Kevin, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, Apr 06, 2022 at 05:47:23PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 11:14:46AM -0300, Jason Gunthorpe wrote:
> > Really? It is the only condition that dma_info_to_prot() tests to
> > decide of IOMMU_CACHE is used or not, so you are saying that there is
> > a condition where a device can be attached to an iommu_domain and
> > dev_is_dma_coherent() returns the wrong information? How does
> > dma-iommu.c safely use it then?
> 
> arm does not use dma-iommu.c

Oh. I did not know that. Thanks

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:48               ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 15:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 05:47:23PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 11:14:46AM -0300, Jason Gunthorpe wrote:
> > Really? It is the only condition that dma_info_to_prot() tests to
> > decide of IOMMU_CACHE is used or not, so you are saying that there is
> > a condition where a device can be attached to an iommu_domain and
> > dev_is_dma_coherent() returns the wrong information? How does
> > dma-iommu.c safely use it then?
> 
> arm does not use dma-iommu.c

Oh. I did not know that. Thanks

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:48               ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 15:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 05:47:23PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 11:14:46AM -0300, Jason Gunthorpe wrote:
> > Really? It is the only condition that dma_info_to_prot() tests to
> > decide of IOMMU_CACHE is used or not, so you are saying that there is
> > a condition where a device can be attached to an iommu_domain and
> > dev_is_dma_coherent() returns the wrong information? How does
> > dma-iommu.c safely use it then?
> 
> arm does not use dma-iommu.c

Oh. I did not know that. Thanks

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 15:18         ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-06 15:50           ` Christoph Hellwig
  -1 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 15:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, David Woodhouse, Tian, Kevin, linux-arm-msm,
	Alex Williamson, linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > Oh, I didn't know about device_get_dma_attr()..

Which is completely broken for any non-OF, non-ACPI plaform.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:50           ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 15:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, Rob Clark, Gerald Schaefer,
	David Woodhouse, linux-arm-msm, linux-arm-kernel, netdev,
	Cornelia Huck, iommu, Suravee Suthikulpanit, Christian Benvenuti,
	Robin Murphy, Lu Baolu

On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > Oh, I didn't know about device_get_dma_attr()..

Which is completely broken for any non-OF, non-ACPI plaform.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:50           ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 15:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Robin Murphy, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon, Christoph Hellwig, Tian, Kevin

On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > Oh, I didn't know about device_get_dma_attr()..

Which is completely broken for any non-OF, non-ACPI plaform.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 15:50           ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 15:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Robin Murphy, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon, Christoph Hellwig, Tian, Kevin

On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > Oh, I didn't know about device_get_dma_attr()..

Which is completely broken for any non-OF, non-ACPI plaform.

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 15:50           ` Christoph Hellwig
  (?)
@ 2022-04-06 16:06             ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-06 16:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, linux-s390, kvm, linux-rdma, Gerald Schaefer,
	David Woodhouse, Tian, Kevin, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > Oh, I didn't know about device_get_dma_attr()..
> 
> Which is completely broken for any non-OF, non-ACPI plaform.

I saw that, but I spent some time searching and could not find an
iommu driver that would load independently of OF or ACPI. ie no IOMMU
platform drivers are created by board files. Things like Intel/AMD
discover only from ACPI, etc.

So it might be OK in the constrained case of having an iommu_domain
attached to the device being queried?

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 16:06             ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 16:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Robin Murphy, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > Oh, I didn't know about device_get_dma_attr()..
> 
> Which is completely broken for any non-OF, non-ACPI plaform.

I saw that, but I spent some time searching and could not find an
iommu driver that would load independently of OF or ACPI. ie no IOMMU
platform drivers are created by board files. Things like Intel/AMD
discover only from ACPI, etc.

So it might be OK in the constrained case of having an iommu_domain
attached to the device being queried?

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 16:06             ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 16:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Robin Murphy, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > Oh, I didn't know about device_get_dma_attr()..
> 
> Which is completely broken for any non-OF, non-ACPI plaform.

I saw that, but I spent some time searching and could not find an
iommu driver that would load independently of OF or ACPI. ie no IOMMU
platform drivers are created by board files. Things like Intel/AMD
discover only from ACPI, etc.

So it might be OK in the constrained case of having an iommu_domain
attached to the device being queried?

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 16:06             ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-06 16:10               ` Christoph Hellwig
  -1 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 16:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, Tian, Kevin, linux-arm-msm, Robin Murphy,
	Alex Williamson, linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, David Woodhouse

On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > Oh, I didn't know about device_get_dma_attr()..
> > 
> > Which is completely broken for any non-OF, non-ACPI plaform.
> 
> I saw that, but I spent some time searching and could not find an
> iommu driver that would load independently of OF or ACPI. ie no IOMMU
> platform drivers are created by board files. Things like Intel/AMD
> discover only from ACPI, etc.

s390?
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 16:10               ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 16:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, Rob Clark, Gerald Schaefer,
	linux-arm-msm, Robin Murphy, linux-arm-kernel, netdev,
	Cornelia Huck, iommu, Suravee Suthikulpanit, Christian Benvenuti,
	David Woodhouse, Lu Baolu

On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > Oh, I didn't know about device_get_dma_attr()..
> > 
> > Which is completely broken for any non-OF, non-ACPI plaform.
> 
> I saw that, but I spent some time searching and could not find an
> iommu driver that would load independently of OF or ACPI. ie no IOMMU
> platform drivers are created by board files. Things like Intel/AMD
> discover only from ACPI, etc.

s390?
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 16:10               ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 16:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Robin Murphy, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon,
	Tian, Kevin

On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > Oh, I didn't know about device_get_dma_attr()..
> > 
> > Which is completely broken for any non-OF, non-ACPI plaform.
> 
> I saw that, but I spent some time searching and could not find an
> iommu driver that would load independently of OF or ACPI. ie no IOMMU
> platform drivers are created by board files. Things like Intel/AMD
> discover only from ACPI, etc.

s390?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 16:10               ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-06 16:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Robin Murphy, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon,
	Tian, Kevin

On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > Oh, I didn't know about device_get_dma_attr()..
> > 
> > Which is completely broken for any non-OF, non-ACPI plaform.
> 
> I saw that, but I spent some time searching and could not find an
> iommu driver that would load independently of OF or ACPI. ie no IOMMU
> platform drivers are created by board files. Things like Intel/AMD
> discover only from ACPI, etc.

s390?

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 16:10               ` Christoph Hellwig
  (?)
@ 2022-04-06 17:17                 ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-06 17:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, linux-s390, kvm, linux-rdma, Gerald Schaefer,
	David Woodhouse, Tian, Kevin, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > Oh, I didn't know about device_get_dma_attr()..
> > > 
> > > Which is completely broken for any non-OF, non-ACPI plaform.
> > 
> > I saw that, but I spent some time searching and could not find an
> > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > platform drivers are created by board files. Things like Intel/AMD
> > discover only from ACPI, etc.
> 
> s390?

Ah, I missed looking in s390, hyperv and virtio.. 

hyperv is not creating iommu_domains, just IRQ remapping

virtio is using OF

And s390 indeed doesn't obviously have OF or ACPI parts..

This seems like it would be consistent with other things:

enum dev_dma_attr device_get_dma_attr(struct device *dev)
{
	const struct fwnode_handle *fwnode = dev_fwnode(dev);
	struct acpi_device *adev = to_acpi_device_node(fwnode);

	if (is_of_node(fwnode)) {
		if (of_dma_is_coherent(to_of_node(fwnode)))
			return DEV_DMA_COHERENT;
		return DEV_DMA_NON_COHERENT;
	} else if (adev) {
		return acpi_get_dma_attr(adev);
	}

	/* Platform is always DMA coherent */
	if (!IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) &&
	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) &&
	    device_iommu_mapped(dev))
		return DEV_DMA_COHERENT;
	return DEV_DMA_NOT_SUPPORTED;
}
EXPORT_SYMBOL_GPL(device_get_dma_attr);

ie s390 has no of or acpi but the entire platform is known DMA
coherent at config time so allow it. Not sure we need the
device_iommu_mapped() or not.

We could alternatively use existing device_get_dma_attr() as a default
with an iommu wrapper and push the exception down through the iommu
driver and s390 can override it.

Thanks,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 17:17                 ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 17:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Robin Murphy, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > Oh, I didn't know about device_get_dma_attr()..
> > > 
> > > Which is completely broken for any non-OF, non-ACPI plaform.
> > 
> > I saw that, but I spent some time searching and could not find an
> > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > platform drivers are created by board files. Things like Intel/AMD
> > discover only from ACPI, etc.
> 
> s390?

Ah, I missed looking in s390, hyperv and virtio.. 

hyperv is not creating iommu_domains, just IRQ remapping

virtio is using OF

And s390 indeed doesn't obviously have OF or ACPI parts..

This seems like it would be consistent with other things:

enum dev_dma_attr device_get_dma_attr(struct device *dev)
{
	const struct fwnode_handle *fwnode = dev_fwnode(dev);
	struct acpi_device *adev = to_acpi_device_node(fwnode);

	if (is_of_node(fwnode)) {
		if (of_dma_is_coherent(to_of_node(fwnode)))
			return DEV_DMA_COHERENT;
		return DEV_DMA_NON_COHERENT;
	} else if (adev) {
		return acpi_get_dma_attr(adev);
	}

	/* Platform is always DMA coherent */
	if (!IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) &&
	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) &&
	    device_iommu_mapped(dev))
		return DEV_DMA_COHERENT;
	return DEV_DMA_NOT_SUPPORTED;
}
EXPORT_SYMBOL_GPL(device_get_dma_attr);

ie s390 has no of or acpi but the entire platform is known DMA
coherent at config time so allow it. Not sure we need the
device_iommu_mapped() or not.

We could alternatively use existing device_get_dma_attr() as a default
with an iommu wrapper and push the exception down through the iommu
driver and s390 can override it.

Thanks,
Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-06 17:17                 ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-06 17:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Robin Murphy, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > Oh, I didn't know about device_get_dma_attr()..
> > > 
> > > Which is completely broken for any non-OF, non-ACPI plaform.
> > 
> > I saw that, but I spent some time searching and could not find an
> > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > platform drivers are created by board files. Things like Intel/AMD
> > discover only from ACPI, etc.
> 
> s390?

Ah, I missed looking in s390, hyperv and virtio.. 

hyperv is not creating iommu_domains, just IRQ remapping

virtio is using OF

And s390 indeed doesn't obviously have OF or ACPI parts..

This seems like it would be consistent with other things:

enum dev_dma_attr device_get_dma_attr(struct device *dev)
{
	const struct fwnode_handle *fwnode = dev_fwnode(dev);
	struct acpi_device *adev = to_acpi_device_node(fwnode);

	if (is_of_node(fwnode)) {
		if (of_dma_is_coherent(to_of_node(fwnode)))
			return DEV_DMA_COHERENT;
		return DEV_DMA_NON_COHERENT;
	} else if (adev) {
		return acpi_get_dma_attr(adev);
	}

	/* Platform is always DMA coherent */
	if (!IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) &&
	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) &&
	    device_iommu_mapped(dev))
		return DEV_DMA_COHERENT;
	return DEV_DMA_NOT_SUPPORTED;
}
EXPORT_SYMBOL_GPL(device_get_dma_attr);

ie s390 has no of or acpi but the entire platform is known DMA
coherent at config time so allow it. Not sure we need the
device_iommu_mapped() or not.

We could alternatively use existing device_get_dma_attr() as a default
with an iommu wrapper and push the exception down through the iommu
driver and s390 can override it.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 17:17                 ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-07  7:18                   ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-07  7:18 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, linux-s390, kvm, linux-rdma, Gerald Schaefer,
	David Woodhouse, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, April 7, 2022 1:17 AM
> 
> On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > > Oh, I didn't know about device_get_dma_attr()..
> > > >
> > > > Which is completely broken for any non-OF, non-ACPI plaform.
> > >
> > > I saw that, but I spent some time searching and could not find an
> > > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > > platform drivers are created by board files. Things like Intel/AMD
> > > discover only from ACPI, etc.

Intel discovers IOMMUs (and optionally ACPI namespace devices) from
ACPI, but there is no ACPI description for PCI devices i.e. the current
logic of device_get_dma_attr() cannot be used on PCI devices. 

> >
> > s390?
> 
> Ah, I missed looking in s390, hyperv and virtio..
> 
> hyperv is not creating iommu_domains, just IRQ remapping
> 
> virtio is using OF
> 
> And s390 indeed doesn't obviously have OF or ACPI parts..
> 
> This seems like it would be consistent with other things:
> 
> enum dev_dma_attr device_get_dma_attr(struct device *dev)
> {
> 	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> 	struct acpi_device *adev = to_acpi_device_node(fwnode);
> 
> 	if (is_of_node(fwnode)) {
> 		if (of_dma_is_coherent(to_of_node(fwnode)))
> 			return DEV_DMA_COHERENT;
> 		return DEV_DMA_NON_COHERENT;
> 	} else if (adev) {
> 		return acpi_get_dma_attr(adev);
> 	}
> 
> 	/* Platform is always DMA coherent */
> 	if (!IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) &&
> 	    device_iommu_mapped(dev))
> 		return DEV_DMA_COHERENT;
> 	return DEV_DMA_NOT_SUPPORTED;
> }
> EXPORT_SYMBOL_GPL(device_get_dma_attr);
> 
> ie s390 has no of or acpi but the entire platform is known DMA
> coherent at config time so allow it. Not sure we need the
> device_iommu_mapped() or not.

Probably not. If adding an iommu may change the coherency it would
come from specific IOPTE attributes for a device. The fact whether the
device is mapped by iommu doesn't tell that information.

> 
> We could alternatively use existing device_get_dma_attr() as a default
> with an iommu wrapper and push the exception down through the iommu
> driver and s390 can override it.
> 

if going this way probably device_get_dma_attr() should be renamed to
device_fwnode_get_dma_attr() instead to make it clearer?

Thanks
Kevin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07  7:18                   ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-07  7:18 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, linux-s390, kvm, linux-rdma,
	Joerg Roedel, Rob Clark, Gerald Schaefer, David Woodhouse,
	linux-arm-msm, linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Suravee Suthikulpanit, Christian Benvenuti, Robin Murphy,
	Lu Baolu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, April 7, 2022 1:17 AM
> 
> On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > > Oh, I didn't know about device_get_dma_attr()..
> > > >
> > > > Which is completely broken for any non-OF, non-ACPI plaform.
> > >
> > > I saw that, but I spent some time searching and could not find an
> > > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > > platform drivers are created by board files. Things like Intel/AMD
> > > discover only from ACPI, etc.

Intel discovers IOMMUs (and optionally ACPI namespace devices) from
ACPI, but there is no ACPI description for PCI devices i.e. the current
logic of device_get_dma_attr() cannot be used on PCI devices. 

> >
> > s390?
> 
> Ah, I missed looking in s390, hyperv and virtio..
> 
> hyperv is not creating iommu_domains, just IRQ remapping
> 
> virtio is using OF
> 
> And s390 indeed doesn't obviously have OF or ACPI parts..
> 
> This seems like it would be consistent with other things:
> 
> enum dev_dma_attr device_get_dma_attr(struct device *dev)
> {
> 	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> 	struct acpi_device *adev = to_acpi_device_node(fwnode);
> 
> 	if (is_of_node(fwnode)) {
> 		if (of_dma_is_coherent(to_of_node(fwnode)))
> 			return DEV_DMA_COHERENT;
> 		return DEV_DMA_NON_COHERENT;
> 	} else if (adev) {
> 		return acpi_get_dma_attr(adev);
> 	}
> 
> 	/* Platform is always DMA coherent */
> 	if (!IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) &&
> 	    device_iommu_mapped(dev))
> 		return DEV_DMA_COHERENT;
> 	return DEV_DMA_NOT_SUPPORTED;
> }
> EXPORT_SYMBOL_GPL(device_get_dma_attr);
> 
> ie s390 has no of or acpi but the entire platform is known DMA
> coherent at config time so allow it. Not sure we need the
> device_iommu_mapped() or not.

Probably not. If adding an iommu may change the coherency it would
come from specific IOPTE attributes for a device. The fact whether the
device is mapped by iommu doesn't tell that information.

> 
> We could alternatively use existing device_get_dma_attr() as a default
> with an iommu wrapper and push the exception down through the iommu
> driver and s390 can override it.
> 

if going this way probably device_get_dma_attr() should be renamed to
device_fwnode_get_dma_attr() instead to make it clearer?

Thanks
Kevin
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07  7:18                   ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-07  7:18 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Robin Murphy, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, April 7, 2022 1:17 AM
> 
> On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > > Oh, I didn't know about device_get_dma_attr()..
> > > >
> > > > Which is completely broken for any non-OF, non-ACPI plaform.
> > >
> > > I saw that, but I spent some time searching and could not find an
> > > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > > platform drivers are created by board files. Things like Intel/AMD
> > > discover only from ACPI, etc.

Intel discovers IOMMUs (and optionally ACPI namespace devices) from
ACPI, but there is no ACPI description for PCI devices i.e. the current
logic of device_get_dma_attr() cannot be used on PCI devices. 

> >
> > s390?
> 
> Ah, I missed looking in s390, hyperv and virtio..
> 
> hyperv is not creating iommu_domains, just IRQ remapping
> 
> virtio is using OF
> 
> And s390 indeed doesn't obviously have OF or ACPI parts..
> 
> This seems like it would be consistent with other things:
> 
> enum dev_dma_attr device_get_dma_attr(struct device *dev)
> {
> 	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> 	struct acpi_device *adev = to_acpi_device_node(fwnode);
> 
> 	if (is_of_node(fwnode)) {
> 		if (of_dma_is_coherent(to_of_node(fwnode)))
> 			return DEV_DMA_COHERENT;
> 		return DEV_DMA_NON_COHERENT;
> 	} else if (adev) {
> 		return acpi_get_dma_attr(adev);
> 	}
> 
> 	/* Platform is always DMA coherent */
> 	if (!IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) &&
> 	    device_iommu_mapped(dev))
> 		return DEV_DMA_COHERENT;
> 	return DEV_DMA_NOT_SUPPORTED;
> }
> EXPORT_SYMBOL_GPL(device_get_dma_attr);
> 
> ie s390 has no of or acpi but the entire platform is known DMA
> coherent at config time so allow it. Not sure we need the
> device_iommu_mapped() or not.

Probably not. If adding an iommu may change the coherency it would
come from specific IOPTE attributes for a device. The fact whether the
device is mapped by iommu doesn't tell that information.

> 
> We could alternatively use existing device_get_dma_attr() as a default
> with an iommu wrapper and push the exception down through the iommu
> driver and s390 can override it.
> 

if going this way probably device_get_dma_attr() should be renamed to
device_fwnode_get_dma_attr() instead to make it clearer?

Thanks
Kevin

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07  7:18                   ` Tian, Kevin
  0 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2022-04-07  7:18 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Robin Murphy, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, April 7, 2022 1:17 AM
> 
> On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > > Oh, I didn't know about device_get_dma_attr()..
> > > >
> > > > Which is completely broken for any non-OF, non-ACPI plaform.
> > >
> > > I saw that, but I spent some time searching and could not find an
> > > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > > platform drivers are created by board files. Things like Intel/AMD
> > > discover only from ACPI, etc.

Intel discovers IOMMUs (and optionally ACPI namespace devices) from
ACPI, but there is no ACPI description for PCI devices i.e. the current
logic of device_get_dma_attr() cannot be used on PCI devices. 

> >
> > s390?
> 
> Ah, I missed looking in s390, hyperv and virtio..
> 
> hyperv is not creating iommu_domains, just IRQ remapping
> 
> virtio is using OF
> 
> And s390 indeed doesn't obviously have OF or ACPI parts..
> 
> This seems like it would be consistent with other things:
> 
> enum dev_dma_attr device_get_dma_attr(struct device *dev)
> {
> 	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> 	struct acpi_device *adev = to_acpi_device_node(fwnode);
> 
> 	if (is_of_node(fwnode)) {
> 		if (of_dma_is_coherent(to_of_node(fwnode)))
> 			return DEV_DMA_COHERENT;
> 		return DEV_DMA_NON_COHERENT;
> 	} else if (adev) {
> 		return acpi_get_dma_attr(adev);
> 	}
> 
> 	/* Platform is always DMA coherent */
> 	if (!IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) &&
> 	    device_iommu_mapped(dev))
> 		return DEV_DMA_COHERENT;
> 	return DEV_DMA_NOT_SUPPORTED;
> }
> EXPORT_SYMBOL_GPL(device_get_dma_attr);
> 
> ie s390 has no of or acpi but the entire platform is known DMA
> coherent at config time so allow it. Not sure we need the
> device_iommu_mapped() or not.

Probably not. If adding an iommu may change the coherency it would
come from specific IOPTE attributes for a device. The fact whether the
device is mapped by iommu doesn't tell that information.

> 
> We could alternatively use existing device_get_dma_attr() as a default
> with an iommu wrapper and push the exception down through the iommu
> driver and s390 can override it.
> 

if going this way probably device_get_dma_attr() should be renamed to
device_fwnode_get_dma_attr() instead to make it clearer?

Thanks
Kevin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-06 17:17                 ` Jason Gunthorpe
  (?)
@ 2022-04-07  8:53                   ` Niklas Schnelle
  -1 siblings, 0 replies; 135+ messages in thread
From: Niklas Schnelle @ 2022-04-07  8:53 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Robin Murphy, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, 2022-04-06 at 14:17 -0300, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > > Oh, I didn't know about device_get_dma_attr()..
> > > > 
> > > > Which is completely broken for any non-OF, non-ACPI plaform.
> > > 
> > > I saw that, but I spent some time searching and could not find an
> > > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > > platform drivers are created by board files. Things like Intel/AMD
> > > discover only from ACPI, etc.
> > 
> > s390?
> 
> Ah, I missed looking in s390, hyperv and virtio.. 
> 
> hyperv is not creating iommu_domains, just IRQ remapping
> 
> virtio is using OF
> 
> And s390 indeed doesn't obviously have OF or ACPI parts..
> 
> This seems like it would be consistent with other things:
> 
> enum dev_dma_attr device_get_dma_attr(struct device *dev)
> {
> 	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> 	struct acpi_device *adev = to_acpi_device_node(fwnode);
> 
> 	if (is_of_node(fwnode)) {
> 		if (of_dma_is_coherent(to_of_node(fwnode)))
> 			return DEV_DMA_COHERENT;
> 		return DEV_DMA_NON_COHERENT;
> 	} else if (adev) {
> 		return acpi_get_dma_attr(adev);
> 	}
> 
> 	/* Platform is always DMA coherent */
> 	if (!IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) &&
> 	    device_iommu_mapped(dev))
> 		return DEV_DMA_COHERENT;
> 	return DEV_DMA_NOT_SUPPORTED;
> }
> EXPORT_SYMBOL_GPL(device_get_dma_attr);
> 
> ie s390 has no of or acpi but the entire platform is known DMA
> coherent at config time so allow it. Not sure we need the
> device_iommu_mapped() or not.

I only took a short look but I think the device_iommu_mapped() call in
there is wrong. On s390 PCI always goes through IOMMU hardware both
when using the IOMMU API and when using the DMA API and this hardware
is always coherent. This is even true for s390 guests in QEMU/KVM and
under the z/VM hypervisor. As far as I can see device_iommu_mapped()'s
check for dev->iommu_group would only work while the device is under
IOMMU API control not DMA API, no?

Also, while it is true that s390 *hardware* devices are always cache
coherent there is also the case that SWIOTLB is used for protected
virtualization and then cache flushing APIs must be used.


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07  8:53                   ` Niklas Schnelle
  0 siblings, 0 replies; 135+ messages in thread
From: Niklas Schnelle @ 2022-04-07  8:53 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, linux-s390, kvm, linux-rdma, Gerald Schaefer,
	David Woodhouse, Tian, Kevin, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, 2022-04-06 at 14:17 -0300, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > > Oh, I didn't know about device_get_dma_attr()..
> > > > 
> > > > Which is completely broken for any non-OF, non-ACPI plaform.
> > > 
> > > I saw that, but I spent some time searching and could not find an
> > > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > > platform drivers are created by board files. Things like Intel/AMD
> > > discover only from ACPI, etc.
> > 
> > s390?
> 
> Ah, I missed looking in s390, hyperv and virtio.. 
> 
> hyperv is not creating iommu_domains, just IRQ remapping
> 
> virtio is using OF
> 
> And s390 indeed doesn't obviously have OF or ACPI parts..
> 
> This seems like it would be consistent with other things:
> 
> enum dev_dma_attr device_get_dma_attr(struct device *dev)
> {
> 	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> 	struct acpi_device *adev = to_acpi_device_node(fwnode);
> 
> 	if (is_of_node(fwnode)) {
> 		if (of_dma_is_coherent(to_of_node(fwnode)))
> 			return DEV_DMA_COHERENT;
> 		return DEV_DMA_NON_COHERENT;
> 	} else if (adev) {
> 		return acpi_get_dma_attr(adev);
> 	}
> 
> 	/* Platform is always DMA coherent */
> 	if (!IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) &&
> 	    device_iommu_mapped(dev))
> 		return DEV_DMA_COHERENT;
> 	return DEV_DMA_NOT_SUPPORTED;
> }
> EXPORT_SYMBOL_GPL(device_get_dma_attr);
> 
> ie s390 has no of or acpi but the entire platform is known DMA
> coherent at config time so allow it. Not sure we need the
> device_iommu_mapped() or not.

I only took a short look but I think the device_iommu_mapped() call in
there is wrong. On s390 PCI always goes through IOMMU hardware both
when using the IOMMU API and when using the DMA API and this hardware
is always coherent. This is even true for s390 guests in QEMU/KVM and
under the z/VM hypervisor. As far as I can see device_iommu_mapped()'s
check for dev->iommu_group would only work while the device is under
IOMMU API control not DMA API, no?

Also, while it is true that s390 *hardware* devices are always cache
coherent there is also the case that SWIOTLB is used for protected
virtualization and then cache flushing APIs must be used.

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07  8:53                   ` Niklas Schnelle
  0 siblings, 0 replies; 135+ messages in thread
From: Niklas Schnelle @ 2022-04-07  8:53 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Robin Murphy, Alex Williamson, Lu Baolu, Christian Benvenuti,
	Cornelia Huck, David Woodhouse, Gerald Schaefer, iommu,
	Jason Wang, Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm,
	linux-rdma, linux-s390, Matthew Rosato, Michael S. Tsirkin,
	Nelson Escobar, netdev, Rob Clark, Suravee Suthikulpanit,
	virtualization, Will Deacon, Tian, Kevin

On Wed, 2022-04-06 at 14:17 -0300, Jason Gunthorpe wrote:
> On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > > Oh, I didn't know about device_get_dma_attr()..
> > > > 
> > > > Which is completely broken for any non-OF, non-ACPI plaform.
> > > 
> > > I saw that, but I spent some time searching and could not find an
> > > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > > platform drivers are created by board files. Things like Intel/AMD
> > > discover only from ACPI, etc.
> > 
> > s390?
> 
> Ah, I missed looking in s390, hyperv and virtio.. 
> 
> hyperv is not creating iommu_domains, just IRQ remapping
> 
> virtio is using OF
> 
> And s390 indeed doesn't obviously have OF or ACPI parts..
> 
> This seems like it would be consistent with other things:
> 
> enum dev_dma_attr device_get_dma_attr(struct device *dev)
> {
> 	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> 	struct acpi_device *adev = to_acpi_device_node(fwnode);
> 
> 	if (is_of_node(fwnode)) {
> 		if (of_dma_is_coherent(to_of_node(fwnode)))
> 			return DEV_DMA_COHERENT;
> 		return DEV_DMA_NON_COHERENT;
> 	} else if (adev) {
> 		return acpi_get_dma_attr(adev);
> 	}
> 
> 	/* Platform is always DMA coherent */
> 	if (!IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
> 	    !IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) &&
> 	    device_iommu_mapped(dev))
> 		return DEV_DMA_COHERENT;
> 	return DEV_DMA_NOT_SUPPORTED;
> }
> EXPORT_SYMBOL_GPL(device_get_dma_attr);
> 
> ie s390 has no of or acpi but the entire platform is known DMA
> coherent at config time so allow it. Not sure we need the
> device_iommu_mapped() or not.

I only took a short look but I think the device_iommu_mapped() call in
there is wrong. On s390 PCI always goes through IOMMU hardware both
when using the IOMMU API and when using the DMA API and this hardware
is always coherent. This is even true for s390 guests in QEMU/KVM and
under the z/VM hypervisor. As far as I can see device_iommu_mapped()'s
check for dev->iommu_group would only work while the device is under
IOMMU API control not DMA API, no?

Also, while it is true that s390 *hardware* devices are always cache
coherent there is also the case that SWIOTLB is used for protected
virtualization and then cache flushing APIs must be used.


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-07  7:18                   ` Tian, Kevin
  (?)
@ 2022-04-07 13:59                     ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-07 13:59 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, linux-arm-msm, Robin Murphy, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, David Woodhouse

On Thu, Apr 07, 2022 at 07:18:48AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Thursday, April 7, 2022 1:17 AM
> > 
> > On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> > > On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > > > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > > > Oh, I didn't know about device_get_dma_attr()..
> > > > >
> > > > > Which is completely broken for any non-OF, non-ACPI plaform.
> > > >
> > > > I saw that, but I spent some time searching and could not find an
> > > > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > > > platform drivers are created by board files. Things like Intel/AMD
> > > > discover only from ACPI, etc.
> 
> Intel discovers IOMMUs (and optionally ACPI namespace devices) from
> ACPI, but there is no ACPI description for PCI devices i.e. the current
> logic of device_get_dma_attr() cannot be used on PCI devices. 

Oh? So on x86 acpi_get_dma_attr() returns DEV_DMA_NON_COHERENT or
DEV_DMA_NOT_SUPPORTED?

I think I should give up on this and just redefine the existing iommu
cap flag to IOMMU_CAP_CACHE_SUPPORTED or something.

> > We could alternatively use existing device_get_dma_attr() as a default
> > with an iommu wrapper and push the exception down through the iommu
> > driver and s390 can override it.
> > 
> 
> if going this way probably device_get_dma_attr() should be renamed to
> device_fwnode_get_dma_attr() instead to make it clearer?

I'm looking at the few users:

drivers/ata/ahci_ceva.c
drivers/ata/ahci_qoriq.c
 - These are ARM only drivers. They are trying to copy the dma-coherent
   property from its DT/ACPI definition to internal register settings
   which look like they tune how the AXI bus transactions are created.

   I'm guessing the SATA IP block's AXI interface can be configured to
   generate coherent or non-coherent requests and it has to be set
   in a way that is consistent with the SOC architecture and match
   what the DMA API expects the device will do.

drivers/crypto/ccp/sp-platform.c
 - Only used on ARM64 and also programs a HW register similar to the
   sata drivers. Refuses to work if the FW property is not present.

drivers/net/ethernet/amd/xgbe/xgbe-platform.c
 - Seems to be configuring another ARM AXI block

drivers/gpu/drm/panfrost/panfrost_drv.c
 - Robin's commit comment here is good, and one of the things this
   controls is if the coherent_walk is set for the io-pgtable-arm.c
   code which avoids DMA API calls

drivers/gpu/drm/tegra/uapi.c
 - Returns DRM_TEGRA_CHANNEL_CAP_CACHE_COHERENT to userspace. No idea.

My take is that the drivers using this API are doing it to make sure
their HW blocks are setup in a way that is consistent with the DMA API
they are also using, and run in constrained embedded-style
environments that know the firmware support is present.

So in the end it does not seem suitable right now for linking to
IOMMU_CACHE..

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 13:59                     ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-07 13:59 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Christoph Hellwig, Robin Murphy, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon

On Thu, Apr 07, 2022 at 07:18:48AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Thursday, April 7, 2022 1:17 AM
> > 
> > On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> > > On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > > > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > > > Oh, I didn't know about device_get_dma_attr()..
> > > > >
> > > > > Which is completely broken for any non-OF, non-ACPI plaform.
> > > >
> > > > I saw that, but I spent some time searching and could not find an
> > > > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > > > platform drivers are created by board files. Things like Intel/AMD
> > > > discover only from ACPI, etc.
> 
> Intel discovers IOMMUs (and optionally ACPI namespace devices) from
> ACPI, but there is no ACPI description for PCI devices i.e. the current
> logic of device_get_dma_attr() cannot be used on PCI devices. 

Oh? So on x86 acpi_get_dma_attr() returns DEV_DMA_NON_COHERENT or
DEV_DMA_NOT_SUPPORTED?

I think I should give up on this and just redefine the existing iommu
cap flag to IOMMU_CAP_CACHE_SUPPORTED or something.

> > We could alternatively use existing device_get_dma_attr() as a default
> > with an iommu wrapper and push the exception down through the iommu
> > driver and s390 can override it.
> > 
> 
> if going this way probably device_get_dma_attr() should be renamed to
> device_fwnode_get_dma_attr() instead to make it clearer?

I'm looking at the few users:

drivers/ata/ahci_ceva.c
drivers/ata/ahci_qoriq.c
 - These are ARM only drivers. They are trying to copy the dma-coherent
   property from its DT/ACPI definition to internal register settings
   which look like they tune how the AXI bus transactions are created.

   I'm guessing the SATA IP block's AXI interface can be configured to
   generate coherent or non-coherent requests and it has to be set
   in a way that is consistent with the SOC architecture and match
   what the DMA API expects the device will do.

drivers/crypto/ccp/sp-platform.c
 - Only used on ARM64 and also programs a HW register similar to the
   sata drivers. Refuses to work if the FW property is not present.

drivers/net/ethernet/amd/xgbe/xgbe-platform.c
 - Seems to be configuring another ARM AXI block

drivers/gpu/drm/panfrost/panfrost_drv.c
 - Robin's commit comment here is good, and one of the things this
   controls is if the coherent_walk is set for the io-pgtable-arm.c
   code which avoids DMA API calls

drivers/gpu/drm/tegra/uapi.c
 - Returns DRM_TEGRA_CHANNEL_CAP_CACHE_COHERENT to userspace. No idea.

My take is that the drivers using this API are doing it to make sure
their HW blocks are setup in a way that is consistent with the DMA API
they are also using, and run in constrained embedded-style
environments that know the firmware support is present.

So in the end it does not seem suitable right now for linking to
IOMMU_CACHE..

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 13:59                     ` Jason Gunthorpe
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-07 13:59 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Christoph Hellwig, Robin Murphy, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon

On Thu, Apr 07, 2022 at 07:18:48AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Thursday, April 7, 2022 1:17 AM
> > 
> > On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
> > > On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
> > > > On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
> > > > > On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
> > > > > > > Oh, I didn't know about device_get_dma_attr()..
> > > > >
> > > > > Which is completely broken for any non-OF, non-ACPI plaform.
> > > >
> > > > I saw that, but I spent some time searching and could not find an
> > > > iommu driver that would load independently of OF or ACPI. ie no IOMMU
> > > > platform drivers are created by board files. Things like Intel/AMD
> > > > discover only from ACPI, etc.
> 
> Intel discovers IOMMUs (and optionally ACPI namespace devices) from
> ACPI, but there is no ACPI description for PCI devices i.e. the current
> logic of device_get_dma_attr() cannot be used on PCI devices. 

Oh? So on x86 acpi_get_dma_attr() returns DEV_DMA_NON_COHERENT or
DEV_DMA_NOT_SUPPORTED?

I think I should give up on this and just redefine the existing iommu
cap flag to IOMMU_CAP_CACHE_SUPPORTED or something.

> > We could alternatively use existing device_get_dma_attr() as a default
> > with an iommu wrapper and push the exception down through the iommu
> > driver and s390 can override it.
> > 
> 
> if going this way probably device_get_dma_attr() should be renamed to
> device_fwnode_get_dma_attr() instead to make it clearer?

I'm looking at the few users:

drivers/ata/ahci_ceva.c
drivers/ata/ahci_qoriq.c
 - These are ARM only drivers. They are trying to copy the dma-coherent
   property from its DT/ACPI definition to internal register settings
   which look like they tune how the AXI bus transactions are created.

   I'm guessing the SATA IP block's AXI interface can be configured to
   generate coherent or non-coherent requests and it has to be set
   in a way that is consistent with the SOC architecture and match
   what the DMA API expects the device will do.

drivers/crypto/ccp/sp-platform.c
 - Only used on ARM64 and also programs a HW register similar to the
   sata drivers. Refuses to work if the FW property is not present.

drivers/net/ethernet/amd/xgbe/xgbe-platform.c
 - Seems to be configuring another ARM AXI block

drivers/gpu/drm/panfrost/panfrost_drv.c
 - Robin's commit comment here is good, and one of the things this
   controls is if the coherent_walk is set for the io-pgtable-arm.c
   code which avoids DMA API calls

drivers/gpu/drm/tegra/uapi.c
 - Returns DRM_TEGRA_CHANNEL_CAP_CACHE_COHERENT to userspace. No idea.

My take is that the drivers using this API are doing it to make sure
their HW blocks are setup in a way that is consistent with the DMA API
they are also using, and run in constrained embedded-style
environments that know the firmware support is present.

So in the end it does not seem suitable right now for linking to
IOMMU_CACHE..

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
  2022-04-06  7:02         ` Tian, Kevin
  (?)
@ 2022-04-07 14:53           ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-07 14:53 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Christoph Hellwig

On Wed, Apr 06, 2022 at 07:02:36AM +0000, Tian, Kevin wrote:

> > So like this:
> > 
> >  int vfio_register_group_dev(struct vfio_device *device)
> >  {
> > +       if (!dev_is_dma_coherent(device->dev))
> > +               return -EINVAL;
> > +
> >         return __vfio_register_dev(device,
> >                 vfio_group_find_or_alloc(device->dev));
> >  }
> > 
> > I fixed it up.
> > 
> 
> if that is the case should it also apply to usnic and vdpa in the first
> patch (i.e. fail the probe)?

Ideally, but I don't want to mess with existing logic in these
drivers..

Thanks,
Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-07 14:53           ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-07 14:53 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, David Woodhouse, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, Apr 06, 2022 at 07:02:36AM +0000, Tian, Kevin wrote:

> > So like this:
> > 
> >  int vfio_register_group_dev(struct vfio_device *device)
> >  {
> > +       if (!dev_is_dma_coherent(device->dev))
> > +               return -EINVAL;
> > +
> >         return __vfio_register_dev(device,
> >                 vfio_group_find_or_alloc(device->dev));
> >  }
> > 
> > I fixed it up.
> > 
> 
> if that is the case should it also apply to usnic and vdpa in the first
> patch (i.e. fail the probe)?

Ideally, but I don't want to mess with existing logic in these
drivers..

Thanks,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 2/5] vfio: Require that devices support DMA cache coherence
@ 2022-04-07 14:53           ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-07 14:53 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Christoph Hellwig

On Wed, Apr 06, 2022 at 07:02:36AM +0000, Tian, Kevin wrote:

> > So like this:
> > 
> >  int vfio_register_group_dev(struct vfio_device *device)
> >  {
> > +       if (!dev_is_dma_coherent(device->dev))
> > +               return -EINVAL;
> > +
> >         return __vfio_register_dev(device,
> >                 vfio_group_find_or_alloc(device->dev));
> >  }
> > 
> > I fixed it up.
> > 
> 
> if that is the case should it also apply to usnic and vdpa in the first
> patch (i.e. fail the probe)?

Ideally, but I don't want to mess with existing logic in these
drivers..

Thanks,
Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 0/5] Make the iommu driver no-snoop block feature consistent
  2022-04-06  6:52   ` Tian, Kevin
  (?)
@ 2022-04-07 14:56     ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-07 14:56 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Christoph Hellwig

On Wed, Apr 06, 2022 at 06:52:04AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 6, 2022 12:16 AM
> > 
> > PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented
> > by a platform as bypassing elements in the DMA coherent CPU cache
> > hierarchy. A driver can command a device to set this bit on some of its
> > transactions as a micro-optimization.
> > 
> > However, the driver is now responsible to synchronize the CPU cache with
> > the DMA that bypassed it. On x86 this is done through the wbinvd
> > instruction, and the i915 GPU driver is the only Linux DMA driver that
> > calls it.
> 
> More accurately x86 supports both unprivileged clflush instructions
> to invalidate one cacheline and a privileged wbinvd instruction to
> invalidate the entire cache. Replacing 'this is done' with 'this may
> be done' is clearer.
> 
> > 
> > The problem comes that KVM on x86 will normally disable the wbinvd
> > instruction in the guest and render it a NOP. As the driver running in the
> > guest is not aware the wbinvd doesn't work it may still cause the device
> > to set the no-snoop bit and the platform will bypass the CPU cache.
> > Without a working wbinvd there is no way to re-synchronize the CPU cache
> > and the driver in the VM has data corruption.
> > 
> > Thus, we see a general direction on x86 that the IOMMU HW is able to block
> > the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to
> > to NOP the wbinvd without causing any data corruption.
> > 
> > This control for Intel IOMMU was exposed by using IOMMU_CACHE and
> > IOMMU_CAP_CACHE_COHERENCY, however these two values now have
> > multiple
> > meanings and usages beyond blocking no-snoop and the whole thing has
> > become confused.
>
> Also point out your finding about AMD IOMMU?

Done, thanks

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 0/5] Make the iommu driver no-snoop block feature consistent
@ 2022-04-07 14:56     ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-07 14:56 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, David Woodhouse, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, Robin Murphy

On Wed, Apr 06, 2022 at 06:52:04AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 6, 2022 12:16 AM
> > 
> > PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented
> > by a platform as bypassing elements in the DMA coherent CPU cache
> > hierarchy. A driver can command a device to set this bit on some of its
> > transactions as a micro-optimization.
> > 
> > However, the driver is now responsible to synchronize the CPU cache with
> > the DMA that bypassed it. On x86 this is done through the wbinvd
> > instruction, and the i915 GPU driver is the only Linux DMA driver that
> > calls it.
> 
> More accurately x86 supports both unprivileged clflush instructions
> to invalidate one cacheline and a privileged wbinvd instruction to
> invalidate the entire cache. Replacing 'this is done' with 'this may
> be done' is clearer.
> 
> > 
> > The problem comes that KVM on x86 will normally disable the wbinvd
> > instruction in the guest and render it a NOP. As the driver running in the
> > guest is not aware the wbinvd doesn't work it may still cause the device
> > to set the no-snoop bit and the platform will bypass the CPU cache.
> > Without a working wbinvd there is no way to re-synchronize the CPU cache
> > and the driver in the VM has data corruption.
> > 
> > Thus, we see a general direction on x86 that the IOMMU HW is able to block
> > the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to
> > to NOP the wbinvd without causing any data corruption.
> > 
> > This control for Intel IOMMU was exposed by using IOMMU_CACHE and
> > IOMMU_CAP_CACHE_COHERENCY, however these two values now have
> > multiple
> > meanings and usages beyond blocking no-snoop and the whole thing has
> > become confused.
>
> Also point out your finding about AMD IOMMU?

Done, thanks

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 0/5] Make the iommu driver no-snoop block feature consistent
@ 2022-04-07 14:56     ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-07 14:56 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Lu Baolu, Christian Benvenuti, Cornelia Huck,
	David Woodhouse, Gerald Schaefer, iommu, Jason Wang,
	Joerg Roedel, kvm, linux-arm-kernel, linux-arm-msm, linux-rdma,
	linux-s390, Matthew Rosato, Michael S. Tsirkin, Nelson Escobar,
	netdev, Rob Clark, Robin Murphy, Suravee Suthikulpanit,
	virtualization, Will Deacon, Christoph Hellwig

On Wed, Apr 06, 2022 at 06:52:04AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, April 6, 2022 12:16 AM
> > 
> > PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented
> > by a platform as bypassing elements in the DMA coherent CPU cache
> > hierarchy. A driver can command a device to set this bit on some of its
> > transactions as a micro-optimization.
> > 
> > However, the driver is now responsible to synchronize the CPU cache with
> > the DMA that bypassed it. On x86 this is done through the wbinvd
> > instruction, and the i915 GPU driver is the only Linux DMA driver that
> > calls it.
> 
> More accurately x86 supports both unprivileged clflush instructions
> to invalidate one cacheline and a privileged wbinvd instruction to
> invalidate the entire cache. Replacing 'this is done' with 'this may
> be done' is clearer.
> 
> > 
> > The problem comes that KVM on x86 will normally disable the wbinvd
> > instruction in the guest and render it a NOP. As the driver running in the
> > guest is not aware the wbinvd doesn't work it may still cause the device
> > to set the no-snoop bit and the platform will bypass the CPU cache.
> > Without a working wbinvd there is no way to re-synchronize the CPU cache
> > and the driver in the VM has data corruption.
> > 
> > Thus, we see a general direction on x86 that the IOMMU HW is able to block
> > the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to
> > to NOP the wbinvd without causing any data corruption.
> > 
> > This control for Intel IOMMU was exposed by using IOMMU_CACHE and
> > IOMMU_CAP_CACHE_COHERENCY, however these two values now have
> > multiple
> > meanings and usages beyond blocking no-snoop and the whole thing has
> > become confused.
>
> Also point out your finding about AMD IOMMU?

Done, thanks

Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-07 13:59                     ` Jason Gunthorpe
  (?)
  (?)
@ 2022-04-07 15:17                       ` Robin Murphy
  -1 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-07 15:17 UTC (permalink / raw)
  To: Jason Gunthorpe, Tian, Kevin
  Cc: Christoph Hellwig, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon

On 2022-04-07 14:59, Jason Gunthorpe wrote:
> On Thu, Apr 07, 2022 at 07:18:48AM +0000, Tian, Kevin wrote:
>>> From: Jason Gunthorpe <jgg@nvidia.com>
>>> Sent: Thursday, April 7, 2022 1:17 AM
>>>
>>> On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
>>>> On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
>>>>> On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
>>>>>> On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
>>>>>>>> Oh, I didn't know about device_get_dma_attr()..
>>>>>>
>>>>>> Which is completely broken for any non-OF, non-ACPI plaform.
>>>>>
>>>>> I saw that, but I spent some time searching and could not find an
>>>>> iommu driver that would load independently of OF or ACPI. ie no IOMMU
>>>>> platform drivers are created by board files. Things like Intel/AMD
>>>>> discover only from ACPI, etc.
>>
>> Intel discovers IOMMUs (and optionally ACPI namespace devices) from
>> ACPI, but there is no ACPI description for PCI devices i.e. the current
>> logic of device_get_dma_attr() cannot be used on PCI devices.
> 
> Oh? So on x86 acpi_get_dma_attr() returns DEV_DMA_NON_COHERENT or
> DEV_DMA_NOT_SUPPORTED?

I think it _should_ return DEV_DMA_COHERENT on x86/IA-64 (unless a _CCA 
method was actually present to say otherwise), based on 
acpi_init_coherency(), but I only know for sure what happens on arm64.

> I think I should give up on this and just redefine the existing iommu
> cap flag to IOMMU_CAP_CACHE_SUPPORTED or something.

TBH I don't see any issue with current name, but I'd certainly be happy 
to nail down a specific definition for it, along the lines of "this 
means that IOMMU_CACHE mappings are generally coherent". That works for 
things like Arm's S2FWB making it OK to assign an otherwise-non-coherent 
device without extra hassle.

For the specific case of overriding PCIe No Snoop (which is more 
problematic from an Arm SMMU PoV) when assigning to a VM, would that not 
be easier solved by just having vfio-pci clear the "Enable No Snoop" 
control bit in the endpoint's PCIe capability?

>>> We could alternatively use existing device_get_dma_attr() as a default
>>> with an iommu wrapper and push the exception down through the iommu
>>> driver and s390 can override it.
>>>
>>
>> if going this way probably device_get_dma_attr() should be renamed to
>> device_fwnode_get_dma_attr() instead to make it clearer?
> 
> I'm looking at the few users:
> 
> drivers/ata/ahci_ceva.c
> drivers/ata/ahci_qoriq.c
>   - These are ARM only drivers. They are trying to copy the dma-coherent
>     property from its DT/ACPI definition to internal register settings
>     which look like they tune how the AXI bus transactions are created.
> 
>     I'm guessing the SATA IP block's AXI interface can be configured to
>     generate coherent or non-coherent requests and it has to be set
>     in a way that is consistent with the SOC architecture and match
>     what the DMA API expects the device will do.
> 
> drivers/crypto/ccp/sp-platform.c
>   - Only used on ARM64 and also programs a HW register similar to the
>     sata drivers. Refuses to work if the FW property is not present.
> 
> drivers/net/ethernet/amd/xgbe/xgbe-platform.c
>   - Seems to be configuring another ARM AXI block
> 
> drivers/gpu/drm/panfrost/panfrost_drv.c
>   - Robin's commit comment here is good, and one of the things this
>     controls is if the coherent_walk is set for the io-pgtable-arm.c
>     code which avoids DMA API calls
> 
> drivers/gpu/drm/tegra/uapi.c
>   - Returns DRM_TEGRA_CHANNEL_CAP_CACHE_COHERENT to userspace. No idea.
> 
> My take is that the drivers using this API are doing it to make sure
> their HW blocks are setup in a way that is consistent with the DMA API
> they are also using, and run in constrained embedded-style
> environments that know the firmware support is present.
> 
> So in the end it does not seem suitable right now for linking to
> IOMMU_CACHE..

That seems a pretty good summary - I think they're basically all 
"firmware told Linux I'm coherent so I'd better act coherent" cases, but 
that still doesn't necessarily mean that they're *forced* to respect 
that. One of the things on my to-do list is to try adding a 
DMA_ATTR_NO_SNOOP that can force DMA cache maintenance for coherent 
devices, primarily to hook up in Panfrost (where there is a bit of a 
performance to claw back on the coherent AmLogic SoCs by leaving certain 
buffers non-cacheable).

Cheers,
Robin.

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 15:17                       ` Robin Murphy
  0 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-07 15:17 UTC (permalink / raw)
  To: Jason Gunthorpe, Tian, Kevin
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, David Woodhouse

On 2022-04-07 14:59, Jason Gunthorpe wrote:
> On Thu, Apr 07, 2022 at 07:18:48AM +0000, Tian, Kevin wrote:
>>> From: Jason Gunthorpe <jgg@nvidia.com>
>>> Sent: Thursday, April 7, 2022 1:17 AM
>>>
>>> On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
>>>> On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
>>>>> On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
>>>>>> On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
>>>>>>>> Oh, I didn't know about device_get_dma_attr()..
>>>>>>
>>>>>> Which is completely broken for any non-OF, non-ACPI plaform.
>>>>>
>>>>> I saw that, but I spent some time searching and could not find an
>>>>> iommu driver that would load independently of OF or ACPI. ie no IOMMU
>>>>> platform drivers are created by board files. Things like Intel/AMD
>>>>> discover only from ACPI, etc.
>>
>> Intel discovers IOMMUs (and optionally ACPI namespace devices) from
>> ACPI, but there is no ACPI description for PCI devices i.e. the current
>> logic of device_get_dma_attr() cannot be used on PCI devices.
> 
> Oh? So on x86 acpi_get_dma_attr() returns DEV_DMA_NON_COHERENT or
> DEV_DMA_NOT_SUPPORTED?

I think it _should_ return DEV_DMA_COHERENT on x86/IA-64 (unless a _CCA 
method was actually present to say otherwise), based on 
acpi_init_coherency(), but I only know for sure what happens on arm64.

> I think I should give up on this and just redefine the existing iommu
> cap flag to IOMMU_CAP_CACHE_SUPPORTED or something.

TBH I don't see any issue with current name, but I'd certainly be happy 
to nail down a specific definition for it, along the lines of "this 
means that IOMMU_CACHE mappings are generally coherent". That works for 
things like Arm's S2FWB making it OK to assign an otherwise-non-coherent 
device without extra hassle.

For the specific case of overriding PCIe No Snoop (which is more 
problematic from an Arm SMMU PoV) when assigning to a VM, would that not 
be easier solved by just having vfio-pci clear the "Enable No Snoop" 
control bit in the endpoint's PCIe capability?

>>> We could alternatively use existing device_get_dma_attr() as a default
>>> with an iommu wrapper and push the exception down through the iommu
>>> driver and s390 can override it.
>>>
>>
>> if going this way probably device_get_dma_attr() should be renamed to
>> device_fwnode_get_dma_attr() instead to make it clearer?
> 
> I'm looking at the few users:
> 
> drivers/ata/ahci_ceva.c
> drivers/ata/ahci_qoriq.c
>   - These are ARM only drivers. They are trying to copy the dma-coherent
>     property from its DT/ACPI definition to internal register settings
>     which look like they tune how the AXI bus transactions are created.
> 
>     I'm guessing the SATA IP block's AXI interface can be configured to
>     generate coherent or non-coherent requests and it has to be set
>     in a way that is consistent with the SOC architecture and match
>     what the DMA API expects the device will do.
> 
> drivers/crypto/ccp/sp-platform.c
>   - Only used on ARM64 and also programs a HW register similar to the
>     sata drivers. Refuses to work if the FW property is not present.
> 
> drivers/net/ethernet/amd/xgbe/xgbe-platform.c
>   - Seems to be configuring another ARM AXI block
> 
> drivers/gpu/drm/panfrost/panfrost_drv.c
>   - Robin's commit comment here is good, and one of the things this
>     controls is if the coherent_walk is set for the io-pgtable-arm.c
>     code which avoids DMA API calls
> 
> drivers/gpu/drm/tegra/uapi.c
>   - Returns DRM_TEGRA_CHANNEL_CAP_CACHE_COHERENT to userspace. No idea.
> 
> My take is that the drivers using this API are doing it to make sure
> their HW blocks are setup in a way that is consistent with the DMA API
> they are also using, and run in constrained embedded-style
> environments that know the firmware support is present.
> 
> So in the end it does not seem suitable right now for linking to
> IOMMU_CACHE..

That seems a pretty good summary - I think they're basically all 
"firmware told Linux I'm coherent so I'd better act coherent" cases, but 
that still doesn't necessarily mean that they're *forced* to respect 
that. One of the things on my to-do list is to try adding a 
DMA_ATTR_NO_SNOOP that can force DMA cache maintenance for coherent 
devices, primarily to hook up in Panfrost (where there is a bit of a 
performance to claw back on the coherent AmLogic SoCs by leaving certain 
buffers non-cacheable).

Cheers,
Robin.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 15:17                       ` Robin Murphy
  0 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-07 15:17 UTC (permalink / raw)
  To: Jason Gunthorpe, Tian, Kevin
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, Rob Clark, Gerald Schaefer,
	linux-arm-msm, linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Suravee Suthikulpanit, Christian Benvenuti, David Woodhouse,
	Lu Baolu

On 2022-04-07 14:59, Jason Gunthorpe wrote:
> On Thu, Apr 07, 2022 at 07:18:48AM +0000, Tian, Kevin wrote:
>>> From: Jason Gunthorpe <jgg@nvidia.com>
>>> Sent: Thursday, April 7, 2022 1:17 AM
>>>
>>> On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
>>>> On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
>>>>> On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
>>>>>> On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
>>>>>>>> Oh, I didn't know about device_get_dma_attr()..
>>>>>>
>>>>>> Which is completely broken for any non-OF, non-ACPI plaform.
>>>>>
>>>>> I saw that, but I spent some time searching and could not find an
>>>>> iommu driver that would load independently of OF or ACPI. ie no IOMMU
>>>>> platform drivers are created by board files. Things like Intel/AMD
>>>>> discover only from ACPI, etc.
>>
>> Intel discovers IOMMUs (and optionally ACPI namespace devices) from
>> ACPI, but there is no ACPI description for PCI devices i.e. the current
>> logic of device_get_dma_attr() cannot be used on PCI devices.
> 
> Oh? So on x86 acpi_get_dma_attr() returns DEV_DMA_NON_COHERENT or
> DEV_DMA_NOT_SUPPORTED?

I think it _should_ return DEV_DMA_COHERENT on x86/IA-64 (unless a _CCA 
method was actually present to say otherwise), based on 
acpi_init_coherency(), but I only know for sure what happens on arm64.

> I think I should give up on this and just redefine the existing iommu
> cap flag to IOMMU_CAP_CACHE_SUPPORTED or something.

TBH I don't see any issue with current name, but I'd certainly be happy 
to nail down a specific definition for it, along the lines of "this 
means that IOMMU_CACHE mappings are generally coherent". That works for 
things like Arm's S2FWB making it OK to assign an otherwise-non-coherent 
device without extra hassle.

For the specific case of overriding PCIe No Snoop (which is more 
problematic from an Arm SMMU PoV) when assigning to a VM, would that not 
be easier solved by just having vfio-pci clear the "Enable No Snoop" 
control bit in the endpoint's PCIe capability?

>>> We could alternatively use existing device_get_dma_attr() as a default
>>> with an iommu wrapper and push the exception down through the iommu
>>> driver and s390 can override it.
>>>
>>
>> if going this way probably device_get_dma_attr() should be renamed to
>> device_fwnode_get_dma_attr() instead to make it clearer?
> 
> I'm looking at the few users:
> 
> drivers/ata/ahci_ceva.c
> drivers/ata/ahci_qoriq.c
>   - These are ARM only drivers. They are trying to copy the dma-coherent
>     property from its DT/ACPI definition to internal register settings
>     which look like they tune how the AXI bus transactions are created.
> 
>     I'm guessing the SATA IP block's AXI interface can be configured to
>     generate coherent or non-coherent requests and it has to be set
>     in a way that is consistent with the SOC architecture and match
>     what the DMA API expects the device will do.
> 
> drivers/crypto/ccp/sp-platform.c
>   - Only used on ARM64 and also programs a HW register similar to the
>     sata drivers. Refuses to work if the FW property is not present.
> 
> drivers/net/ethernet/amd/xgbe/xgbe-platform.c
>   - Seems to be configuring another ARM AXI block
> 
> drivers/gpu/drm/panfrost/panfrost_drv.c
>   - Robin's commit comment here is good, and one of the things this
>     controls is if the coherent_walk is set for the io-pgtable-arm.c
>     code which avoids DMA API calls
> 
> drivers/gpu/drm/tegra/uapi.c
>   - Returns DRM_TEGRA_CHANNEL_CAP_CACHE_COHERENT to userspace. No idea.
> 
> My take is that the drivers using this API are doing it to make sure
> their HW blocks are setup in a way that is consistent with the DMA API
> they are also using, and run in constrained embedded-style
> environments that know the firmware support is present.
> 
> So in the end it does not seem suitable right now for linking to
> IOMMU_CACHE..

That seems a pretty good summary - I think they're basically all 
"firmware told Linux I'm coherent so I'd better act coherent" cases, but 
that still doesn't necessarily mean that they're *forced* to respect 
that. One of the things on my to-do list is to try adding a 
DMA_ATTR_NO_SNOOP that can force DMA cache maintenance for coherent 
devices, primarily to hook up in Panfrost (where there is a bit of a 
performance to claw back on the coherent AmLogic SoCs by leaving certain 
buffers non-cacheable).

Cheers,
Robin.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 15:17                       ` Robin Murphy
  0 siblings, 0 replies; 135+ messages in thread
From: Robin Murphy @ 2022-04-07 15:17 UTC (permalink / raw)
  To: Jason Gunthorpe, Tian, Kevin
  Cc: Christoph Hellwig, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon

On 2022-04-07 14:59, Jason Gunthorpe wrote:
> On Thu, Apr 07, 2022 at 07:18:48AM +0000, Tian, Kevin wrote:
>>> From: Jason Gunthorpe <jgg@nvidia.com>
>>> Sent: Thursday, April 7, 2022 1:17 AM
>>>
>>> On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
>>>> On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
>>>>> On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
>>>>>> On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
>>>>>>>> Oh, I didn't know about device_get_dma_attr()..
>>>>>>
>>>>>> Which is completely broken for any non-OF, non-ACPI plaform.
>>>>>
>>>>> I saw that, but I spent some time searching and could not find an
>>>>> iommu driver that would load independently of OF or ACPI. ie no IOMMU
>>>>> platform drivers are created by board files. Things like Intel/AMD
>>>>> discover only from ACPI, etc.
>>
>> Intel discovers IOMMUs (and optionally ACPI namespace devices) from
>> ACPI, but there is no ACPI description for PCI devices i.e. the current
>> logic of device_get_dma_attr() cannot be used on PCI devices.
> 
> Oh? So on x86 acpi_get_dma_attr() returns DEV_DMA_NON_COHERENT or
> DEV_DMA_NOT_SUPPORTED?

I think it _should_ return DEV_DMA_COHERENT on x86/IA-64 (unless a _CCA 
method was actually present to say otherwise), based on 
acpi_init_coherency(), but I only know for sure what happens on arm64.

> I think I should give up on this and just redefine the existing iommu
> cap flag to IOMMU_CAP_CACHE_SUPPORTED or something.

TBH I don't see any issue with current name, but I'd certainly be happy 
to nail down a specific definition for it, along the lines of "this 
means that IOMMU_CACHE mappings are generally coherent". That works for 
things like Arm's S2FWB making it OK to assign an otherwise-non-coherent 
device without extra hassle.

For the specific case of overriding PCIe No Snoop (which is more 
problematic from an Arm SMMU PoV) when assigning to a VM, would that not 
be easier solved by just having vfio-pci clear the "Enable No Snoop" 
control bit in the endpoint's PCIe capability?

>>> We could alternatively use existing device_get_dma_attr() as a default
>>> with an iommu wrapper and push the exception down through the iommu
>>> driver and s390 can override it.
>>>
>>
>> if going this way probably device_get_dma_attr() should be renamed to
>> device_fwnode_get_dma_attr() instead to make it clearer?
> 
> I'm looking at the few users:
> 
> drivers/ata/ahci_ceva.c
> drivers/ata/ahci_qoriq.c
>   - These are ARM only drivers. They are trying to copy the dma-coherent
>     property from its DT/ACPI definition to internal register settings
>     which look like they tune how the AXI bus transactions are created.
> 
>     I'm guessing the SATA IP block's AXI interface can be configured to
>     generate coherent or non-coherent requests and it has to be set
>     in a way that is consistent with the SOC architecture and match
>     what the DMA API expects the device will do.
> 
> drivers/crypto/ccp/sp-platform.c
>   - Only used on ARM64 and also programs a HW register similar to the
>     sata drivers. Refuses to work if the FW property is not present.
> 
> drivers/net/ethernet/amd/xgbe/xgbe-platform.c
>   - Seems to be configuring another ARM AXI block
> 
> drivers/gpu/drm/panfrost/panfrost_drv.c
>   - Robin's commit comment here is good, and one of the things this
>     controls is if the coherent_walk is set for the io-pgtable-arm.c
>     code which avoids DMA API calls
> 
> drivers/gpu/drm/tegra/uapi.c
>   - Returns DRM_TEGRA_CHANNEL_CAP_CACHE_COHERENT to userspace. No idea.
> 
> My take is that the drivers using this API are doing it to make sure
> their HW blocks are setup in a way that is consistent with the DMA API
> they are also using, and run in constrained embedded-style
> environments that know the firmware support is present.
> 
> So in the end it does not seem suitable right now for linking to
> IOMMU_CACHE..

That seems a pretty good summary - I think they're basically all 
"firmware told Linux I'm coherent so I'd better act coherent" cases, but 
that still doesn't necessarily mean that they're *forced* to respect 
that. One of the things on my to-do list is to try adding a 
DMA_ATTR_NO_SNOOP that can force DMA cache maintenance for coherent 
devices, primarily to hook up in Panfrost (where there is a bit of a 
performance to claw back on the coherent AmLogic SoCs by leaving certain 
buffers non-cacheable).

Cheers,
Robin.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-07 15:17                       ` Robin Murphy
  (?)
@ 2022-04-07 15:23                         ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-07 15:23 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tian, Kevin, Christoph Hellwig, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon

On Thu, Apr 07, 2022 at 04:17:11PM +0100, Robin Murphy wrote:

> For the specific case of overriding PCIe No Snoop (which is more problematic
> from an Arm SMMU PoV) when assigning to a VM, would that not be easier
> solved by just having vfio-pci clear the "Enable No Snoop" control bit in
> the endpoint's PCIe capability?

Ideally.

That was rediscussed recently, apparently there are non-compliant
devices and drivers that just ignore the bit. 

Presumably this is why x86 had to move to an IOMMU enforced feature..

> That seems a pretty good summary - I think they're basically all "firmware
> told Linux I'm coherent so I'd better act coherent" cases, but that still
> doesn't necessarily mean that they're *forced* to respect that. One of the
> things on my to-do list is to try adding a DMA_ATTR_NO_SNOOP that can force
> DMA cache maintenance for coherent devices, primarily to hook up in Panfrost
> (where there is a bit of a performance to claw back on the coherent AmLogic
> SoCs by leaving certain buffers non-cacheable).

It would be great to see that in a way that could bring in the few other
GPU drivers doing no-snoop to a formal DMA API instead of hacking
their own stuff with wbinvd calls or whatever.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 15:23                         ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe via iommu @ 2022-04-07 15:23 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Gerald Schaefer, Tian, Kevin, linux-arm-msm, Alex Williamson,
	linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, David Woodhouse

On Thu, Apr 07, 2022 at 04:17:11PM +0100, Robin Murphy wrote:

> For the specific case of overriding PCIe No Snoop (which is more problematic
> from an Arm SMMU PoV) when assigning to a VM, would that not be easier
> solved by just having vfio-pci clear the "Enable No Snoop" control bit in
> the endpoint's PCIe capability?

Ideally.

That was rediscussed recently, apparently there are non-compliant
devices and drivers that just ignore the bit. 

Presumably this is why x86 had to move to an IOMMU enforced feature..

> That seems a pretty good summary - I think they're basically all "firmware
> told Linux I'm coherent so I'd better act coherent" cases, but that still
> doesn't necessarily mean that they're *forced* to respect that. One of the
> things on my to-do list is to try adding a DMA_ATTR_NO_SNOOP that can force
> DMA cache maintenance for coherent devices, primarily to hook up in Panfrost
> (where there is a bit of a performance to claw back on the coherent AmLogic
> SoCs by leaving certain buffers non-cacheable).

It would be great to see that in a way that could bring in the few other
GPU drivers doing no-snoop to a formal DMA API instead of hacking
their own stuff with wbinvd calls or whatever.

Thanks,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 15:23                         ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2022-04-07 15:23 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tian, Kevin, Christoph Hellwig, Alex Williamson, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon

On Thu, Apr 07, 2022 at 04:17:11PM +0100, Robin Murphy wrote:

> For the specific case of overriding PCIe No Snoop (which is more problematic
> from an Arm SMMU PoV) when assigning to a VM, would that not be easier
> solved by just having vfio-pci clear the "Enable No Snoop" control bit in
> the endpoint's PCIe capability?

Ideally.

That was rediscussed recently, apparently there are non-compliant
devices and drivers that just ignore the bit. 

Presumably this is why x86 had to move to an IOMMU enforced feature..

> That seems a pretty good summary - I think they're basically all "firmware
> told Linux I'm coherent so I'd better act coherent" cases, but that still
> doesn't necessarily mean that they're *forced* to respect that. One of the
> things on my to-do list is to try adding a DMA_ATTR_NO_SNOOP that can force
> DMA cache maintenance for coherent devices, primarily to hook up in Panfrost
> (where there is a bit of a performance to claw back on the coherent AmLogic
> SoCs by leaving certain buffers non-cacheable).

It would be great to see that in a way that could bring in the few other
GPU drivers doing no-snoop to a formal DMA API instead of hacking
their own stuff with wbinvd calls or whatever.

Thanks,
Jason

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-07 15:17                       ` Robin Murphy
  (?)
  (?)
@ 2022-04-07 15:31                         ` Christoph Hellwig
  -1 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-07 15:31 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, Rob Clark, Jason Gunthorpe,
	Gerald Schaefer, linux-arm-msm, linux-arm-kernel, netdev,
	Cornelia Huck, iommu, Suravee Suthikulpanit, Christian Benvenuti,
	David Woodhouse, Lu Baolu

On Thu, Apr 07, 2022 at 04:17:11PM +0100, Robin Murphy wrote:
>> My take is that the drivers using this API are doing it to make sure
>> their HW blocks are setup in a way that is consistent with the DMA API
>> they are also using, and run in constrained embedded-style
>> environments that know the firmware support is present.
>>
>> So in the end it does not seem suitable right now for linking to
>> IOMMU_CACHE..
>
> That seems a pretty good summary - I think they're basically all "firmware 
> told Linux I'm coherent so I'd better act coherent" cases, but that still 
> doesn't necessarily mean that they're *forced* to respect that.

Yes. And the interface is horribly misnamed for that.  I'll see what
I can do to clean this up as I've noticed various other not very
nice things in that area.

> One of the 
> things on my to-do list is to try adding a DMA_ATTR_NO_SNOOP that can force 
> DMA cache maintenance for coherent devices, primarily to hook up in 
> Panfrost (where there is a bit of a performance to claw back on the 
> coherent AmLogic SoCs by leaving certain buffers non-cacheable).

This has been an explicit request from the amdgpu folks and thus been
on my TODO list for quite a while as well.  Note that I don't think it
should be a flag to dma_alloc_attrs, but rather for dma_alloc_pages
as the drivers that want non-snoop generally also want to actually
be able to deal with pages.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 15:31                         ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-07 15:31 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	Jason Gunthorpe, Gerald Schaefer, Tian, Kevin, linux-arm-msm,
	Alex Williamson, linux-arm-kernel, netdev, Cornelia Huck, iommu,
	Christian Benvenuti, David Woodhouse

On Thu, Apr 07, 2022 at 04:17:11PM +0100, Robin Murphy wrote:
>> My take is that the drivers using this API are doing it to make sure
>> their HW blocks are setup in a way that is consistent with the DMA API
>> they are also using, and run in constrained embedded-style
>> environments that know the firmware support is present.
>>
>> So in the end it does not seem suitable right now for linking to
>> IOMMU_CACHE..
>
> That seems a pretty good summary - I think they're basically all "firmware 
> told Linux I'm coherent so I'd better act coherent" cases, but that still 
> doesn't necessarily mean that they're *forced* to respect that.

Yes. And the interface is horribly misnamed for that.  I'll see what
I can do to clean this up as I've noticed various other not very
nice things in that area.

> One of the 
> things on my to-do list is to try adding a DMA_ATTR_NO_SNOOP that can force 
> DMA cache maintenance for coherent devices, primarily to hook up in 
> Panfrost (where there is a bit of a performance to claw back on the 
> coherent AmLogic SoCs by leaving certain buffers non-cacheable).

This has been an explicit request from the amdgpu folks and thus been
on my TODO list for quite a while as well.  Note that I don't think it
should be a flag to dma_alloc_attrs, but rather for dma_alloc_pages
as the drivers that want non-snoop generally also want to actually
be able to deal with pages.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 15:31                         ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-07 15:31 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jason Gunthorpe, Tian, Kevin, Christoph Hellwig, Alex Williamson,
	Lu Baolu, Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon

On Thu, Apr 07, 2022 at 04:17:11PM +0100, Robin Murphy wrote:
>> My take is that the drivers using this API are doing it to make sure
>> their HW blocks are setup in a way that is consistent with the DMA API
>> they are also using, and run in constrained embedded-style
>> environments that know the firmware support is present.
>>
>> So in the end it does not seem suitable right now for linking to
>> IOMMU_CACHE..
>
> That seems a pretty good summary - I think they're basically all "firmware 
> told Linux I'm coherent so I'd better act coherent" cases, but that still 
> doesn't necessarily mean that they're *forced* to respect that.

Yes. And the interface is horribly misnamed for that.  I'll see what
I can do to clean this up as I've noticed various other not very
nice things in that area.

> One of the 
> things on my to-do list is to try adding a DMA_ATTR_NO_SNOOP that can force 
> DMA cache maintenance for coherent devices, primarily to hook up in 
> Panfrost (where there is a bit of a performance to claw back on the 
> coherent AmLogic SoCs by leaving certain buffers non-cacheable).

This has been an explicit request from the amdgpu folks and thus been
on my TODO list for quite a while as well.  Note that I don't think it
should be a flag to dma_alloc_attrs, but rather for dma_alloc_pages
as the drivers that want non-snoop generally also want to actually
be able to deal with pages.

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 15:31                         ` Christoph Hellwig
  0 siblings, 0 replies; 135+ messages in thread
From: Christoph Hellwig @ 2022-04-07 15:31 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jason Gunthorpe, Tian, Kevin, Christoph Hellwig, Alex Williamson,
	Lu Baolu, Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon

On Thu, Apr 07, 2022 at 04:17:11PM +0100, Robin Murphy wrote:
>> My take is that the drivers using this API are doing it to make sure
>> their HW blocks are setup in a way that is consistent with the DMA API
>> they are also using, and run in constrained embedded-style
>> environments that know the firmware support is present.
>>
>> So in the end it does not seem suitable right now for linking to
>> IOMMU_CACHE..
>
> That seems a pretty good summary - I think they're basically all "firmware 
> told Linux I'm coherent so I'd better act coherent" cases, but that still 
> doesn't necessarily mean that they're *forced* to respect that.

Yes. And the interface is horribly misnamed for that.  I'll see what
I can do to clean this up as I've noticed various other not very
nice things in that area.

> One of the 
> things on my to-do list is to try adding a DMA_ATTR_NO_SNOOP that can force 
> DMA cache maintenance for coherent devices, primarily to hook up in 
> Panfrost (where there is a bit of a performance to claw back on the 
> coherent AmLogic SoCs by leaving certain buffers non-cacheable).

This has been an explicit request from the amdgpu folks and thus been
on my TODO list for quite a while as well.  Note that I don't think it
should be a flag to dma_alloc_attrs, but rather for dma_alloc_pages
as the drivers that want non-snoop generally also want to actually
be able to deal with pages.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
  2022-04-07 15:23                         ` Jason Gunthorpe via iommu
  (?)
  (?)
@ 2022-04-07 22:37                           ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-07 22:37 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Michael S. Tsirkin, Jason Wang, virtualization,
	Will Deacon, Christoph Hellwig, linux-s390, kvm, linux-rdma,
	iommu, Gerald Schaefer, Tian, Kevin, linux-arm-msm, Robin Murphy,
	linux-arm-kernel, netdev, Cornelia Huck, Christian Benvenuti,
	David Woodhouse

On Thu, 7 Apr 2022 12:23:31 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Apr 07, 2022 at 04:17:11PM +0100, Robin Murphy wrote:
> 
> > For the specific case of overriding PCIe No Snoop (which is more problematic
> > from an Arm SMMU PoV) when assigning to a VM, would that not be easier
> > solved by just having vfio-pci clear the "Enable No Snoop" control bit in
> > the endpoint's PCIe capability?  
> 
> Ideally.
> 
> That was rediscussed recently, apparently there are non-compliant
> devices and drivers that just ignore the bit. 
> 
> Presumably this is why x86 had to move to an IOMMU enforced feature..

I considered this option when implementing the current solution, but
ultimately I didn't have confidence in being able to prevent drivers
from using device specific means to effect the change anyway.  GPUs
especially have various back channels to config space.  Thanks,

Alex

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 22:37                           ` Alex Williamson
  0 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-07 22:37 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nelson Escobar, Matthew Rosato, Michael S. Tsirkin,
	virtualization, Will Deacon, Christoph Hellwig, linux-s390, kvm,
	linux-rdma, Joerg Roedel, iommu, Gerald Schaefer, linux-arm-msm,
	Robin Murphy, linux-arm-kernel, netdev, Cornelia Huck, Rob Clark,
	Suravee Suthikulpanit, Christian Benvenuti, David Woodhouse,
	Lu Baolu

On Thu, 7 Apr 2022 12:23:31 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Apr 07, 2022 at 04:17:11PM +0100, Robin Murphy wrote:
> 
> > For the specific case of overriding PCIe No Snoop (which is more problematic
> > from an Arm SMMU PoV) when assigning to a VM, would that not be easier
> > solved by just having vfio-pci clear the "Enable No Snoop" control bit in
> > the endpoint's PCIe capability?  
> 
> Ideally.
> 
> That was rediscussed recently, apparently there are non-compliant
> devices and drivers that just ignore the bit. 
> 
> Presumably this is why x86 had to move to an IOMMU enforced feature..

I considered this option when implementing the current solution, but
ultimately I didn't have confidence in being able to prevent drivers
from using device specific means to effect the change anyway.  GPUs
especially have various back channels to config space.  Thanks,

Alex

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 22:37                           ` Alex Williamson
  0 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-07 22:37 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Robin Murphy, Tian, Kevin, Christoph Hellwig, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon

On Thu, 7 Apr 2022 12:23:31 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Apr 07, 2022 at 04:17:11PM +0100, Robin Murphy wrote:
> 
> > For the specific case of overriding PCIe No Snoop (which is more problematic
> > from an Arm SMMU PoV) when assigning to a VM, would that not be easier
> > solved by just having vfio-pci clear the "Enable No Snoop" control bit in
> > the endpoint's PCIe capability?  
> 
> Ideally.
> 
> That was rediscussed recently, apparently there are non-compliant
> devices and drivers that just ignore the bit. 
> 
> Presumably this is why x86 had to move to an IOMMU enforced feature..

I considered this option when implementing the current solution, but
ultimately I didn't have confidence in being able to prevent drivers
from using device specific means to effect the change anyway.  GPUs
especially have various back channels to config space.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()
@ 2022-04-07 22:37                           ` Alex Williamson
  0 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2022-04-07 22:37 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Robin Murphy, Tian, Kevin, Christoph Hellwig, Lu Baolu,
	Christian Benvenuti, Cornelia Huck, David Woodhouse,
	Gerald Schaefer, iommu, Jason Wang, Joerg Roedel, kvm,
	linux-arm-kernel, linux-arm-msm, linux-rdma, linux-s390,
	Matthew Rosato, Michael S. Tsirkin, Nelson Escobar, netdev,
	Rob Clark, Suravee Suthikulpanit, virtualization, Will Deacon

On Thu, 7 Apr 2022 12:23:31 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Apr 07, 2022 at 04:17:11PM +0100, Robin Murphy wrote:
> 
> > For the specific case of overriding PCIe No Snoop (which is more problematic
> > from an Arm SMMU PoV) when assigning to a VM, would that not be easier
> > solved by just having vfio-pci clear the "Enable No Snoop" control bit in
> > the endpoint's PCIe capability?  
> 
> Ideally.
> 
> That was rediscussed recently, apparently there are non-compliant
> devices and drivers that just ignore the bit. 
> 
> Presumably this is why x86 had to move to an IOMMU enforced feature..

I considered this option when implementing the current solution, but
ultimately I didn't have confidence in being able to prevent drivers
from using device specific means to effect the change anyway.  GPUs
especially have various back channels to config space.  Thanks,

Alex


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 135+ messages in thread

end of thread, other threads:[~2022-04-07 22:39 UTC | newest]

Thread overview: 135+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-05 16:15 [PATCH 0/5] Make the iommu driver no-snoop block feature consistent Jason Gunthorpe via iommu
2022-04-05 16:15 ` Jason Gunthorpe
2022-04-05 16:15 ` Jason Gunthorpe
2022-04-05 16:16 ` [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent() Jason Gunthorpe via iommu
2022-04-05 16:16   ` Jason Gunthorpe
2022-04-05 16:16   ` Jason Gunthorpe
2022-04-06  5:30   ` Christoph Hellwig
2022-04-06  5:30     ` Christoph Hellwig
2022-04-06  5:30     ` Christoph Hellwig
2022-04-06  5:30     ` Christoph Hellwig
2022-04-06 12:07     ` Jason Gunthorpe via iommu
2022-04-06 12:07       ` Jason Gunthorpe
2022-04-06 12:07       ` Jason Gunthorpe
2022-04-06 13:51       ` Christoph Hellwig
2022-04-06 13:51         ` Christoph Hellwig
2022-04-06 13:51         ` Christoph Hellwig
2022-04-06 13:51         ` Christoph Hellwig
2022-04-06 14:14         ` Jason Gunthorpe via iommu
2022-04-06 14:14           ` Jason Gunthorpe
2022-04-06 14:14           ` Jason Gunthorpe
2022-04-06 15:47           ` Christoph Hellwig
2022-04-06 15:47             ` Christoph Hellwig
2022-04-06 15:47             ` Christoph Hellwig
2022-04-06 15:47             ` Christoph Hellwig
2022-04-06 15:48             ` Jason Gunthorpe via iommu
2022-04-06 15:48               ` Jason Gunthorpe
2022-04-06 15:48               ` Jason Gunthorpe
2022-04-06 15:48           ` Robin Murphy
2022-04-06 15:48             ` Robin Murphy
2022-04-06 15:48             ` Robin Murphy
2022-04-06 15:48             ` Robin Murphy
2022-04-06 13:56   ` Robin Murphy
2022-04-06 13:56     ` Robin Murphy
2022-04-06 13:56     ` Robin Murphy
2022-04-06 13:56     ` Robin Murphy
2022-04-06 14:24     ` Jason Gunthorpe via iommu
2022-04-06 14:24       ` Jason Gunthorpe
2022-04-06 14:24       ` Jason Gunthorpe
2022-04-06 15:18       ` Jason Gunthorpe via iommu
2022-04-06 15:18         ` Jason Gunthorpe
2022-04-06 15:18         ` Jason Gunthorpe
2022-04-06 15:50         ` Christoph Hellwig
2022-04-06 15:50           ` Christoph Hellwig
2022-04-06 15:50           ` Christoph Hellwig
2022-04-06 15:50           ` Christoph Hellwig
2022-04-06 16:06           ` Jason Gunthorpe via iommu
2022-04-06 16:06             ` Jason Gunthorpe
2022-04-06 16:06             ` Jason Gunthorpe
2022-04-06 16:10             ` Christoph Hellwig
2022-04-06 16:10               ` Christoph Hellwig
2022-04-06 16:10               ` Christoph Hellwig
2022-04-06 16:10               ` Christoph Hellwig
2022-04-06 17:17               ` Jason Gunthorpe via iommu
2022-04-06 17:17                 ` Jason Gunthorpe
2022-04-06 17:17                 ` Jason Gunthorpe
2022-04-07  7:18                 ` Tian, Kevin
2022-04-07  7:18                   ` Tian, Kevin
2022-04-07  7:18                   ` Tian, Kevin
2022-04-07  7:18                   ` Tian, Kevin
2022-04-07 13:59                   ` Jason Gunthorpe via iommu
2022-04-07 13:59                     ` Jason Gunthorpe
2022-04-07 13:59                     ` Jason Gunthorpe
2022-04-07 15:17                     ` Robin Murphy
2022-04-07 15:17                       ` Robin Murphy
2022-04-07 15:17                       ` Robin Murphy
2022-04-07 15:17                       ` Robin Murphy
2022-04-07 15:23                       ` Jason Gunthorpe
2022-04-07 15:23                         ` Jason Gunthorpe
2022-04-07 15:23                         ` Jason Gunthorpe via iommu
2022-04-07 22:37                         ` Alex Williamson
2022-04-07 22:37                           ` Alex Williamson
2022-04-07 22:37                           ` Alex Williamson
2022-04-07 22:37                           ` Alex Williamson
2022-04-07 15:31                       ` Christoph Hellwig
2022-04-07 15:31                         ` Christoph Hellwig
2022-04-07 15:31                         ` Christoph Hellwig
2022-04-07 15:31                         ` Christoph Hellwig
2022-04-07  8:53                 ` Niklas Schnelle
2022-04-07  8:53                   ` Niklas Schnelle
2022-04-07  8:53                   ` Niklas Schnelle
2022-04-05 16:16 ` [PATCH 2/5] vfio: Require that devices support DMA cache coherence Jason Gunthorpe via iommu
2022-04-05 16:16   ` Jason Gunthorpe
2022-04-05 16:16   ` Jason Gunthorpe
2022-04-05 19:10   ` Alex Williamson
2022-04-05 19:10     ` Alex Williamson
2022-04-05 19:10     ` Alex Williamson
2022-04-05 19:10     ` Alex Williamson
2022-04-05 19:29     ` Jason Gunthorpe via iommu
2022-04-05 19:29       ` Jason Gunthorpe
2022-04-05 19:29       ` Jason Gunthorpe
2022-04-06  7:02       ` Tian, Kevin
2022-04-06  7:02         ` Tian, Kevin
2022-04-06  7:02         ` Tian, Kevin
2022-04-06  7:02         ` Tian, Kevin
2022-04-07 14:53         ` Jason Gunthorpe
2022-04-07 14:53           ` Jason Gunthorpe
2022-04-07 14:53           ` Jason Gunthorpe via iommu
2022-04-05 16:16 ` [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency() Jason Gunthorpe via iommu
2022-04-05 16:16   ` Jason Gunthorpe
2022-04-05 16:16   ` Jason Gunthorpe
2022-04-05 19:50   ` Alex Williamson
2022-04-05 19:50     ` Alex Williamson
2022-04-05 19:50     ` Alex Williamson
2022-04-05 19:50     ` Alex Williamson
2022-04-05 22:57     ` Jason Gunthorpe via iommu
2022-04-05 22:57       ` Jason Gunthorpe
2022-04-05 22:57       ` Jason Gunthorpe
2022-04-05 23:31       ` Tian, Kevin
2022-04-05 23:31         ` Tian, Kevin
2022-04-05 23:31         ` Tian, Kevin
2022-04-05 23:31         ` Tian, Kevin
2022-04-06  0:08       ` Tian, Kevin
2022-04-06  0:08         ` Tian, Kevin
2022-04-06  0:08         ` Tian, Kevin
2022-04-06  0:08         ` Tian, Kevin
2022-04-06  7:09   ` Tian, Kevin
2022-04-06  7:09     ` Tian, Kevin
2022-04-06  7:09     ` Tian, Kevin
2022-04-06  7:09     ` Tian, Kevin
2022-04-06 12:27     ` Jason Gunthorpe via iommu
2022-04-06 12:27       ` Jason Gunthorpe
2022-04-06 12:27       ` Jason Gunthorpe
2022-04-05 16:16 ` [PATCH 4/5] vfio: Move the Intel no-snoop control off of IOMMU_CACHE Jason Gunthorpe via iommu
2022-04-05 16:16   ` Jason Gunthorpe
2022-04-05 16:16   ` Jason Gunthorpe
2022-04-05 16:16 ` [PATCH 5/5] iommu: Delete IOMMU_CAP_CACHE_COHERENCY Jason Gunthorpe via iommu
2022-04-05 16:16   ` Jason Gunthorpe
2022-04-05 16:16   ` Jason Gunthorpe
2022-04-06  6:52 ` [PATCH 0/5] Make the iommu driver no-snoop block feature consistent Tian, Kevin
2022-04-06  6:52   ` Tian, Kevin
2022-04-06  6:52   ` Tian, Kevin
2022-04-06  6:52   ` Tian, Kevin
2022-04-07 14:56   ` Jason Gunthorpe
2022-04-07 14:56     ` Jason Gunthorpe
2022-04-07 14:56     ` Jason Gunthorpe via iommu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.