All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] Add PCI ATS support to Arm SMMUv3
@ 2019-03-20 17:36 ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-20 17:36 UTC (permalink / raw)
  To: will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, okaya, joro, rjw, linux-acpi,
	iommu, hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel,
	lenb

Add ATS support to the SMMUv3 driver. The previous posting was about a
year ago, as part of SVA v2 [1]. I feel slightly more confident
upstreaming this now that we can disable the feature for untrusted
devices. It's a low-hanging fruit in the SVA patch stack, and a good
candidate for v5.2.

As for testing, there is a public (free as in beer) software model that
has both SMMUv3 and a SATA controller with ATS [2]. It's not much since,
except for a dev_dbg message, you won't notice a difference when
enabling ATS, but it can still be used for checking that the patches
don't break anything.

[1] https://www.spinics.net/lists/kvm/msg168742.html
[2] https://patchwork.kernel.org/patch/10781793/

Jean-Philippe Brucker (4):
  ACPI/IORT: Check ATS capability in root complex nodes
  iommu/arm-smmu-v3: Link domains and devices
  iommu/arm-smmu-v3: Add support for PCI ATS
  iommu/arm-smmu-v3: Disable tagged pointers

 drivers/acpi/arm64/iort.c   |  11 ++
 drivers/iommu/arm-smmu-v3.c | 250 +++++++++++++++++++++++++++++++++++-
 include/linux/iommu.h       |   4 +
 3 files changed, 259 insertions(+), 6 deletions(-)

-- 
2.21.0

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 0/4] Add PCI ATS support to Arm SMMUv3
@ 2019-03-20 17:36 ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-20 17:36 UTC (permalink / raw)
  To: will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, okaya, joro, rjw, linux-acpi,
	iommu, hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel,
	lenb

Add ATS support to the SMMUv3 driver. The previous posting was about a
year ago, as part of SVA v2 [1]. I feel slightly more confident
upstreaming this now that we can disable the feature for untrusted
devices. It's a low-hanging fruit in the SVA patch stack, and a good
candidate for v5.2.

As for testing, there is a public (free as in beer) software model that
has both SMMUv3 and a SATA controller with ATS [2]. It's not much since,
except for a dev_dbg message, you won't notice a difference when
enabling ATS, but it can still be used for checking that the patches
don't break anything.

[1] https://www.spinics.net/lists/kvm/msg168742.html
[2] https://patchwork.kernel.org/patch/10781793/

Jean-Philippe Brucker (4):
  ACPI/IORT: Check ATS capability in root complex nodes
  iommu/arm-smmu-v3: Link domains and devices
  iommu/arm-smmu-v3: Add support for PCI ATS
  iommu/arm-smmu-v3: Disable tagged pointers

 drivers/acpi/arm64/iort.c   |  11 ++
 drivers/iommu/arm-smmu-v3.c | 250 +++++++++++++++++++++++++++++++++++-
 include/linux/iommu.h       |   4 +
 3 files changed, 259 insertions(+), 6 deletions(-)

-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 1/4] ACPI/IORT: Check ATS capability in root complex nodes
  2019-03-20 17:36 ` Jean-Philippe Brucker
@ 2019-03-20 17:36   ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-20 17:36 UTC (permalink / raw)
  To: will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, okaya, joro, rjw, linux-acpi,
	iommu, hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel,
	lenb

Root complex node in IORT has a bit telling whether it supports ATS or
not. Store this bit in the IOMMU fwspec when setting up a device, so it
can be accessed later by an IOMMU driver.

Use the negative version (NO_ATS) at the moment because it's not clear
if/how the bit needs to be integrated in other firmware descriptions. The
SMMU has a feature bit telling if it supports ATS, which might be
sufficient in most systems for deciding whether or not we should enable
the ATS capability in endpoints.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 drivers/acpi/arm64/iort.c | 11 +++++++++++
 include/linux/iommu.h     |  4 ++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index e48894e002ba..7f2c1c9c6b38 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1028,6 +1028,14 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
 	dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
 }
 
+static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node)
+{
+	struct acpi_iort_root_complex *pci_rc;
+
+	pci_rc = (struct acpi_iort_root_complex *)node->node_data;
+	return pci_rc->ats_attribute & ACPI_IORT_ATS_SUPPORTED;
+}
+
 /**
  * iort_iommu_configure - Set-up IOMMU configuration for a device.
  *
@@ -1063,6 +1071,9 @@ const struct iommu_ops *iort_iommu_configure(struct device *dev)
 		info.node = node;
 		err = pci_for_each_dma_alias(to_pci_dev(dev),
 					     iort_pci_iommu_init, &info);
+
+		if (!err && !iort_pci_rc_supports_ats(node))
+			dev->iommu_fwspec->flags |= IOMMU_FWSPEC_PCI_NO_ATS;
 	} else {
 		int i = 0;
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 3dbeb457fb16..ed6738c358ca 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -509,10 +509,14 @@ struct iommu_fwspec {
 	const struct iommu_ops	*ops;
 	struct fwnode_handle	*iommu_fwnode;
 	void			*iommu_priv;
+	u32			flags;
 	unsigned int		num_ids;
 	u32			ids[1];
 };
 
+/* Firmware disabled ATS in the root complex */
+#define IOMMU_FWSPEC_PCI_NO_ATS			(1 << 0)
+
 int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
 		      const struct iommu_ops *ops);
 void iommu_fwspec_free(struct device *dev);
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 1/4] ACPI/IORT: Check ATS capability in root complex nodes
@ 2019-03-20 17:36   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-20 17:36 UTC (permalink / raw)
  To: will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, okaya, joro, rjw, linux-acpi,
	iommu, hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel,
	lenb

Root complex node in IORT has a bit telling whether it supports ATS or
not. Store this bit in the IOMMU fwspec when setting up a device, so it
can be accessed later by an IOMMU driver.

Use the negative version (NO_ATS) at the moment because it's not clear
if/how the bit needs to be integrated in other firmware descriptions. The
SMMU has a feature bit telling if it supports ATS, which might be
sufficient in most systems for deciding whether or not we should enable
the ATS capability in endpoints.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 drivers/acpi/arm64/iort.c | 11 +++++++++++
 include/linux/iommu.h     |  4 ++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index e48894e002ba..7f2c1c9c6b38 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1028,6 +1028,14 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
 	dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
 }
 
+static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node)
+{
+	struct acpi_iort_root_complex *pci_rc;
+
+	pci_rc = (struct acpi_iort_root_complex *)node->node_data;
+	return pci_rc->ats_attribute & ACPI_IORT_ATS_SUPPORTED;
+}
+
 /**
  * iort_iommu_configure - Set-up IOMMU configuration for a device.
  *
@@ -1063,6 +1071,9 @@ const struct iommu_ops *iort_iommu_configure(struct device *dev)
 		info.node = node;
 		err = pci_for_each_dma_alias(to_pci_dev(dev),
 					     iort_pci_iommu_init, &info);
+
+		if (!err && !iort_pci_rc_supports_ats(node))
+			dev->iommu_fwspec->flags |= IOMMU_FWSPEC_PCI_NO_ATS;
 	} else {
 		int i = 0;
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 3dbeb457fb16..ed6738c358ca 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -509,10 +509,14 @@ struct iommu_fwspec {
 	const struct iommu_ops	*ops;
 	struct fwnode_handle	*iommu_fwnode;
 	void			*iommu_priv;
+	u32			flags;
 	unsigned int		num_ids;
 	u32			ids[1];
 };
 
+/* Firmware disabled ATS in the root complex */
+#define IOMMU_FWSPEC_PCI_NO_ATS			(1 << 0)
+
 int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
 		      const struct iommu_ops *ops);
 void iommu_fwspec_free(struct device *dev);
-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
  2019-03-20 17:36 ` Jean-Philippe Brucker
@ 2019-03-20 17:36   ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-20 17:36 UTC (permalink / raw)
  To: will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, okaya, joro, rjw, linux-acpi,
	iommu, hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel,
	lenb

When removing a mapping from a domain, we need to send an invalidation to
all devices that might have stored it in their Address Translation Cache
(ATC). In addition when updating the context descriptor of a live domain,
we'll need to send invalidations for all devices attached to it.

Maintain a list of devices in each domain, protected by a spinlock. It is
updated every time we attach or detach devices to and from domains.

It needs to be a spinlock because we'll invalidate ATC entries from
within hardirq-safe contexts, but it may be possible to relax the read
side with RCU later.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d3880010c6cf..66a29c113dbc 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -594,6 +594,11 @@ struct arm_smmu_device {
 struct arm_smmu_master_data {
 	struct arm_smmu_device		*smmu;
 	struct arm_smmu_strtab_ent	ste;
+
+	struct arm_smmu_domain		*domain;
+	struct list_head		domain_head;
+
+	struct device			*dev;
 };
 
 /* SMMU private data for an IOMMU domain */
@@ -618,6 +623,9 @@ struct arm_smmu_domain {
 	};
 
 	struct iommu_domain		domain;
+
+	struct list_head		devices;
+	spinlock_t			devices_lock;
 };
 
 struct arm_smmu_option_prop {
@@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 	}
 
 	mutex_init(&smmu_domain->init_mutex);
+	INIT_LIST_HEAD(&smmu_domain->devices);
+	spin_lock_init(&smmu_domain->devices_lock);
+
 	return &smmu_domain->domain;
 }
 
@@ -1711,8 +1722,18 @@ static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
 
 static void arm_smmu_detach_dev(struct device *dev)
 {
+	unsigned long flags;
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_master_data *master = fwspec->iommu_priv;
+	struct arm_smmu_domain *smmu_domain = master->domain;
+
+	if (smmu_domain) {
+		spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+		list_del(&master->domain_head);
+		spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+
+		master->domain = NULL;
+	}
 
 	master->ste.assigned = false;
 	arm_smmu_install_ste_for_dev(fwspec);
@@ -1721,6 +1742,7 @@ static void arm_smmu_detach_dev(struct device *dev)
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret = 0;
+	unsigned long flags;
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -1757,6 +1779,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	}
 
 	ste->assigned = true;
+	master->domain = smmu_domain;
+
+	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+	list_add(&master->domain_head, &smmu_domain->devices);
+	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 
 	if (smmu_domain->stage == ARM_SMMU_DOMAIN_BYPASS) {
 		ste->s1_cfg = NULL;
@@ -1883,6 +1910,7 @@ static int arm_smmu_add_device(struct device *dev)
 			return -ENOMEM;
 
 		master->smmu = smmu;
+		master->dev = dev;
 		fwspec->iommu_priv = master;
 	}
 
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-03-20 17:36   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-20 17:36 UTC (permalink / raw)
  To: will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, okaya, joro, rjw, linux-acpi,
	iommu, hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel,
	lenb

When removing a mapping from a domain, we need to send an invalidation to
all devices that might have stored it in their Address Translation Cache
(ATC). In addition when updating the context descriptor of a live domain,
we'll need to send invalidations for all devices attached to it.

Maintain a list of devices in each domain, protected by a spinlock. It is
updated every time we attach or detach devices to and from domains.

It needs to be a spinlock because we'll invalidate ATC entries from
within hardirq-safe contexts, but it may be possible to relax the read
side with RCU later.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d3880010c6cf..66a29c113dbc 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -594,6 +594,11 @@ struct arm_smmu_device {
 struct arm_smmu_master_data {
 	struct arm_smmu_device		*smmu;
 	struct arm_smmu_strtab_ent	ste;
+
+	struct arm_smmu_domain		*domain;
+	struct list_head		domain_head;
+
+	struct device			*dev;
 };
 
 /* SMMU private data for an IOMMU domain */
@@ -618,6 +623,9 @@ struct arm_smmu_domain {
 	};
 
 	struct iommu_domain		domain;
+
+	struct list_head		devices;
+	spinlock_t			devices_lock;
 };
 
 struct arm_smmu_option_prop {
@@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 	}
 
 	mutex_init(&smmu_domain->init_mutex);
+	INIT_LIST_HEAD(&smmu_domain->devices);
+	spin_lock_init(&smmu_domain->devices_lock);
+
 	return &smmu_domain->domain;
 }
 
@@ -1711,8 +1722,18 @@ static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
 
 static void arm_smmu_detach_dev(struct device *dev)
 {
+	unsigned long flags;
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_master_data *master = fwspec->iommu_priv;
+	struct arm_smmu_domain *smmu_domain = master->domain;
+
+	if (smmu_domain) {
+		spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+		list_del(&master->domain_head);
+		spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+
+		master->domain = NULL;
+	}
 
 	master->ste.assigned = false;
 	arm_smmu_install_ste_for_dev(fwspec);
@@ -1721,6 +1742,7 @@ static void arm_smmu_detach_dev(struct device *dev)
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret = 0;
+	unsigned long flags;
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -1757,6 +1779,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	}
 
 	ste->assigned = true;
+	master->domain = smmu_domain;
+
+	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+	list_add(&master->domain_head, &smmu_domain->devices);
+	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 
 	if (smmu_domain->stage == ARM_SMMU_DOMAIN_BYPASS) {
 		ste->s1_cfg = NULL;
@@ -1883,6 +1910,7 @@ static int arm_smmu_add_device(struct device *dev)
 			return -ENOMEM;
 
 		master->smmu = smmu;
+		master->dev = dev;
 		fwspec->iommu_priv = master;
 	}
 
-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 3/4] iommu/arm-smmu-v3: Add support for PCI ATS
  2019-03-20 17:36 ` Jean-Philippe Brucker
@ 2019-03-20 17:36   ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-20 17:36 UTC (permalink / raw)
  To: will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, okaya, joro, rjw, linux-acpi,
	iommu, hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel,
	lenb

PCIe devices can implement their own TLB, named Address Translation Cache
(ATC). Enable Address Translation Service (ATS) for devices that support
it and send them invalidation requests whenever we invalidate the IOTLBs.

ATC invalidation is allowed to take up to 90 seconds, according to the
PCIe spec, so it is possible to get a SMMU command queue timeout during
normal operations. However we expect implementations to complete
invalidation in reasonable time.

We only enable ATS for "trusted" devices, and currently rely on the
pci_dev->untrusted bit. For ATS we have to trust that:

(a) The device doesn't issue "translated" memory requests for addresses
    that weren't returned by the SMMU in a Translation Completion. In
    particular, if we give control of a device or device partition to a VM
    or userspace, software cannot program the device to access arbitrary
    "translated" addresses.

(b) The device follows permissions granted by the SMMU in a Translation
    Completion. If the device requested read+write permission and only
    got read, then it doesn't write.

(c) The device doesn't send Translated transactions for an address that
    was invalidated by an ATC invalidation.

Note that the PCIe specification explicitly requires all of these, so we
can assume that implementations will cleanly shield ATCs from software.

All ATS translated requests still go through the SMMU, to walk the stream
table and check that the device is actually allowed to send translated
requests.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
Since last version:
* Refresh
* Don't enable ATS for untrusted devices, or when pci.noats is on the
  kernel command line.
---
 drivers/iommu/arm-smmu-v3.c | 221 +++++++++++++++++++++++++++++++++++-
 1 file changed, 216 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 66a29c113dbc..c605d6f1b2df 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -29,6 +29,7 @@
 #include <linux/of_iommu.h>
 #include <linux/of_platform.h>
 #include <linux/pci.h>
+#include <linux/pci-ats.h>
 #include <linux/platform_device.h>
 
 #include <linux/amba/bus.h>
@@ -86,6 +87,7 @@
 #define IDR5_VAX_52_BIT			1
 
 #define ARM_SMMU_CR0			0x20
+#define CR0_ATSCHK			(1 << 4)
 #define CR0_CMDQEN			(1 << 3)
 #define CR0_EVTQEN			(1 << 2)
 #define CR0_PRIQEN			(1 << 1)
@@ -294,6 +296,7 @@
 #define CMDQ_ERR_CERROR_NONE_IDX	0
 #define CMDQ_ERR_CERROR_ILL_IDX		1
 #define CMDQ_ERR_CERROR_ABT_IDX		2
+#define CMDQ_ERR_CERROR_ATC_INV_IDX	3
 
 #define CMDQ_0_OP			GENMASK_ULL(7, 0)
 #define CMDQ_0_SSV			(1UL << 11)
@@ -312,6 +315,12 @@
 #define CMDQ_TLBI_1_VA_MASK		GENMASK_ULL(63, 12)
 #define CMDQ_TLBI_1_IPA_MASK		GENMASK_ULL(51, 12)
 
+#define CMDQ_ATC_0_SSID			GENMASK_ULL(31, 12)
+#define CMDQ_ATC_0_SID			GENMASK_ULL(63, 32)
+#define CMDQ_ATC_0_GLOBAL		(1UL << 9)
+#define CMDQ_ATC_1_SIZE			GENMASK_ULL(5, 0)
+#define CMDQ_ATC_1_ADDR_MASK		GENMASK_ULL(63, 12)
+
 #define CMDQ_PRI_0_SSID			GENMASK_ULL(31, 12)
 #define CMDQ_PRI_0_SID			GENMASK_ULL(63, 32)
 #define CMDQ_PRI_1_GRPID		GENMASK_ULL(8, 0)
@@ -365,6 +374,11 @@ module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
 	"Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
 
+static bool disable_ats_check;
+module_param_named(disable_ats_check, disable_ats_check, bool, S_IRUGO);
+MODULE_PARM_DESC(disable_ats_check,
+	"By default, the SMMU checks whether each incoming transaction marked as translated is allowed by the stream configuration. This option disables the check.");
+
 enum pri_resp {
 	PRI_RESP_DENY = 0,
 	PRI_RESP_FAIL = 1,
@@ -433,6 +447,16 @@ struct arm_smmu_cmdq_ent {
 			u64			addr;
 		} tlbi;
 
+		#define CMDQ_OP_ATC_INV		0x40
+		#define ATC_INV_SIZE_ALL	52
+		struct {
+			u32			sid;
+			u32			ssid;
+			u64			addr;
+			u8			size;
+			bool			global;
+		} atc;
+
 		#define CMDQ_OP_PRI_RESP	0x41
 		struct {
 			u32			sid;
@@ -516,6 +540,8 @@ struct arm_smmu_strtab_ent {
 	bool				assigned;
 	struct arm_smmu_s1_cfg		*s1_cfg;
 	struct arm_smmu_s2_cfg		*s2_cfg;
+
+	bool				ats_enabled;
 };
 
 struct arm_smmu_strtab_cfg {
@@ -828,6 +854,14 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 	case CMDQ_OP_TLBI_S12_VMALL:
 		cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid);
 		break;
+	case CMDQ_OP_ATC_INV:
+		cmd[0] |= FIELD_PREP(CMDQ_0_SSV, ent->substream_valid);
+		cmd[0] |= FIELD_PREP(CMDQ_ATC_0_GLOBAL, ent->atc.global);
+		cmd[0] |= FIELD_PREP(CMDQ_ATC_0_SSID, ent->atc.ssid);
+		cmd[0] |= FIELD_PREP(CMDQ_ATC_0_SID, ent->atc.sid);
+		cmd[1] |= FIELD_PREP(CMDQ_ATC_1_SIZE, ent->atc.size);
+		cmd[1] |= ent->atc.addr & CMDQ_ATC_1_ADDR_MASK;
+		break;
 	case CMDQ_OP_PRI_RESP:
 		cmd[0] |= FIELD_PREP(CMDQ_0_SSV, ent->substream_valid);
 		cmd[0] |= FIELD_PREP(CMDQ_PRI_0_SSID, ent->pri.ssid);
@@ -872,6 +906,7 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
 		[CMDQ_ERR_CERROR_NONE_IDX]	= "No error",
 		[CMDQ_ERR_CERROR_ILL_IDX]	= "Illegal command",
 		[CMDQ_ERR_CERROR_ABT_IDX]	= "Abort on command fetch",
+		[CMDQ_ERR_CERROR_ATC_INV_IDX]	= "ATC invalidate timeout",
 	};
 
 	int i;
@@ -891,6 +926,14 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
 		dev_err(smmu->dev, "retrying command fetch\n");
 	case CMDQ_ERR_CERROR_NONE_IDX:
 		return;
+	case CMDQ_ERR_CERROR_ATC_INV_IDX:
+		/*
+		 * ATC Invalidation Completion timeout. CONS is still pointing
+		 * at the CMD_SYNC. Attempt to complete other pending commands
+		 * by repeating the CMD_SYNC, though we might well end up back
+		 * here since the ATC invalidation may still be pending.
+		 */
+		return;
 	case CMDQ_ERR_CERROR_ILL_IDX:
 		/* Fallthrough */
 	default:
@@ -1166,15 +1209,16 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
 			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
 			 FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
 			 FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
-#ifdef CONFIG_PCI_ATS
-			 FIELD_PREP(STRTAB_STE_1_EATS, STRTAB_STE_1_EATS_TRANS) |
-#endif
 			 FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_NSEL1));
 
 		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
 		   !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE))
 			dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
 
+		if (ste->ats_enabled)
+			dst[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
+							 STRTAB_STE_1_EATS_TRANS));
+
 		val |= (ste->s1_cfg->cdptr_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
 			FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS);
 	}
@@ -1398,6 +1442,98 @@ static irqreturn_t arm_smmu_combined_irq_handler(int irq, void *dev)
 	return IRQ_WAKE_THREAD;
 }
 
+static void
+arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
+			struct arm_smmu_cmdq_ent *cmd)
+{
+	size_t log2_span;
+	size_t span_mask;
+	/* ATC invalidates are always on 4096 bytes pages */
+	size_t inval_grain_shift = 12;
+	unsigned long page_start, page_end;
+
+	*cmd = (struct arm_smmu_cmdq_ent) {
+		.opcode			= CMDQ_OP_ATC_INV,
+		.substream_valid	= !!ssid,
+		.atc.ssid		= ssid,
+	};
+
+	if (!size) {
+		cmd->atc.size = ATC_INV_SIZE_ALL;
+		return;
+	}
+
+	page_start	= iova >> inval_grain_shift;
+	page_end	= (iova + size - 1) >> inval_grain_shift;
+
+	/*
+	 * Find the smallest power of two that covers the range. Most
+	 * significant differing bit between start and end address indicates the
+	 * required span, ie. fls(start ^ end). For example:
+	 *
+	 * We want to invalidate pages [8; 11]. This is already the ideal range:
+	 *		x = 0b1000 ^ 0b1011 = 0b11
+	 *		span = 1 << fls(x) = 4
+	 *
+	 * To invalidate pages [7; 10], we need to invalidate [0; 15]:
+	 *		x = 0b0111 ^ 0b1010 = 0b1101
+	 *		span = 1 << fls(x) = 16
+	 */
+	log2_span	= fls_long(page_start ^ page_end);
+	span_mask	= (1ULL << log2_span) - 1;
+
+	page_start	&= ~span_mask;
+
+	cmd->atc.addr	= page_start << inval_grain_shift;
+	cmd->atc.size	= log2_span;
+}
+
+static void arm_smmu_atc_inv_master(struct arm_smmu_master_data *master,
+				    struct arm_smmu_cmdq_ent *cmd)
+{
+	int i;
+	struct iommu_fwspec *fwspec = master->dev->iommu_fwspec;
+
+	if (!master->ste.ats_enabled)
+		return;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		cmd->atc.sid = fwspec->ids[i];
+		arm_smmu_cmdq_issue_cmd(master->smmu, cmd);
+	}
+
+	arm_smmu_cmdq_issue_sync(master->smmu);
+
+	return;
+}
+
+static void arm_smmu_atc_inv_master_all(struct arm_smmu_master_data *master,
+				       int ssid)
+{
+	struct arm_smmu_cmdq_ent cmd;
+
+	arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd);
+	arm_smmu_atc_inv_master(master, &cmd);
+}
+
+static void arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+				    int ssid, unsigned long iova, size_t size)
+{
+	unsigned long flags;
+	struct arm_smmu_cmdq_ent cmd;
+	struct arm_smmu_master_data *master;
+
+	if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS))
+		return;
+
+	arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
+
+	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+	list_for_each_entry(master, &smmu_domain->devices, domain_head)
+		arm_smmu_atc_inv_master(master, &cmd);
+	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+}
+
 /* IO_PGTABLE API */
 static void arm_smmu_tlb_sync(void *cookie)
 {
@@ -1728,6 +1864,8 @@ static void arm_smmu_detach_dev(struct device *dev)
 	struct arm_smmu_domain *smmu_domain = master->domain;
 
 	if (smmu_domain) {
+		arm_smmu_atc_inv_master_all(master, 0);
+
 		spin_lock_irqsave(&smmu_domain->devices_lock, flags);
 		list_del(&master->domain_head);
 		spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
@@ -1817,12 +1955,18 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 static size_t
 arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
 {
-	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	int ret;
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
 
 	if (!ops)
 		return 0;
 
-	return ops->unmap(ops, iova, size);
+	ret = ops->unmap(ops, iova, size);
+	if (ret)
+		arm_smmu_atc_inv_domain(smmu_domain, 0, iova, size);
+
+	return ret;
 }
 
 static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
@@ -1881,6 +2025,58 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid)
 	return sid < limit;
 }
 
+static int arm_smmu_enable_ats(struct arm_smmu_master_data *master)
+{
+	size_t stu;
+	int ret, pos;
+	struct pci_dev *pdev;
+	struct arm_smmu_device *smmu = master->smmu;
+	struct iommu_fwspec *fwspec = master->dev->iommu_fwspec;
+
+	if (!(smmu->features & ARM_SMMU_FEAT_ATS) || !dev_is_pci(master->dev) ||
+	    (fwspec->flags & IOMMU_FWSPEC_PCI_NO_ATS) || pci_ats_disabled())
+		return -ENOSYS;
+
+	pdev = to_pci_dev(master->dev);
+	if (pdev->untrusted)
+		return -EPERM;
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
+	if (!pos)
+		return -ENOSYS;
+
+	/* Smallest Translation Unit: log2 of the smallest supported granule */
+	stu = __ffs(smmu->pgsize_bitmap);
+
+	ret = pci_enable_ats(pdev, stu);
+	if (ret) {
+		dev_err(&pdev->dev, "could not enable ATS: %d\n", ret);
+		return ret;
+	}
+
+	master->ste.ats_enabled = true;
+	dev_dbg(&pdev->dev, "enabled ATS (STU=%zu, QDEP=%d)\n", stu,
+		pci_ats_queue_depth(pdev));
+
+	return 0;
+}
+
+static void arm_smmu_disable_ats(struct arm_smmu_master_data *master)
+{
+	struct pci_dev *pdev;
+
+	if (!dev_is_pci(master->dev))
+		return;
+
+	pdev = to_pci_dev(master->dev);
+
+	if (!pdev->ats_enabled)
+		return;
+
+	pci_disable_ats(pdev);
+	master->ste.ats_enabled = false;
+}
+
 static struct iommu_ops arm_smmu_ops;
 
 static int arm_smmu_add_device(struct device *dev)
@@ -1929,10 +2125,14 @@ static int arm_smmu_add_device(struct device *dev)
 		}
 	}
 
+	arm_smmu_enable_ats(master);
+
 	group = iommu_group_get_for_dev(dev);
 	if (!IS_ERR(group)) {
 		iommu_group_put(group);
 		iommu_device_link(&smmu->iommu, dev);
+	} else {
+		arm_smmu_disable_ats(master);
 	}
 
 	return PTR_ERR_OR_ZERO(group);
@@ -1953,6 +2153,7 @@ static void arm_smmu_remove_device(struct device *dev)
 		arm_smmu_detach_dev(dev);
 	iommu_group_remove_device(dev);
 	iommu_device_unlink(&smmu->iommu, dev);
+	arm_smmu_disable_ats(master);
 	kfree(master);
 	iommu_fwspec_free(dev);
 }
@@ -2575,6 +2776,16 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
 		}
 	}
 
+	if (smmu->features & ARM_SMMU_FEAT_ATS && !disable_ats_check) {
+		enables |= CR0_ATSCHK;
+		ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
+					      ARM_SMMU_CR0ACK);
+		if (ret) {
+			dev_err(smmu->dev, "failed to enable ATS check\n");
+			return ret;
+		}
+	}
+
 	ret = arm_smmu_setup_irqs(smmu);
 	if (ret) {
 		dev_err(smmu->dev, "failed to setup irqs\n");
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 3/4] iommu/arm-smmu-v3: Add support for PCI ATS
@ 2019-03-20 17:36   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-20 17:36 UTC (permalink / raw)
  To: will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, okaya, joro, rjw, linux-acpi,
	iommu, hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel,
	lenb

PCIe devices can implement their own TLB, named Address Translation Cache
(ATC). Enable Address Translation Service (ATS) for devices that support
it and send them invalidation requests whenever we invalidate the IOTLBs.

ATC invalidation is allowed to take up to 90 seconds, according to the
PCIe spec, so it is possible to get a SMMU command queue timeout during
normal operations. However we expect implementations to complete
invalidation in reasonable time.

We only enable ATS for "trusted" devices, and currently rely on the
pci_dev->untrusted bit. For ATS we have to trust that:

(a) The device doesn't issue "translated" memory requests for addresses
    that weren't returned by the SMMU in a Translation Completion. In
    particular, if we give control of a device or device partition to a VM
    or userspace, software cannot program the device to access arbitrary
    "translated" addresses.

(b) The device follows permissions granted by the SMMU in a Translation
    Completion. If the device requested read+write permission and only
    got read, then it doesn't write.

(c) The device doesn't send Translated transactions for an address that
    was invalidated by an ATC invalidation.

Note that the PCIe specification explicitly requires all of these, so we
can assume that implementations will cleanly shield ATCs from software.

All ATS translated requests still go through the SMMU, to walk the stream
table and check that the device is actually allowed to send translated
requests.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
Since last version:
* Refresh
* Don't enable ATS for untrusted devices, or when pci.noats is on the
  kernel command line.
---
 drivers/iommu/arm-smmu-v3.c | 221 +++++++++++++++++++++++++++++++++++-
 1 file changed, 216 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 66a29c113dbc..c605d6f1b2df 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -29,6 +29,7 @@
 #include <linux/of_iommu.h>
 #include <linux/of_platform.h>
 #include <linux/pci.h>
+#include <linux/pci-ats.h>
 #include <linux/platform_device.h>
 
 #include <linux/amba/bus.h>
@@ -86,6 +87,7 @@
 #define IDR5_VAX_52_BIT			1
 
 #define ARM_SMMU_CR0			0x20
+#define CR0_ATSCHK			(1 << 4)
 #define CR0_CMDQEN			(1 << 3)
 #define CR0_EVTQEN			(1 << 2)
 #define CR0_PRIQEN			(1 << 1)
@@ -294,6 +296,7 @@
 #define CMDQ_ERR_CERROR_NONE_IDX	0
 #define CMDQ_ERR_CERROR_ILL_IDX		1
 #define CMDQ_ERR_CERROR_ABT_IDX		2
+#define CMDQ_ERR_CERROR_ATC_INV_IDX	3
 
 #define CMDQ_0_OP			GENMASK_ULL(7, 0)
 #define CMDQ_0_SSV			(1UL << 11)
@@ -312,6 +315,12 @@
 #define CMDQ_TLBI_1_VA_MASK		GENMASK_ULL(63, 12)
 #define CMDQ_TLBI_1_IPA_MASK		GENMASK_ULL(51, 12)
 
+#define CMDQ_ATC_0_SSID			GENMASK_ULL(31, 12)
+#define CMDQ_ATC_0_SID			GENMASK_ULL(63, 32)
+#define CMDQ_ATC_0_GLOBAL		(1UL << 9)
+#define CMDQ_ATC_1_SIZE			GENMASK_ULL(5, 0)
+#define CMDQ_ATC_1_ADDR_MASK		GENMASK_ULL(63, 12)
+
 #define CMDQ_PRI_0_SSID			GENMASK_ULL(31, 12)
 #define CMDQ_PRI_0_SID			GENMASK_ULL(63, 32)
 #define CMDQ_PRI_1_GRPID		GENMASK_ULL(8, 0)
@@ -365,6 +374,11 @@ module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
 	"Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
 
+static bool disable_ats_check;
+module_param_named(disable_ats_check, disable_ats_check, bool, S_IRUGO);
+MODULE_PARM_DESC(disable_ats_check,
+	"By default, the SMMU checks whether each incoming transaction marked as translated is allowed by the stream configuration. This option disables the check.");
+
 enum pri_resp {
 	PRI_RESP_DENY = 0,
 	PRI_RESP_FAIL = 1,
@@ -433,6 +447,16 @@ struct arm_smmu_cmdq_ent {
 			u64			addr;
 		} tlbi;
 
+		#define CMDQ_OP_ATC_INV		0x40
+		#define ATC_INV_SIZE_ALL	52
+		struct {
+			u32			sid;
+			u32			ssid;
+			u64			addr;
+			u8			size;
+			bool			global;
+		} atc;
+
 		#define CMDQ_OP_PRI_RESP	0x41
 		struct {
 			u32			sid;
@@ -516,6 +540,8 @@ struct arm_smmu_strtab_ent {
 	bool				assigned;
 	struct arm_smmu_s1_cfg		*s1_cfg;
 	struct arm_smmu_s2_cfg		*s2_cfg;
+
+	bool				ats_enabled;
 };
 
 struct arm_smmu_strtab_cfg {
@@ -828,6 +854,14 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 	case CMDQ_OP_TLBI_S12_VMALL:
 		cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid);
 		break;
+	case CMDQ_OP_ATC_INV:
+		cmd[0] |= FIELD_PREP(CMDQ_0_SSV, ent->substream_valid);
+		cmd[0] |= FIELD_PREP(CMDQ_ATC_0_GLOBAL, ent->atc.global);
+		cmd[0] |= FIELD_PREP(CMDQ_ATC_0_SSID, ent->atc.ssid);
+		cmd[0] |= FIELD_PREP(CMDQ_ATC_0_SID, ent->atc.sid);
+		cmd[1] |= FIELD_PREP(CMDQ_ATC_1_SIZE, ent->atc.size);
+		cmd[1] |= ent->atc.addr & CMDQ_ATC_1_ADDR_MASK;
+		break;
 	case CMDQ_OP_PRI_RESP:
 		cmd[0] |= FIELD_PREP(CMDQ_0_SSV, ent->substream_valid);
 		cmd[0] |= FIELD_PREP(CMDQ_PRI_0_SSID, ent->pri.ssid);
@@ -872,6 +906,7 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
 		[CMDQ_ERR_CERROR_NONE_IDX]	= "No error",
 		[CMDQ_ERR_CERROR_ILL_IDX]	= "Illegal command",
 		[CMDQ_ERR_CERROR_ABT_IDX]	= "Abort on command fetch",
+		[CMDQ_ERR_CERROR_ATC_INV_IDX]	= "ATC invalidate timeout",
 	};
 
 	int i;
@@ -891,6 +926,14 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
 		dev_err(smmu->dev, "retrying command fetch\n");
 	case CMDQ_ERR_CERROR_NONE_IDX:
 		return;
+	case CMDQ_ERR_CERROR_ATC_INV_IDX:
+		/*
+		 * ATC Invalidation Completion timeout. CONS is still pointing
+		 * at the CMD_SYNC. Attempt to complete other pending commands
+		 * by repeating the CMD_SYNC, though we might well end up back
+		 * here since the ATC invalidation may still be pending.
+		 */
+		return;
 	case CMDQ_ERR_CERROR_ILL_IDX:
 		/* Fallthrough */
 	default:
@@ -1166,15 +1209,16 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
 			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
 			 FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
 			 FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
-#ifdef CONFIG_PCI_ATS
-			 FIELD_PREP(STRTAB_STE_1_EATS, STRTAB_STE_1_EATS_TRANS) |
-#endif
 			 FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_NSEL1));
 
 		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
 		   !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE))
 			dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
 
+		if (ste->ats_enabled)
+			dst[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
+							 STRTAB_STE_1_EATS_TRANS));
+
 		val |= (ste->s1_cfg->cdptr_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
 			FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS);
 	}
@@ -1398,6 +1442,98 @@ static irqreturn_t arm_smmu_combined_irq_handler(int irq, void *dev)
 	return IRQ_WAKE_THREAD;
 }
 
+static void
+arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
+			struct arm_smmu_cmdq_ent *cmd)
+{
+	size_t log2_span;
+	size_t span_mask;
+	/* ATC invalidates are always on 4096 bytes pages */
+	size_t inval_grain_shift = 12;
+	unsigned long page_start, page_end;
+
+	*cmd = (struct arm_smmu_cmdq_ent) {
+		.opcode			= CMDQ_OP_ATC_INV,
+		.substream_valid	= !!ssid,
+		.atc.ssid		= ssid,
+	};
+
+	if (!size) {
+		cmd->atc.size = ATC_INV_SIZE_ALL;
+		return;
+	}
+
+	page_start	= iova >> inval_grain_shift;
+	page_end	= (iova + size - 1) >> inval_grain_shift;
+
+	/*
+	 * Find the smallest power of two that covers the range. Most
+	 * significant differing bit between start and end address indicates the
+	 * required span, ie. fls(start ^ end). For example:
+	 *
+	 * We want to invalidate pages [8; 11]. This is already the ideal range:
+	 *		x = 0b1000 ^ 0b1011 = 0b11
+	 *		span = 1 << fls(x) = 4
+	 *
+	 * To invalidate pages [7; 10], we need to invalidate [0; 15]:
+	 *		x = 0b0111 ^ 0b1010 = 0b1101
+	 *		span = 1 << fls(x) = 16
+	 */
+	log2_span	= fls_long(page_start ^ page_end);
+	span_mask	= (1ULL << log2_span) - 1;
+
+	page_start	&= ~span_mask;
+
+	cmd->atc.addr	= page_start << inval_grain_shift;
+	cmd->atc.size	= log2_span;
+}
+
+static void arm_smmu_atc_inv_master(struct arm_smmu_master_data *master,
+				    struct arm_smmu_cmdq_ent *cmd)
+{
+	int i;
+	struct iommu_fwspec *fwspec = master->dev->iommu_fwspec;
+
+	if (!master->ste.ats_enabled)
+		return;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		cmd->atc.sid = fwspec->ids[i];
+		arm_smmu_cmdq_issue_cmd(master->smmu, cmd);
+	}
+
+	arm_smmu_cmdq_issue_sync(master->smmu);
+
+	return;
+}
+
+static void arm_smmu_atc_inv_master_all(struct arm_smmu_master_data *master,
+				       int ssid)
+{
+	struct arm_smmu_cmdq_ent cmd;
+
+	arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd);
+	arm_smmu_atc_inv_master(master, &cmd);
+}
+
+static void arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+				    int ssid, unsigned long iova, size_t size)
+{
+	unsigned long flags;
+	struct arm_smmu_cmdq_ent cmd;
+	struct arm_smmu_master_data *master;
+
+	if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS))
+		return;
+
+	arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
+
+	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+	list_for_each_entry(master, &smmu_domain->devices, domain_head)
+		arm_smmu_atc_inv_master(master, &cmd);
+	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+}
+
 /* IO_PGTABLE API */
 static void arm_smmu_tlb_sync(void *cookie)
 {
@@ -1728,6 +1864,8 @@ static void arm_smmu_detach_dev(struct device *dev)
 	struct arm_smmu_domain *smmu_domain = master->domain;
 
 	if (smmu_domain) {
+		arm_smmu_atc_inv_master_all(master, 0);
+
 		spin_lock_irqsave(&smmu_domain->devices_lock, flags);
 		list_del(&master->domain_head);
 		spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
@@ -1817,12 +1955,18 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 static size_t
 arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
 {
-	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	int ret;
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
 
 	if (!ops)
 		return 0;
 
-	return ops->unmap(ops, iova, size);
+	ret = ops->unmap(ops, iova, size);
+	if (ret)
+		arm_smmu_atc_inv_domain(smmu_domain, 0, iova, size);
+
+	return ret;
 }
 
 static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
@@ -1881,6 +2025,58 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid)
 	return sid < limit;
 }
 
+static int arm_smmu_enable_ats(struct arm_smmu_master_data *master)
+{
+	size_t stu;
+	int ret, pos;
+	struct pci_dev *pdev;
+	struct arm_smmu_device *smmu = master->smmu;
+	struct iommu_fwspec *fwspec = master->dev->iommu_fwspec;
+
+	if (!(smmu->features & ARM_SMMU_FEAT_ATS) || !dev_is_pci(master->dev) ||
+	    (fwspec->flags & IOMMU_FWSPEC_PCI_NO_ATS) || pci_ats_disabled())
+		return -ENOSYS;
+
+	pdev = to_pci_dev(master->dev);
+	if (pdev->untrusted)
+		return -EPERM;
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
+	if (!pos)
+		return -ENOSYS;
+
+	/* Smallest Translation Unit: log2 of the smallest supported granule */
+	stu = __ffs(smmu->pgsize_bitmap);
+
+	ret = pci_enable_ats(pdev, stu);
+	if (ret) {
+		dev_err(&pdev->dev, "could not enable ATS: %d\n", ret);
+		return ret;
+	}
+
+	master->ste.ats_enabled = true;
+	dev_dbg(&pdev->dev, "enabled ATS (STU=%zu, QDEP=%d)\n", stu,
+		pci_ats_queue_depth(pdev));
+
+	return 0;
+}
+
+static void arm_smmu_disable_ats(struct arm_smmu_master_data *master)
+{
+	struct pci_dev *pdev;
+
+	if (!dev_is_pci(master->dev))
+		return;
+
+	pdev = to_pci_dev(master->dev);
+
+	if (!pdev->ats_enabled)
+		return;
+
+	pci_disable_ats(pdev);
+	master->ste.ats_enabled = false;
+}
+
 static struct iommu_ops arm_smmu_ops;
 
 static int arm_smmu_add_device(struct device *dev)
@@ -1929,10 +2125,14 @@ static int arm_smmu_add_device(struct device *dev)
 		}
 	}
 
+	arm_smmu_enable_ats(master);
+
 	group = iommu_group_get_for_dev(dev);
 	if (!IS_ERR(group)) {
 		iommu_group_put(group);
 		iommu_device_link(&smmu->iommu, dev);
+	} else {
+		arm_smmu_disable_ats(master);
 	}
 
 	return PTR_ERR_OR_ZERO(group);
@@ -1953,6 +2153,7 @@ static void arm_smmu_remove_device(struct device *dev)
 		arm_smmu_detach_dev(dev);
 	iommu_group_remove_device(dev);
 	iommu_device_unlink(&smmu->iommu, dev);
+	arm_smmu_disable_ats(master);
 	kfree(master);
 	iommu_fwspec_free(dev);
 }
@@ -2575,6 +2776,16 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
 		}
 	}
 
+	if (smmu->features & ARM_SMMU_FEAT_ATS && !disable_ats_check) {
+		enables |= CR0_ATSCHK;
+		ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
+					      ARM_SMMU_CR0ACK);
+		if (ret) {
+			dev_err(smmu->dev, "failed to enable ATS check\n");
+			return ret;
+		}
+	}
+
 	ret = arm_smmu_setup_irqs(smmu);
 	if (ret) {
 		dev_err(smmu->dev, "failed to setup irqs\n");
-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 4/4] iommu/arm-smmu-v3: Disable tagged pointers
  2019-03-20 17:36 ` Jean-Philippe Brucker
@ 2019-03-20 17:36   ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-20 17:36 UTC (permalink / raw)
  To: will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, okaya, joro, rjw, linux-acpi,
	iommu, hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel,
	lenb

The ARM architecture has a "Top Byte Ignore" (TBI) option that makes the
MMU mask out bits [63:56] of an address, allowing a userspace application
to store data in its pointers. This option is incompatible with PCI ATS.

If TBI is enabled in the SMMU and userspace triggers DMA transactions on
tagged pointers, the endpoint might create ATC entries for addresses that
include a tag. Software would then have to send ATC invalidation packets
for each 255 possible alias of an address, or just wipe the whole address
space. This is not a viable option, so disable TBI.

The impact of this change is unclear, since there are very few users of
tagged pointers, much less SVA. But the requirement introduced by this
patch doesn't seem excessive: a userspace application using both tagged
pointers and SVA should now sanitize addresses (clear the tag) before
using them for device DMA.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 drivers/iommu/arm-smmu-v3.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index c605d6f1b2df..69afffdaf907 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1076,7 +1076,6 @@ static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr)
 	val |= ARM_SMMU_TCR2CD(tcr, EPD0);
 	val |= ARM_SMMU_TCR2CD(tcr, EPD1);
 	val |= ARM_SMMU_TCR2CD(tcr, IPS);
-	val |= ARM_SMMU_TCR2CD(tcr, TBI0);
 
 	return val;
 }
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 4/4] iommu/arm-smmu-v3: Disable tagged pointers
@ 2019-03-20 17:36   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-20 17:36 UTC (permalink / raw)
  To: will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, okaya, joro, rjw, linux-acpi,
	iommu, hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel,
	lenb

The ARM architecture has a "Top Byte Ignore" (TBI) option that makes the
MMU mask out bits [63:56] of an address, allowing a userspace application
to store data in its pointers. This option is incompatible with PCI ATS.

If TBI is enabled in the SMMU and userspace triggers DMA transactions on
tagged pointers, the endpoint might create ATC entries for addresses that
include a tag. Software would then have to send ATC invalidation packets
for each 255 possible alias of an address, or just wipe the whole address
space. This is not a viable option, so disable TBI.

The impact of this change is unclear, since there are very few users of
tagged pointers, much less SVA. But the requirement introduced by this
patch doesn't seem excessive: a userspace application using both tagged
pointers and SVA should now sanitize addresses (clear the tag) before
using them for device DMA.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 drivers/iommu/arm-smmu-v3.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index c605d6f1b2df..69afffdaf907 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1076,7 +1076,6 @@ static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr)
 	val |= ARM_SMMU_TCR2CD(tcr, EPD0);
 	val |= ARM_SMMU_TCR2CD(tcr, EPD1);
 	val |= ARM_SMMU_TCR2CD(tcr, IPS);
-	val |= ARM_SMMU_TCR2CD(tcr, TBI0);
 
 	return val;
 }
-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] iommu/arm-smmu-v3: Add support for PCI ATS
  2019-03-20 17:36   ` Jean-Philippe Brucker
@ 2019-03-21 15:52       ` Sinan Kaya
  -1 siblings, 0 replies; 32+ messages in thread
From: Sinan Kaya @ 2019-03-21 15:52 UTC (permalink / raw)
  To: Jean-Philippe Brucker, will.deacon-5wv7dgnIgG8
  Cc: zhongmiao-C8/M+/jPZTeaMJb+Lgu22Q, rjw-LthD3rsA81gm4RdzfppkhA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	sudeep.holla-5wv7dgnIgG8, robin.murphy-5wv7dgnIgG8,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	lenb-DgEjT+Ai2ygdnm+yROfE0A

On 3/20/2019 1:36 PM, Jean-Philippe Brucker wrote:
> pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
> +	if (!pos)
> +		return -ENOSYS;
> +

You don't need this. pci_enable_ats() validates this via.

	if (!dev->ats_cap)
		return -EINVAL;

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] iommu/arm-smmu-v3: Add support for PCI ATS
@ 2019-03-21 15:52       ` Sinan Kaya
  0 siblings, 0 replies; 32+ messages in thread
From: Sinan Kaya @ 2019-03-21 15:52 UTC (permalink / raw)
  To: Jean-Philippe Brucker, will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, joro, rjw, linux-acpi, iommu,
	hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel, lenb

On 3/20/2019 1:36 PM, Jean-Philippe Brucker wrote:
> pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
> +	if (!pos)
> +		return -ENOSYS;
> +

You don't need this. pci_enable_ats() validates this via.

	if (!dev->ats_cap)
		return -EINVAL;



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/4] ACPI/IORT: Check ATS capability in root complex nodes
  2019-03-20 17:36   ` Jean-Philippe Brucker
@ 2019-03-21 16:00       ` Sinan Kaya
  -1 siblings, 0 replies; 32+ messages in thread
From: Sinan Kaya @ 2019-03-21 16:00 UTC (permalink / raw)
  To: Jean-Philippe Brucker, will.deacon-5wv7dgnIgG8
  Cc: zhongmiao-C8/M+/jPZTeaMJb+Lgu22Q, rjw-LthD3rsA81gm4RdzfppkhA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	sudeep.holla-5wv7dgnIgG8, robin.murphy-5wv7dgnIgG8,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	lenb-DgEjT+Ai2ygdnm+yROfE0A

On 3/20/2019 1:36 PM, Jean-Philippe Brucker wrote:
>   		err = pci_for_each_dma_alias(to_pci_dev(dev),
>   					     iort_pci_iommu_init, &info);
> +
> +		if (!err && !iort_pci_rc_supports_ats(node))
> +			dev->iommu_fwspec->flags |= IOMMU_FWSPEC_PCI_NO_ATS;

Is it possible to remove !err check here. ATS supported is just a flag
in IORT table. Does this decision have to rely on having a correct dma
alias?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/4] ACPI/IORT: Check ATS capability in root complex nodes
@ 2019-03-21 16:00       ` Sinan Kaya
  0 siblings, 0 replies; 32+ messages in thread
From: Sinan Kaya @ 2019-03-21 16:00 UTC (permalink / raw)
  To: Jean-Philippe Brucker, will.deacon
  Cc: lorenzo.pieralisi, zhongmiao, joro, rjw, linux-acpi, iommu,
	hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel, lenb

On 3/20/2019 1:36 PM, Jean-Philippe Brucker wrote:
>   		err = pci_for_each_dma_alias(to_pci_dev(dev),
>   					     iort_pci_iommu_init, &info);
> +
> +		if (!err && !iort_pci_rc_supports_ats(node))
> +			dev->iommu_fwspec->flags |= IOMMU_FWSPEC_PCI_NO_ATS;

Is it possible to remove !err check here. ATS supported is just a flag
in IORT table. Does this decision have to rely on having a correct dma
alias?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] iommu/arm-smmu-v3: Add support for PCI ATS
  2019-03-21 15:52       ` Sinan Kaya
@ 2019-03-25 15:01           ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-25 15:01 UTC (permalink / raw)
  To: Sinan Kaya, will.deacon-5wv7dgnIgG8
  Cc: zhongmiao-C8/M+/jPZTeaMJb+Lgu22Q, rjw-LthD3rsA81gm4RdzfppkhA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	sudeep.holla-5wv7dgnIgG8, robin.murphy-5wv7dgnIgG8,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	lenb-DgEjT+Ai2ygdnm+yROfE0A

On 21/03/2019 15:52, Sinan Kaya wrote:
> On 3/20/2019 1:36 PM, Jean-Philippe Brucker wrote:
>> pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
>> +    if (!pos)
>> +        return -ENOSYS;
>> +
> 
> You don't need this. pci_enable_ats() validates this via.
> 
>     if (!dev->ats_cap)
>         return -EINVAL;

Right, I think I wanted to differentiate lack of ATS capability from
actual error in pci_enable_ats(), in which case we'd issue a dev_err().
But the only other errors in pci_enable_ats() are related to invalid STU
parameter, which are valid in our case, so the dev_err() is pretty
useless. I'll remove this.

Thanks,
Jean
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/4] iommu/arm-smmu-v3: Add support for PCI ATS
@ 2019-03-25 15:01           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-25 15:01 UTC (permalink / raw)
  To: Sinan Kaya, will.deacon
  Cc: zhongmiao, rjw, linux-acpi, iommu, sudeep.holla, robin.murphy,
	linux-arm-kernel, lenb

On 21/03/2019 15:52, Sinan Kaya wrote:
> On 3/20/2019 1:36 PM, Jean-Philippe Brucker wrote:
>> pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
>> +    if (!pos)
>> +        return -ENOSYS;
>> +
> 
> You don't need this. pci_enable_ats() validates this via.
> 
>     if (!dev->ats_cap)
>         return -EINVAL;

Right, I think I wanted to differentiate lack of ATS capability from
actual error in pci_enable_ats(), in which case we'd issue a dev_err().
But the only other errors in pci_enable_ats() are related to invalid STU
parameter, which are valid in our case, so the dev_err() is pretty
useless. I'll remove this.

Thanks,
Jean

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/4] ACPI/IORT: Check ATS capability in root complex nodes
  2019-03-21 16:00       ` Sinan Kaya
@ 2019-03-25 15:02           ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-25 15:02 UTC (permalink / raw)
  To: Sinan Kaya, will.deacon-5wv7dgnIgG8
  Cc: zhongmiao-C8/M+/jPZTeaMJb+Lgu22Q, rjw-LthD3rsA81gm4RdzfppkhA,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	sudeep.holla-5wv7dgnIgG8, robin.murphy-5wv7dgnIgG8,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	lenb-DgEjT+Ai2ygdnm+yROfE0A

On 21/03/2019 16:00, Sinan Kaya wrote:
> On 3/20/2019 1:36 PM, Jean-Philippe Brucker wrote:
>>           err = pci_for_each_dma_alias(to_pci_dev(dev),
>>                            iort_pci_iommu_init, &info);
>> +
>> +        if (!err && !iort_pci_rc_supports_ats(node))
>> +            dev->iommu_fwspec->flags |= IOMMU_FWSPEC_PCI_NO_ATS;
> 
> Is it possible to remove !err check here. ATS supported is just a flag
> in IORT table. Does this decision have to rely on having a correct dma
> alias?

iort_pci_iommu_init() allocates and initializes dev->iommu_fwspec, so
checking !err ensures that dev->iommu_fwspec->flags is valid.

Thanks,
Jean
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/4] ACPI/IORT: Check ATS capability in root complex nodes
@ 2019-03-25 15:02           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-25 15:02 UTC (permalink / raw)
  To: Sinan Kaya, will.deacon
  Cc: zhongmiao, rjw, linux-acpi, iommu, sudeep.holla, robin.murphy,
	linux-arm-kernel, lenb

On 21/03/2019 16:00, Sinan Kaya wrote:
> On 3/20/2019 1:36 PM, Jean-Philippe Brucker wrote:
>>           err = pci_for_each_dma_alias(to_pci_dev(dev),
>>                            iort_pci_iommu_init, &info);
>> +
>> +        if (!err && !iort_pci_rc_supports_ats(node))
>> +            dev->iommu_fwspec->flags |= IOMMU_FWSPEC_PCI_NO_ATS;
> 
> Is it possible to remove !err check here. ATS supported is just a flag
> in IORT table. Does this decision have to rely on having a correct dma
> alias?

iort_pci_iommu_init() allocates and initializes dev->iommu_fwspec, so
checking !err ensures that dev->iommu_fwspec->flags is valid.

Thanks,
Jean

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
  2019-03-20 17:36   ` Jean-Philippe Brucker
@ 2019-04-04 14:39       ` Will Deacon
  -1 siblings, 0 replies; 32+ messages in thread
From: Will Deacon @ 2019-04-04 14:39 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: zhongmiao-C8/M+/jPZTeaMJb+Lgu22Q, okaya-DgEjT+Ai2ygdnm+yROfE0A,
	rjw-LthD3rsA81gm4RdzfppkhA, linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	sudeep.holla-5wv7dgnIgG8, robin.murphy-5wv7dgnIgG8,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	lenb-DgEjT+Ai2ygdnm+yROfE0A

Hi Jean-Philippe,

First off, thanks for posting this: it's definitely something that I'm keen
to support, and getting bits in a piece at a time is probably a good idea.

On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote:
> When removing a mapping from a domain, we need to send an invalidation to
> all devices that might have stored it in their Address Translation Cache
> (ATC). In addition when updating the context descriptor of a live domain,
> we'll need to send invalidations for all devices attached to it.
> 
> Maintain a list of devices in each domain, protected by a spinlock. It is
> updated every time we attach or detach devices to and from domains.
> 
> It needs to be a spinlock because we'll invalidate ATC entries from
> within hardirq-safe contexts, but it may be possible to relax the read
> side with RCU later.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index d3880010c6cf..66a29c113dbc 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -594,6 +594,11 @@ struct arm_smmu_device {
>  struct arm_smmu_master_data {
>  	struct arm_smmu_device		*smmu;
>  	struct arm_smmu_strtab_ent	ste;
> +
> +	struct arm_smmu_domain		*domain;
> +	struct list_head		domain_head;
> +
> +	struct device			*dev;
>  };
>  
>  /* SMMU private data for an IOMMU domain */
> @@ -618,6 +623,9 @@ struct arm_smmu_domain {
>  	};
>  
>  	struct iommu_domain		domain;
> +
> +	struct list_head		devices;
> +	spinlock_t			devices_lock;
>  };
>  
>  struct arm_smmu_option_prop {
> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
>  	}
>  
>  	mutex_init(&smmu_domain->init_mutex);
> +	INIT_LIST_HEAD(&smmu_domain->devices);
> +	spin_lock_init(&smmu_domain->devices_lock);

I'm wondering whether we can't take this a bit further and re-organise the
data structures to make this a little simpler overall. Something along the
lines of:

	struct arm_smmu_master_data {
		struct list_head		list; // masters in the same domain
		struct arm_smmu_device		*smmu;
		unsigned int			num_sids;
		u32				*sids; // Points into fwspec
		struct arm_smmu_domain		*domain; // NULL -> !assigned
	};

and then just add a list_head into struct arm_smmu_domain to track the
masters that have been attached (if you're feeling brave, you could put
this into the s1_cfg).

The ATC invalidation logic would then be:

  - Detaching a device: walk over the sids from the master data
  - Unmapping a range from a domain: walk over the attached masters

I think this would also allow us to remove struct arm_smmu_strtab_ent
completely.

Dunno: this is one of the those things where you really have to try it
to figure out why it doesn't work...

Will

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-04 14:39       ` Will Deacon
  0 siblings, 0 replies; 32+ messages in thread
From: Will Deacon @ 2019-04-04 14:39 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: lorenzo.pieralisi, zhongmiao, okaya, joro, rjw, linux-acpi,
	iommu, hanjun.guo, sudeep.holla, robin.murphy, linux-arm-kernel,
	lenb

Hi Jean-Philippe,

First off, thanks for posting this: it's definitely something that I'm keen
to support, and getting bits in a piece at a time is probably a good idea.

On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote:
> When removing a mapping from a domain, we need to send an invalidation to
> all devices that might have stored it in their Address Translation Cache
> (ATC). In addition when updating the context descriptor of a live domain,
> we'll need to send invalidations for all devices attached to it.
> 
> Maintain a list of devices in each domain, protected by a spinlock. It is
> updated every time we attach or detach devices to and from domains.
> 
> It needs to be a spinlock because we'll invalidate ATC entries from
> within hardirq-safe contexts, but it may be possible to relax the read
> side with RCU later.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> ---
>  drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index d3880010c6cf..66a29c113dbc 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -594,6 +594,11 @@ struct arm_smmu_device {
>  struct arm_smmu_master_data {
>  	struct arm_smmu_device		*smmu;
>  	struct arm_smmu_strtab_ent	ste;
> +
> +	struct arm_smmu_domain		*domain;
> +	struct list_head		domain_head;
> +
> +	struct device			*dev;
>  };
>  
>  /* SMMU private data for an IOMMU domain */
> @@ -618,6 +623,9 @@ struct arm_smmu_domain {
>  	};
>  
>  	struct iommu_domain		domain;
> +
> +	struct list_head		devices;
> +	spinlock_t			devices_lock;
>  };
>  
>  struct arm_smmu_option_prop {
> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
>  	}
>  
>  	mutex_init(&smmu_domain->init_mutex);
> +	INIT_LIST_HEAD(&smmu_domain->devices);
> +	spin_lock_init(&smmu_domain->devices_lock);

I'm wondering whether we can't take this a bit further and re-organise the
data structures to make this a little simpler overall. Something along the
lines of:

	struct arm_smmu_master_data {
		struct list_head		list; // masters in the same domain
		struct arm_smmu_device		*smmu;
		unsigned int			num_sids;
		u32				*sids; // Points into fwspec
		struct arm_smmu_domain		*domain; // NULL -> !assigned
	};

and then just add a list_head into struct arm_smmu_domain to track the
masters that have been attached (if you're feeling brave, you could put
this into the s1_cfg).

The ATC invalidation logic would then be:

  - Detaching a device: walk over the sids from the master data
  - Unmapping a range from a domain: walk over the attached masters

I think this would also allow us to remove struct arm_smmu_strtab_ent
completely.

Dunno: this is one of the those things where you really have to try it
to figure out why it doesn't work...

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 16:35         ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-04-05 16:35 UTC (permalink / raw)
  To: Will Deacon
  Cc: eric.auger, zhongmiao, okaya, rjw, linux-acpi, iommu,
	sudeep.holla, robin.murphy, linux-arm-kernel, lenb

On 04/04/2019 15:39, Will Deacon wrote:
> Hi Jean-Philippe,
> 
> First off, thanks for posting this: it's definitely something that I'm keen
> to support, and getting bits in a piece at a time is probably a good idea.
> 
> On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote:
>> When removing a mapping from a domain, we need to send an invalidation to
>> all devices that might have stored it in their Address Translation Cache
>> (ATC). In addition when updating the context descriptor of a live domain,
>> we'll need to send invalidations for all devices attached to it.
>>
>> Maintain a list of devices in each domain, protected by a spinlock. It is
>> updated every time we attach or detach devices to and from domains.
>>
>> It needs to be a spinlock because we'll invalidate ATC entries from
>> within hardirq-safe contexts, but it may be possible to relax the read
>> side with RCU later.
>>
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index d3880010c6cf..66a29c113dbc 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -594,6 +594,11 @@ struct arm_smmu_device {
>>  struct arm_smmu_master_data {
>>  	struct arm_smmu_device		*smmu;
>>  	struct arm_smmu_strtab_ent	ste;
>> +
>> +	struct arm_smmu_domain		*domain;
>> +	struct list_head		domain_head;
>> +
>> +	struct device			*dev;
>>  };
>>  
>>  /* SMMU private data for an IOMMU domain */
>> @@ -618,6 +623,9 @@ struct arm_smmu_domain {
>>  	};
>>  
>>  	struct iommu_domain		domain;
>> +
>> +	struct list_head		devices;
>> +	spinlock_t			devices_lock;
>>  };
>>  
>>  struct arm_smmu_option_prop {
>> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
>>  	}
>>  
>>  	mutex_init(&smmu_domain->init_mutex);
>> +	INIT_LIST_HEAD(&smmu_domain->devices);
>> +	spin_lock_init(&smmu_domain->devices_lock);
> 
> I'm wondering whether we can't take this a bit further and re-organise the
> data structures to make this a little simpler overall. Something along the
> lines of:
> 
> 	struct arm_smmu_master_data {
> 		struct list_head		list; // masters in the same domain
> 		struct arm_smmu_device		*smmu;
> 		unsigned int			num_sids;
> 		u32				*sids; // Points into fwspec
> 		struct arm_smmu_domain		*domain; // NULL -> !assigned
> 	};
> 
> and then just add a list_head into struct arm_smmu_domain to track the
> masters that have been attached (if you're feeling brave, you could put
> this into the s1_cfg).

I'm not sure about that last bit, shouldn't the list of masters apply to
both s1 and s2?

> 
> The ATC invalidation logic would then be:
> 
>   - Detaching a device: walk over the sids from the master data
>   - Unmapping a range from a domain: walk over the attached masters
> 
> I think this would also allow us to remove struct arm_smmu_strtab_ent
> completely.

Makes sense, it does work and simplifies the structures. It makes the
PASID and PRI patches slightly nicer as well. I'll resend once my tests
complete.

Thanks,
Jean

> 
> Dunno: this is one of the those things where you really have to try it
> to figure out why it doesn't work...
> 
> Will
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 16:35         ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-04-05 16:35 UTC (permalink / raw)
  To: Will Deacon
  Cc: zhongmiao, okaya, rjw, linux-acpi, iommu, sudeep.holla,
	robin.murphy, linux-arm-kernel, lenb, eric.auger

On 04/04/2019 15:39, Will Deacon wrote:
> Hi Jean-Philippe,
> 
> First off, thanks for posting this: it's definitely something that I'm keen
> to support, and getting bits in a piece at a time is probably a good idea.
> 
> On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote:
>> When removing a mapping from a domain, we need to send an invalidation to
>> all devices that might have stored it in their Address Translation Cache
>> (ATC). In addition when updating the context descriptor of a live domain,
>> we'll need to send invalidations for all devices attached to it.
>>
>> Maintain a list of devices in each domain, protected by a spinlock. It is
>> updated every time we attach or detach devices to and from domains.
>>
>> It needs to be a spinlock because we'll invalidate ATC entries from
>> within hardirq-safe contexts, but it may be possible to relax the read
>> side with RCU later.
>>
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index d3880010c6cf..66a29c113dbc 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -594,6 +594,11 @@ struct arm_smmu_device {
>>  struct arm_smmu_master_data {
>>  	struct arm_smmu_device		*smmu;
>>  	struct arm_smmu_strtab_ent	ste;
>> +
>> +	struct arm_smmu_domain		*domain;
>> +	struct list_head		domain_head;
>> +
>> +	struct device			*dev;
>>  };
>>  
>>  /* SMMU private data for an IOMMU domain */
>> @@ -618,6 +623,9 @@ struct arm_smmu_domain {
>>  	};
>>  
>>  	struct iommu_domain		domain;
>> +
>> +	struct list_head		devices;
>> +	spinlock_t			devices_lock;
>>  };
>>  
>>  struct arm_smmu_option_prop {
>> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
>>  	}
>>  
>>  	mutex_init(&smmu_domain->init_mutex);
>> +	INIT_LIST_HEAD(&smmu_domain->devices);
>> +	spin_lock_init(&smmu_domain->devices_lock);
> 
> I'm wondering whether we can't take this a bit further and re-organise the
> data structures to make this a little simpler overall. Something along the
> lines of:
> 
> 	struct arm_smmu_master_data {
> 		struct list_head		list; // masters in the same domain
> 		struct arm_smmu_device		*smmu;
> 		unsigned int			num_sids;
> 		u32				*sids; // Points into fwspec
> 		struct arm_smmu_domain		*domain; // NULL -> !assigned
> 	};
> 
> and then just add a list_head into struct arm_smmu_domain to track the
> masters that have been attached (if you're feeling brave, you could put
> this into the s1_cfg).

I'm not sure about that last bit, shouldn't the list of masters apply to
both s1 and s2?

> 
> The ATC invalidation logic would then be:
> 
>   - Detaching a device: walk over the sids from the master data
>   - Unmapping a range from a domain: walk over the attached masters
> 
> I think this would also allow us to remove struct arm_smmu_strtab_ent
> completely.

Makes sense, it does work and simplifies the structures. It makes the
PASID and PRI patches slightly nicer as well. I'll resend once my tests
complete.

Thanks,
Jean

> 
> Dunno: this is one of the those things where you really have to try it
> to figure out why it doesn't work...
> 
> Will
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 16:35         ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-04-05 16:35 UTC (permalink / raw)
  To: Will Deacon
  Cc: zhongmiao, okaya, rjw, linux-acpi, iommu, sudeep.holla,
	robin.murphy, linux-arm-kernel, lenb

On 04/04/2019 15:39, Will Deacon wrote:
> Hi Jean-Philippe,
> 
> First off, thanks for posting this: it's definitely something that I'm keen
> to support, and getting bits in a piece at a time is probably a good idea.
> 
> On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote:
>> When removing a mapping from a domain, we need to send an invalidation to
>> all devices that might have stored it in their Address Translation Cache
>> (ATC). In addition when updating the context descriptor of a live domain,
>> we'll need to send invalidations for all devices attached to it.
>>
>> Maintain a list of devices in each domain, protected by a spinlock. It is
>> updated every time we attach or detach devices to and from domains.
>>
>> It needs to be a spinlock because we'll invalidate ATC entries from
>> within hardirq-safe contexts, but it may be possible to relax the read
>> side with RCU later.
>>
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index d3880010c6cf..66a29c113dbc 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -594,6 +594,11 @@ struct arm_smmu_device {
>>  struct arm_smmu_master_data {
>>  	struct arm_smmu_device		*smmu;
>>  	struct arm_smmu_strtab_ent	ste;
>> +
>> +	struct arm_smmu_domain		*domain;
>> +	struct list_head		domain_head;
>> +
>> +	struct device			*dev;
>>  };
>>  
>>  /* SMMU private data for an IOMMU domain */
>> @@ -618,6 +623,9 @@ struct arm_smmu_domain {
>>  	};
>>  
>>  	struct iommu_domain		domain;
>> +
>> +	struct list_head		devices;
>> +	spinlock_t			devices_lock;
>>  };
>>  
>>  struct arm_smmu_option_prop {
>> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
>>  	}
>>  
>>  	mutex_init(&smmu_domain->init_mutex);
>> +	INIT_LIST_HEAD(&smmu_domain->devices);
>> +	spin_lock_init(&smmu_domain->devices_lock);
> 
> I'm wondering whether we can't take this a bit further and re-organise the
> data structures to make this a little simpler overall. Something along the
> lines of:
> 
> 	struct arm_smmu_master_data {
> 		struct list_head		list; // masters in the same domain
> 		struct arm_smmu_device		*smmu;
> 		unsigned int			num_sids;
> 		u32				*sids; // Points into fwspec
> 		struct arm_smmu_domain		*domain; // NULL -> !assigned
> 	};
> 
> and then just add a list_head into struct arm_smmu_domain to track the
> masters that have been attached (if you're feeling brave, you could put
> this into the s1_cfg).

I'm not sure about that last bit, shouldn't the list of masters apply to
both s1 and s2?

> 
> The ATC invalidation logic would then be:
> 
>   - Detaching a device: walk over the sids from the master data
>   - Unmapping a range from a domain: walk over the attached masters
> 
> I think this would also allow us to remove struct arm_smmu_strtab_ent
> completely.

Makes sense, it does work and simplifies the structures. It makes the
PASID and PRI patches slightly nicer as well. I'll resend once my tests
complete.

Thanks,
Jean

> 
> Dunno: this is one of the those things where you really have to try it
> to figure out why it doesn't work...
> 
> Will
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 16:35         ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-04-05 16:35 UTC (permalink / raw)
  To: Will Deacon
  Cc: eric.auger, zhongmiao, okaya, rjw, linux-acpi, iommu,
	sudeep.holla, robin.murphy, linux-arm-kernel, lenb

On 04/04/2019 15:39, Will Deacon wrote:
> Hi Jean-Philippe,
> 
> First off, thanks for posting this: it's definitely something that I'm keen
> to support, and getting bits in a piece at a time is probably a good idea.
> 
> On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote:
>> When removing a mapping from a domain, we need to send an invalidation to
>> all devices that might have stored it in their Address Translation Cache
>> (ATC). In addition when updating the context descriptor of a live domain,
>> we'll need to send invalidations for all devices attached to it.
>>
>> Maintain a list of devices in each domain, protected by a spinlock. It is
>> updated every time we attach or detach devices to and from domains.
>>
>> It needs to be a spinlock because we'll invalidate ATC entries from
>> within hardirq-safe contexts, but it may be possible to relax the read
>> side with RCU later.
>>
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index d3880010c6cf..66a29c113dbc 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -594,6 +594,11 @@ struct arm_smmu_device {
>>  struct arm_smmu_master_data {
>>  	struct arm_smmu_device		*smmu;
>>  	struct arm_smmu_strtab_ent	ste;
>> +
>> +	struct arm_smmu_domain		*domain;
>> +	struct list_head		domain_head;
>> +
>> +	struct device			*dev;
>>  };
>>  
>>  /* SMMU private data for an IOMMU domain */
>> @@ -618,6 +623,9 @@ struct arm_smmu_domain {
>>  	};
>>  
>>  	struct iommu_domain		domain;
>> +
>> +	struct list_head		devices;
>> +	spinlock_t			devices_lock;
>>  };
>>  
>>  struct arm_smmu_option_prop {
>> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
>>  	}
>>  
>>  	mutex_init(&smmu_domain->init_mutex);
>> +	INIT_LIST_HEAD(&smmu_domain->devices);
>> +	spin_lock_init(&smmu_domain->devices_lock);
> 
> I'm wondering whether we can't take this a bit further and re-organise the
> data structures to make this a little simpler overall. Something along the
> lines of:
> 
> 	struct arm_smmu_master_data {
> 		struct list_head		list; // masters in the same domain
> 		struct arm_smmu_device		*smmu;
> 		unsigned int			num_sids;
> 		u32				*sids; // Points into fwspec
> 		struct arm_smmu_domain		*domain; // NULL -> !assigned
> 	};
> 
> and then just add a list_head into struct arm_smmu_domain to track the
> masters that have been attached (if you're feeling brave, you could put
> this into the s1_cfg).

I'm not sure about that last bit, shouldn't the list of masters apply to
both s1 and s2?

> 
> The ATC invalidation logic would then be:
> 
>   - Detaching a device: walk over the sids from the master data
>   - Unmapping a range from a domain: walk over the attached masters
> 
> I think this would also allow us to remove struct arm_smmu_strtab_ent
> completely.

Makes sense, it does work and simplifies the structures. It makes the
PASID and PRI patches slightly nicer as well. I'll resend once my tests
complete.

Thanks,
Jean

> 
> Dunno: this is one of the those things where you really have to try it
> to figure out why it doesn't work...
> 
> Will
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 16:39           ` Will Deacon
  0 siblings, 0 replies; 32+ messages in thread
From: Will Deacon @ 2019-04-05 16:39 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: eric.auger, zhongmiao, okaya, rjw, linux-acpi, iommu,
	sudeep.holla, robin.murphy, linux-arm-kernel, lenb

On Fri, Apr 05, 2019 at 05:35:52PM +0100, Jean-Philippe Brucker wrote:
> On 04/04/2019 15:39, Will Deacon wrote:
> > On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote:
> >> When removing a mapping from a domain, we need to send an invalidation to
> >> all devices that might have stored it in their Address Translation Cache
> >> (ATC). In addition when updating the context descriptor of a live domain,
> >> we'll need to send invalidations for all devices attached to it.
> >>
> >> Maintain a list of devices in each domain, protected by a spinlock. It is
> >> updated every time we attach or detach devices to and from domains.
> >>
> >> It needs to be a spinlock because we'll invalidate ATC entries from
> >> within hardirq-safe contexts, but it may be possible to relax the read
> >> side with RCU later.
> >>
> >> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> >> ---
> >>  drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
> >>  1 file changed, 28 insertions(+)
> >>
> >> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> >> index d3880010c6cf..66a29c113dbc 100644
> >> --- a/drivers/iommu/arm-smmu-v3.c
> >> +++ b/drivers/iommu/arm-smmu-v3.c
> >> @@ -594,6 +594,11 @@ struct arm_smmu_device {
> >>  struct arm_smmu_master_data {
> >>  	struct arm_smmu_device		*smmu;
> >>  	struct arm_smmu_strtab_ent	ste;
> >> +
> >> +	struct arm_smmu_domain		*domain;
> >> +	struct list_head		domain_head;
> >> +
> >> +	struct device			*dev;
> >>  };
> >>  
> >>  /* SMMU private data for an IOMMU domain */
> >> @@ -618,6 +623,9 @@ struct arm_smmu_domain {
> >>  	};
> >>  
> >>  	struct iommu_domain		domain;
> >> +
> >> +	struct list_head		devices;
> >> +	spinlock_t			devices_lock;
> >>  };
> >>  
> >>  struct arm_smmu_option_prop {
> >> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
> >>  	}
> >>  
> >>  	mutex_init(&smmu_domain->init_mutex);
> >> +	INIT_LIST_HEAD(&smmu_domain->devices);
> >> +	spin_lock_init(&smmu_domain->devices_lock);
> > 
> > I'm wondering whether we can't take this a bit further and re-organise the
> > data structures to make this a little simpler overall. Something along the
> > lines of:
> > 
> > 	struct arm_smmu_master_data {
> > 		struct list_head		list; // masters in the same domain
> > 		struct arm_smmu_device		*smmu;
> > 		unsigned int			num_sids;
> > 		u32				*sids; // Points into fwspec
> > 		struct arm_smmu_domain		*domain; // NULL -> !assigned
> > 	};
> > 
> > and then just add a list_head into struct arm_smmu_domain to track the
> > masters that have been attached (if you're feeling brave, you could put
> > this into the s1_cfg).
> 
> I'm not sure about that last bit, shouldn't the list of masters apply to
> both s1 and s2?

I was assuming that (a) we were only using ATS with stage-1 and (b) we only
need the masters list for ATC invalidation. Did I go wrong somewhere?

> > The ATC invalidation logic would then be:
> > 
> >   - Detaching a device: walk over the sids from the master data
> >   - Unmapping a range from a domain: walk over the attached masters
> > 
> > I think this would also allow us to remove struct arm_smmu_strtab_ent
> > completely.
> 
> Makes sense, it does work and simplifies the structures. It makes the
> PASID and PRI patches slightly nicer as well. I'll resend once my tests
> complete.

Brill, thanks for giving it a go so quickly.

Will

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 16:39           ` Will Deacon
  0 siblings, 0 replies; 32+ messages in thread
From: Will Deacon @ 2019-04-05 16:39 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: zhongmiao, okaya, rjw, linux-acpi, iommu, sudeep.holla,
	robin.murphy, linux-arm-kernel, lenb, eric.auger

On Fri, Apr 05, 2019 at 05:35:52PM +0100, Jean-Philippe Brucker wrote:
> On 04/04/2019 15:39, Will Deacon wrote:
> > On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote:
> >> When removing a mapping from a domain, we need to send an invalidation to
> >> all devices that might have stored it in their Address Translation Cache
> >> (ATC). In addition when updating the context descriptor of a live domain,
> >> we'll need to send invalidations for all devices attached to it.
> >>
> >> Maintain a list of devices in each domain, protected by a spinlock. It is
> >> updated every time we attach or detach devices to and from domains.
> >>
> >> It needs to be a spinlock because we'll invalidate ATC entries from
> >> within hardirq-safe contexts, but it may be possible to relax the read
> >> side with RCU later.
> >>
> >> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> >> ---
> >>  drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
> >>  1 file changed, 28 insertions(+)
> >>
> >> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> >> index d3880010c6cf..66a29c113dbc 100644
> >> --- a/drivers/iommu/arm-smmu-v3.c
> >> +++ b/drivers/iommu/arm-smmu-v3.c
> >> @@ -594,6 +594,11 @@ struct arm_smmu_device {
> >>  struct arm_smmu_master_data {
> >>  	struct arm_smmu_device		*smmu;
> >>  	struct arm_smmu_strtab_ent	ste;
> >> +
> >> +	struct arm_smmu_domain		*domain;
> >> +	struct list_head		domain_head;
> >> +
> >> +	struct device			*dev;
> >>  };
> >>  
> >>  /* SMMU private data for an IOMMU domain */
> >> @@ -618,6 +623,9 @@ struct arm_smmu_domain {
> >>  	};
> >>  
> >>  	struct iommu_domain		domain;
> >> +
> >> +	struct list_head		devices;
> >> +	spinlock_t			devices_lock;
> >>  };
> >>  
> >>  struct arm_smmu_option_prop {
> >> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
> >>  	}
> >>  
> >>  	mutex_init(&smmu_domain->init_mutex);
> >> +	INIT_LIST_HEAD(&smmu_domain->devices);
> >> +	spin_lock_init(&smmu_domain->devices_lock);
> > 
> > I'm wondering whether we can't take this a bit further and re-organise the
> > data structures to make this a little simpler overall. Something along the
> > lines of:
> > 
> > 	struct arm_smmu_master_data {
> > 		struct list_head		list; // masters in the same domain
> > 		struct arm_smmu_device		*smmu;
> > 		unsigned int			num_sids;
> > 		u32				*sids; // Points into fwspec
> > 		struct arm_smmu_domain		*domain; // NULL -> !assigned
> > 	};
> > 
> > and then just add a list_head into struct arm_smmu_domain to track the
> > masters that have been attached (if you're feeling brave, you could put
> > this into the s1_cfg).
> 
> I'm not sure about that last bit, shouldn't the list of masters apply to
> both s1 and s2?

I was assuming that (a) we were only using ATS with stage-1 and (b) we only
need the masters list for ATC invalidation. Did I go wrong somewhere?

> > The ATC invalidation logic would then be:
> > 
> >   - Detaching a device: walk over the sids from the master data
> >   - Unmapping a range from a domain: walk over the attached masters
> > 
> > I think this would also allow us to remove struct arm_smmu_strtab_ent
> > completely.
> 
> Makes sense, it does work and simplifies the structures. It makes the
> PASID and PRI patches slightly nicer as well. I'll resend once my tests
> complete.

Brill, thanks for giving it a go so quickly.

Will

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 16:39           ` Will Deacon
  0 siblings, 0 replies; 32+ messages in thread
From: Will Deacon @ 2019-04-05 16:39 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: zhongmiao, okaya, rjw, linux-acpi, iommu, sudeep.holla,
	robin.murphy, linux-arm-kernel, lenb

On Fri, Apr 05, 2019 at 05:35:52PM +0100, Jean-Philippe Brucker wrote:
> On 04/04/2019 15:39, Will Deacon wrote:
> > On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote:
> >> When removing a mapping from a domain, we need to send an invalidation to
> >> all devices that might have stored it in their Address Translation Cache
> >> (ATC). In addition when updating the context descriptor of a live domain,
> >> we'll need to send invalidations for all devices attached to it.
> >>
> >> Maintain a list of devices in each domain, protected by a spinlock. It is
> >> updated every time we attach or detach devices to and from domains.
> >>
> >> It needs to be a spinlock because we'll invalidate ATC entries from
> >> within hardirq-safe contexts, but it may be possible to relax the read
> >> side with RCU later.
> >>
> >> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> >> ---
> >>  drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
> >>  1 file changed, 28 insertions(+)
> >>
> >> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> >> index d3880010c6cf..66a29c113dbc 100644
> >> --- a/drivers/iommu/arm-smmu-v3.c
> >> +++ b/drivers/iommu/arm-smmu-v3.c
> >> @@ -594,6 +594,11 @@ struct arm_smmu_device {
> >>  struct arm_smmu_master_data {
> >>  	struct arm_smmu_device		*smmu;
> >>  	struct arm_smmu_strtab_ent	ste;
> >> +
> >> +	struct arm_smmu_domain		*domain;
> >> +	struct list_head		domain_head;
> >> +
> >> +	struct device			*dev;
> >>  };
> >>  
> >>  /* SMMU private data for an IOMMU domain */
> >> @@ -618,6 +623,9 @@ struct arm_smmu_domain {
> >>  	};
> >>  
> >>  	struct iommu_domain		domain;
> >> +
> >> +	struct list_head		devices;
> >> +	spinlock_t			devices_lock;
> >>  };
> >>  
> >>  struct arm_smmu_option_prop {
> >> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
> >>  	}
> >>  
> >>  	mutex_init(&smmu_domain->init_mutex);
> >> +	INIT_LIST_HEAD(&smmu_domain->devices);
> >> +	spin_lock_init(&smmu_domain->devices_lock);
> > 
> > I'm wondering whether we can't take this a bit further and re-organise the
> > data structures to make this a little simpler overall. Something along the
> > lines of:
> > 
> > 	struct arm_smmu_master_data {
> > 		struct list_head		list; // masters in the same domain
> > 		struct arm_smmu_device		*smmu;
> > 		unsigned int			num_sids;
> > 		u32				*sids; // Points into fwspec
> > 		struct arm_smmu_domain		*domain; // NULL -> !assigned
> > 	};
> > 
> > and then just add a list_head into struct arm_smmu_domain to track the
> > masters that have been attached (if you're feeling brave, you could put
> > this into the s1_cfg).
> 
> I'm not sure about that last bit, shouldn't the list of masters apply to
> both s1 and s2?

I was assuming that (a) we were only using ATS with stage-1 and (b) we only
need the masters list for ATC invalidation. Did I go wrong somewhere?

> > The ATC invalidation logic would then be:
> > 
> >   - Detaching a device: walk over the sids from the master data
> >   - Unmapping a range from a domain: walk over the attached masters
> > 
> > I think this would also allow us to remove struct arm_smmu_strtab_ent
> > completely.
> 
> Makes sense, it does work and simplifies the structures. It makes the
> PASID and PRI patches slightly nicer as well. I'll resend once my tests
> complete.

Brill, thanks for giving it a go so quickly.

Will
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 16:39           ` Will Deacon
  0 siblings, 0 replies; 32+ messages in thread
From: Will Deacon @ 2019-04-05 16:39 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: eric.auger, zhongmiao, okaya, rjw, linux-acpi, iommu,
	sudeep.holla, robin.murphy, linux-arm-kernel, lenb

On Fri, Apr 05, 2019 at 05:35:52PM +0100, Jean-Philippe Brucker wrote:
> On 04/04/2019 15:39, Will Deacon wrote:
> > On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote:
> >> When removing a mapping from a domain, we need to send an invalidation to
> >> all devices that might have stored it in their Address Translation Cache
> >> (ATC). In addition when updating the context descriptor of a live domain,
> >> we'll need to send invalidations for all devices attached to it.
> >>
> >> Maintain a list of devices in each domain, protected by a spinlock. It is
> >> updated every time we attach or detach devices to and from domains.
> >>
> >> It needs to be a spinlock because we'll invalidate ATC entries from
> >> within hardirq-safe contexts, but it may be possible to relax the read
> >> side with RCU later.
> >>
> >> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> >> ---
> >>  drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
> >>  1 file changed, 28 insertions(+)
> >>
> >> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> >> index d3880010c6cf..66a29c113dbc 100644
> >> --- a/drivers/iommu/arm-smmu-v3.c
> >> +++ b/drivers/iommu/arm-smmu-v3.c
> >> @@ -594,6 +594,11 @@ struct arm_smmu_device {
> >>  struct arm_smmu_master_data {
> >>  	struct arm_smmu_device		*smmu;
> >>  	struct arm_smmu_strtab_ent	ste;
> >> +
> >> +	struct arm_smmu_domain		*domain;
> >> +	struct list_head		domain_head;
> >> +
> >> +	struct device			*dev;
> >>  };
> >>  
> >>  /* SMMU private data for an IOMMU domain */
> >> @@ -618,6 +623,9 @@ struct arm_smmu_domain {
> >>  	};
> >>  
> >>  	struct iommu_domain		domain;
> >> +
> >> +	struct list_head		devices;
> >> +	spinlock_t			devices_lock;
> >>  };
> >>  
> >>  struct arm_smmu_option_prop {
> >> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
> >>  	}
> >>  
> >>  	mutex_init(&smmu_domain->init_mutex);
> >> +	INIT_LIST_HEAD(&smmu_domain->devices);
> >> +	spin_lock_init(&smmu_domain->devices_lock);
> > 
> > I'm wondering whether we can't take this a bit further and re-organise the
> > data structures to make this a little simpler overall. Something along the
> > lines of:
> > 
> > 	struct arm_smmu_master_data {
> > 		struct list_head		list; // masters in the same domain
> > 		struct arm_smmu_device		*smmu;
> > 		unsigned int			num_sids;
> > 		u32				*sids; // Points into fwspec
> > 		struct arm_smmu_domain		*domain; // NULL -> !assigned
> > 	};
> > 
> > and then just add a list_head into struct arm_smmu_domain to track the
> > masters that have been attached (if you're feeling brave, you could put
> > this into the s1_cfg).
> 
> I'm not sure about that last bit, shouldn't the list of masters apply to
> both s1 and s2?

I was assuming that (a) we were only using ATS with stage-1 and (b) we only
need the masters list for ATC invalidation. Did I go wrong somewhere?

> > The ATC invalidation logic would then be:
> > 
> >   - Detaching a device: walk over the sids from the master data
> >   - Unmapping a range from a domain: walk over the attached masters
> > 
> > I think this would also allow us to remove struct arm_smmu_strtab_ent
> > completely.
> 
> Makes sense, it does work and simplifies the structures. It makes the
> PASID and PRI patches slightly nicer as well. I'll resend once my tests
> complete.

Brill, thanks for giving it a go so quickly.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 18:07             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-04-05 18:07 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-acpi, zhongmiao, rjw, okaya, eric.auger, iommu,
	sudeep.holla, robin.murphy, linux-arm-kernel, lenb

On 05/04/2019 17:39, Will Deacon wrote:
>>> I'm wondering whether we can't take this a bit further and re-organise the
>>> data structures to make this a little simpler overall. Something along the
>>> lines of:
>>>
>>> 	struct arm_smmu_master_data {
>>> 		struct list_head		list; // masters in the same domain
>>> 		struct arm_smmu_device		*smmu;
>>> 		unsigned int			num_sids;
>>> 		u32				*sids; // Points into fwspec
>>> 		struct arm_smmu_domain		*domain; // NULL -> !assigned
>>> 	};
>>>
>>> and then just add a list_head into struct arm_smmu_domain to track the
>>> masters that have been attached (if you're feeling brave, you could put
>>> this into the s1_cfg).
>>
>> I'm not sure about that last bit, shouldn't the list of masters apply to
>> both s1 and s2?
> 
> I was assuming that (a) we were only using ATS with stage-1 and (b) we only
> need the masters list for ATC invalidation. Did I go wrong somewhere?

Hmm, I messed that up in patch 3, actually. I'm still unsure about it,
but in the end I was meaning to enable ATS for both stage-1 and stage-2
domains. We could make it stage-1 only, but VFIO can assign a device
with a stage-1 domain to userspace. That's the case with QEMU, which
doesn't use VFIO_TYPE1_NESTING_IOMMU at the moment. So I don't think
limiting ATS to stage-1 protects against anything.


For (b), the master list will likely also be used for PASID support, to
invalidate cached CDs for each device in a domain, but that's only stage-1.

Then the current patchsets for nested translation also rely on the
master list to bind guest PASID tables to all devices attached
to a domain. But with those patches, s1_cfg and s2_cfg aren't in an enum
anymore and we could still keep the list in s1_cfg.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 18:07             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-04-05 18:07 UTC (permalink / raw)
  To: Will Deacon
  Cc: eric.auger, zhongmiao, okaya, rjw, linux-acpi, iommu,
	sudeep.holla, robin.murphy, linux-arm-kernel, lenb

On 05/04/2019 17:39, Will Deacon wrote:
>>> I'm wondering whether we can't take this a bit further and re-organise the
>>> data structures to make this a little simpler overall. Something along the
>>> lines of:
>>>
>>> 	struct arm_smmu_master_data {
>>> 		struct list_head		list; // masters in the same domain
>>> 		struct arm_smmu_device		*smmu;
>>> 		unsigned int			num_sids;
>>> 		u32				*sids; // Points into fwspec
>>> 		struct arm_smmu_domain		*domain; // NULL -> !assigned
>>> 	};
>>>
>>> and then just add a list_head into struct arm_smmu_domain to track the
>>> masters that have been attached (if you're feeling brave, you could put
>>> this into the s1_cfg).
>>
>> I'm not sure about that last bit, shouldn't the list of masters apply to
>> both s1 and s2?
> 
> I was assuming that (a) we were only using ATS with stage-1 and (b) we only
> need the masters list for ATC invalidation. Did I go wrong somewhere?

Hmm, I messed that up in patch 3, actually. I'm still unsure about it,
but in the end I was meaning to enable ATS for both stage-1 and stage-2
domains. We could make it stage-1 only, but VFIO can assign a device
with a stage-1 domain to userspace. That's the case with QEMU, which
doesn't use VFIO_TYPE1_NESTING_IOMMU at the moment. So I don't think
limiting ATS to stage-1 protects against anything.


For (b), the master list will likely also be used for PASID support, to
invalidate cached CDs for each device in a domain, but that's only stage-1.

Then the current patchsets for nested translation also rely on the
master list to bind guest PASID tables to all devices attached
to a domain. But with those patches, s1_cfg and s2_cfg aren't in an enum
anymore and we could still keep the list in s1_cfg.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 18:07             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-04-05 18:07 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-acpi, zhongmiao, rjw, okaya, iommu, sudeep.holla,
	robin.murphy, linux-arm-kernel, lenb

On 05/04/2019 17:39, Will Deacon wrote:
>>> I'm wondering whether we can't take this a bit further and re-organise the
>>> data structures to make this a little simpler overall. Something along the
>>> lines of:
>>>
>>> 	struct arm_smmu_master_data {
>>> 		struct list_head		list; // masters in the same domain
>>> 		struct arm_smmu_device		*smmu;
>>> 		unsigned int			num_sids;
>>> 		u32				*sids; // Points into fwspec
>>> 		struct arm_smmu_domain		*domain; // NULL -> !assigned
>>> 	};
>>>
>>> and then just add a list_head into struct arm_smmu_domain to track the
>>> masters that have been attached (if you're feeling brave, you could put
>>> this into the s1_cfg).
>>
>> I'm not sure about that last bit, shouldn't the list of masters apply to
>> both s1 and s2?
> 
> I was assuming that (a) we were only using ATS with stage-1 and (b) we only
> need the masters list for ATC invalidation. Did I go wrong somewhere?

Hmm, I messed that up in patch 3, actually. I'm still unsure about it,
but in the end I was meaning to enable ATS for both stage-1 and stage-2
domains. We could make it stage-1 only, but VFIO can assign a device
with a stage-1 domain to userspace. That's the case with QEMU, which
doesn't use VFIO_TYPE1_NESTING_IOMMU at the moment. So I don't think
limiting ATS to stage-1 protects against anything.


For (b), the master list will likely also be used for PASID support, to
invalidate cached CDs for each device in a domain, but that's only stage-1.

Then the current patchsets for nested translation also rely on the
master list to bind guest PASID tables to all devices attached
to a domain. But with those patches, s1_cfg and s2_cfg aren't in an enum
anymore and we could still keep the list in s1_cfg.

Thanks,
Jean
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices
@ 2019-04-05 18:07             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2019-04-05 18:07 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-acpi, zhongmiao, rjw, okaya, eric.auger, iommu,
	sudeep.holla, robin.murphy, linux-arm-kernel, lenb

On 05/04/2019 17:39, Will Deacon wrote:
>>> I'm wondering whether we can't take this a bit further and re-organise the
>>> data structures to make this a little simpler overall. Something along the
>>> lines of:
>>>
>>> 	struct arm_smmu_master_data {
>>> 		struct list_head		list; // masters in the same domain
>>> 		struct arm_smmu_device		*smmu;
>>> 		unsigned int			num_sids;
>>> 		u32				*sids; // Points into fwspec
>>> 		struct arm_smmu_domain		*domain; // NULL -> !assigned
>>> 	};
>>>
>>> and then just add a list_head into struct arm_smmu_domain to track the
>>> masters that have been attached (if you're feeling brave, you could put
>>> this into the s1_cfg).
>>
>> I'm not sure about that last bit, shouldn't the list of masters apply to
>> both s1 and s2?
> 
> I was assuming that (a) we were only using ATS with stage-1 and (b) we only
> need the masters list for ATC invalidation. Did I go wrong somewhere?

Hmm, I messed that up in patch 3, actually. I'm still unsure about it,
but in the end I was meaning to enable ATS for both stage-1 and stage-2
domains. We could make it stage-1 only, but VFIO can assign a device
with a stage-1 domain to userspace. That's the case with QEMU, which
doesn't use VFIO_TYPE1_NESTING_IOMMU at the moment. So I don't think
limiting ATS to stage-1 protects against anything.


For (b), the master list will likely also be used for PASID support, to
invalidate cached CDs for each device in a domain, but that's only stage-1.

Then the current patchsets for nested translation also rely on the
master list to bind guest PASID tables to all devices attached
to a domain. But with those patches, s1_cfg and s2_cfg aren't in an enum
anymore and we could still keep the list in s1_cfg.

Thanks,
Jean

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2019-04-05 18:09 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-20 17:36 [PATCH 0/4] Add PCI ATS support to Arm SMMUv3 Jean-Philippe Brucker
2019-03-20 17:36 ` Jean-Philippe Brucker
2019-03-20 17:36 ` [PATCH 1/4] ACPI/IORT: Check ATS capability in root complex nodes Jean-Philippe Brucker
2019-03-20 17:36   ` Jean-Philippe Brucker
     [not found]   ` <20190320173634.21895-2-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2019-03-21 16:00     ` Sinan Kaya
2019-03-21 16:00       ` Sinan Kaya
     [not found]       ` <fc53b2f7-53c4-c32d-8588-6f8d98ef5e7f-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2019-03-25 15:02         ` Jean-Philippe Brucker
2019-03-25 15:02           ` Jean-Philippe Brucker
2019-03-20 17:36 ` [PATCH 2/4] iommu/arm-smmu-v3: Link domains and devices Jean-Philippe Brucker
2019-03-20 17:36   ` Jean-Philippe Brucker
     [not found]   ` <20190320173634.21895-3-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2019-04-04 14:39     ` Will Deacon
2019-04-04 14:39       ` Will Deacon
2019-04-05 16:35       ` Jean-Philippe Brucker
2019-04-05 16:35         ` Jean-Philippe Brucker
2019-04-05 16:35         ` Jean-Philippe Brucker
2019-04-05 16:35         ` Jean-Philippe Brucker
2019-04-05 16:39         ` Will Deacon
2019-04-05 16:39           ` Will Deacon
2019-04-05 16:39           ` Will Deacon
2019-04-05 16:39           ` Will Deacon
2019-04-05 18:07           ` Jean-Philippe Brucker
2019-04-05 18:07             ` Jean-Philippe Brucker
2019-04-05 18:07             ` Jean-Philippe Brucker
2019-04-05 18:07             ` Jean-Philippe Brucker
2019-03-20 17:36 ` [PATCH 3/4] iommu/arm-smmu-v3: Add support for PCI ATS Jean-Philippe Brucker
2019-03-20 17:36   ` Jean-Philippe Brucker
     [not found]   ` <20190320173634.21895-4-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2019-03-21 15:52     ` Sinan Kaya
2019-03-21 15:52       ` Sinan Kaya
     [not found]       ` <6a4fb991-b8cd-d64d-25bd-dbbefcf69fa5-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2019-03-25 15:01         ` Jean-Philippe Brucker
2019-03-25 15:01           ` Jean-Philippe Brucker
2019-03-20 17:36 ` [PATCH 4/4] iommu/arm-smmu-v3: Disable tagged pointers Jean-Philippe Brucker
2019-03-20 17:36   ` Jean-Philippe Brucker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.