* [PATCH 00/20] iommu/amd: Use generic IOVA allocator
From: Joerg Roedel @ 2016-07-08 11:44 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

Hi,

here is a patch-set to make the AMD IOMMU driver use the
generic IOVA allocator, which is already used in the Intel
VT-d driver and a few other places.

The main reason for the conversion is to make the driver
benefit from the recent scalability improvements to the IOVA
code. Some of these improvements were made in the Intel VT-d
driver; they are not re-used here but, for now, re-implemented.

This leaves more room for merging things into common code
in the future.

The allocator that was previously used will be removed with
these patches.
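
For reference, the iova interface the driver moves to boils down
to a handful of calls. A condensed, purely illustrative sketch of
their use (the function name here is made up; the calls are the
ones the following patches introduce):

    #include <linux/iova.h>
    #include <linux/dma-mapping.h>

    static struct iova_domain iovad;    /* one instance per dma_ops_domain */

    static void iova_lifecycle_sketch(unsigned long pages, u64 dma_mask)
    {
        unsigned long pfn;

        /* kmem_cache for struct iova, set up once at driver init */
        if (iova_cache_get())
            return;

        /* rb-tree allocator: PAGE_SIZE granule, pfns start at 1,
           32-bit boundary passed in for faster 32-bit allocations */
        init_iova_domain(&iovad, PAGE_SIZE, 1,
                         DMA_BIT_MASK(32) >> PAGE_SHIFT);

        /* cached fast-path allocation, used from __map_single() */
        pfn = alloc_iova_fast(&iovad, pages, dma_mask >> PAGE_SHIFT);
        if (pfn)
            free_iova_fast(&iovad, pfn, pages); /* __unmap_single() */

        /* torn down again when the dma_ops_domain is freed */
        put_iova_domain(&iovad);
    }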

Please review.

Thanks,

	Joerg

Joerg Roedel (20):
  iommu: Add apply_dm_region call-back to iommu-ops
  iommu/amd: Select IOMMU_IOVA for AMD IOMMU
  iommu/amd: Allocate iova_domain for dma_ops_domain
  iommu/amd: Create a list of reserved iova addresses
  iommu/amd: Implement apply_dm_region call-back
  iommu/amd: Pass gfp-flags to iommu_map_page()
  iommu/amd: Remove special mapping code for dma_ops path
  iommu/amd: Make use of the generic IOVA allocator
  iommu/amd: Remove other remains of old address allocator
  iommu/amd: Remove align-parameter from __map_single()
  iommu/amd: Set up data structures for flush queue
  iommu/amd: Allow NULL pointer parameter for domain_flush_complete()
  iommu/amd: Implement flush queue
  iommu/amd: Implement timeout to flush unmap queues
  iommu/amd: Introduce dir2prot() helper
  iommu/amd: Optimize map_sg and unmap_sg
  iommu/amd: Use dev_data->domain in get_domain()
  iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back
  iommu/amd: Flush iova queue before releasing dma_ops_domain
  iommu/amd: Use container_of to get dma_ops_domain

 drivers/iommu/Kconfig           |   1 +
 drivers/iommu/amd_iommu.c       | 976 ++++++++++++++++------------------------
 drivers/iommu/amd_iommu_types.h |   1 -
 drivers/iommu/iommu.c           |   3 +
 include/linux/iommu.h           |   3 +
 5 files changed, 387 insertions(+), 597 deletions(-)

-- 
1.9.1


* [PATCH 01/20] iommu: Add apply_dm_region call-back to iommu-ops
From: Joerg Roedel @ 2016-07-08 11:44 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

This new call-back will be used by the iommu driver to
reserve the given dm_region in its iova space before the
mapping is created.

The call-back is temporary until the dma-ops implementation
is part of the common iommu code.
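
A driver opts in by filling the new member of its iommu_ops; the
iommu core then invokes it for every dm_region of a device before
creating the direct mappings. A rough driver-side sketch (names
made up for illustration, the real AMD implementation follows in
patch 5):

    #include <linux/iommu.h>

    static void example_apply_dm_region(struct device *dev,
                                        struct iommu_domain *domain,
                                        struct iommu_dm_region *region)
    {
        /* reserve [region->start, region->start + region->length)
           in the driver's private iova allocator, so the DMA-API
           path never hands out addresses from this range */
    }

    static const struct iommu_ops example_iommu_ops = {
        .apply_dm_region = example_apply_dm_region,
        /* remaining call-backs omitted */
    };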

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/iommu.c | 3 +++
 include/linux/iommu.h | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3000051..e8d2fb0 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -348,6 +348,9 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
 	list_for_each_entry(entry, &mappings, list) {
 		dma_addr_t start, end, addr;
 
+		if (domain->ops->apply_dm_region)
+			domain->ops->apply_dm_region(dev, domain, entry);
+
 		start = ALIGN(entry->start, pg_size);
 		end   = ALIGN(entry->start + entry->length, pg_size);
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 664683a..a35fb8b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -152,6 +152,7 @@ struct iommu_dm_region {
  * @domain_set_attr: Change domain attributes
  * @get_dm_regions: Request list of direct mapping requirements for a device
  * @put_dm_regions: Free list of direct mapping requirements for a device
+ * @apply_dm_region: Temporary helper call-back for iova reserved ranges
  * @domain_window_enable: Configure and enable a particular window for a domain
  * @domain_window_disable: Disable a particular window for a domain
  * @domain_set_windows: Set the number of windows for a domain
@@ -186,6 +187,8 @@ struct iommu_ops {
 	/* Request/Free a list of direct mapping requirements for a device */
 	void (*get_dm_regions)(struct device *dev, struct list_head *list);
 	void (*put_dm_regions)(struct device *dev, struct list_head *list);
+	void (*apply_dm_region)(struct device *dev, struct iommu_domain *domain,
+				struct iommu_dm_region *region);
 
 	/* Window handling functions */
 	int (*domain_window_enable)(struct iommu_domain *domain, u32 wnd_nr,
-- 
1.9.1


* [PATCH 02/20] iommu/amd: Select IOMMU_IOVA for AMD IOMMU
From: Joerg Roedel @ 2016-07-08 11:44 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Include the generic IOVA code to make use of it in the AMD
IOMMU driver too.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index ad08603..0b1aabb 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -111,6 +111,7 @@ config AMD_IOMMU
 	select PCI_PRI
 	select PCI_PASID
 	select IOMMU_API
+	select IOMMU_IOVA
 	depends on X86_64 && PCI && ACPI
 	---help---
 	  With this option you can enable support for AMD IOMMU hardware in
-- 
1.9.1


* [PATCH 03/20] iommu/amd: Allocate iova_domain for dma_ops_domain
From: Joerg Roedel @ 2016-07-08 11:44 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Use it later for allocating the IO virtual addresses.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 634f636..71af367 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -38,6 +38,7 @@
 #include <linux/dma-contiguous.h>
 #include <linux/irqdomain.h>
 #include <linux/percpu.h>
+#include <linux/iova.h>
 #include <asm/irq_remapping.h>
 #include <asm/io_apic.h>
 #include <asm/apic.h>
@@ -56,6 +57,11 @@
 
 #define LOOP_TIMEOUT	100000
 
+/* IO virtual address start page frame number */
+#define IOVA_START_PFN		(1)
+#define IOVA_PFN(addr)		((addr) >> PAGE_SHIFT)
+#define DMA_32BIT_PFN		IOVA_PFN(DMA_BIT_MASK(32))
+
 /*
  * This bitmap is used to advertise the page sizes our hardware support
  * to the IOMMU core, which will then use this information to split
@@ -157,6 +163,9 @@ struct dma_ops_domain {
 
 	/* address space relevant data */
 	struct aperture_range *aperture[APERTURE_MAX_RANGES];
+
+	/* IOVA RB-Tree */
+	struct iova_domain iovad;
 };
 
 /****************************************************************************
@@ -1966,6 +1975,8 @@ static void dma_ops_domain_free(struct dma_ops_domain *dom)
 	if (!dom)
 		return;
 
+	put_iova_domain(&dom->iovad);
+
 	free_percpu(dom->next_index);
 
 	del_domain_from_list(&dom->domain);
@@ -2041,6 +2052,9 @@ static struct dma_ops_domain *dma_ops_domain_alloc(void)
 	for_each_possible_cpu(cpu)
 		*per_cpu_ptr(dma_dom->next_index, cpu) = 0;
 
+	init_iova_domain(&dma_dom->iovad, PAGE_SIZE,
+			 IOVA_START_PFN, DMA_32BIT_PFN);
+
 	return dma_dom;
 
 free_dma_dom:
@@ -2948,7 +2962,11 @@ static struct dma_map_ops amd_iommu_dma_ops = {
 
 int __init amd_iommu_init_api(void)
 {
-	int err = 0;
+	int ret, err = 0;
+
+	ret = iova_cache_get();
+	if (ret)
+		return ret;
 
 	err = bus_set_iommu(&pci_bus_type, &amd_iommu_ops);
 	if (err)
-- 
1.9.1


* [PATCH 04/20] iommu/amd: Create a list of reserved iova addresses
From: Joerg Roedel @ 2016-07-08 11:44 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Put the MSI range, the HT range and the MMIO ranges of PCI
devices into that list, so that these addresses are not
allocated for DMA.

Copy this address list into every created dma_ops_domain.
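
Condensed from the hunks below, the pattern is: fill one global
iova_domain with the reserved ranges once at init time, then let
every new dma_ops_domain inherit it. A sketch using the constants
and helpers added by patches 3 and 4, with the walk over the PCI
MMIO resources left out:

    /* filled once from amd_iommu_init_api() */
    static struct iova_domain reserved_iova_ranges;

    static int init_reserved_ranges_sketch(void)
    {
        init_iova_domain(&reserved_iova_ranges, PAGE_SIZE,
                         IOVA_START_PFN, DMA_32BIT_PFN);

        /* MSI and HyperTransport windows must never be handed out */
        if (!reserve_iova(&reserved_iova_ranges, IOVA_PFN(MSI_RANGE_START),
                          IOVA_PFN(MSI_RANGE_END)) ||
            !reserve_iova(&reserved_iova_ranges, IOVA_PFN(HT_RANGE_START),
                          IOVA_PFN(HT_RANGE_END)))
            return -ENOMEM;

        return 0;
    }

    /* inherited by every new domain in dma_ops_domain_alloc() */
    static void inherit_reserved_ranges_sketch(struct dma_ops_domain *dma_dom)
    {
        copy_reserved_iova(&reserved_iova_ranges, &dma_dom->iovad);
    }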

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 71af367..4b8b9ee 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -62,6 +62,12 @@
 #define IOVA_PFN(addr)		((addr) >> PAGE_SHIFT)
 #define DMA_32BIT_PFN		IOVA_PFN(DMA_BIT_MASK(32))
 
+/* Reserved IOVA ranges */
+#define MSI_RANGE_START		(0xfee00000)
+#define MSI_RANGE_END		(0xfeefffff)
+#define HT_RANGE_START		(0xfd00000000ULL)
+#define HT_RANGE_END		(0xffffffffffULL)
+
 /*
  * This bitmap is used to advertise the page sizes our hardware support
  * to the IOMMU core, which will then use this information to split
@@ -168,6 +174,9 @@ struct dma_ops_domain {
 	struct iova_domain iovad;
 };
 
+static struct iova_domain reserved_iova_ranges;
+static struct lock_class_key reserved_rbtree_key;
+
 /****************************************************************************
  *
  * Helper functions
@@ -2055,6 +2064,9 @@ static struct dma_ops_domain *dma_ops_domain_alloc(void)
 	init_iova_domain(&dma_dom->iovad, PAGE_SIZE,
 			 IOVA_START_PFN, DMA_32BIT_PFN);
 
+	/* Initialize reserved ranges */
+	copy_reserved_iova(&reserved_iova_ranges, &dma_dom->iovad);
+
 	return dma_dom;
 
 free_dma_dom:
@@ -2960,6 +2972,59 @@ static struct dma_map_ops amd_iommu_dma_ops = {
 	.set_dma_mask	= set_dma_mask,
 };
 
+static int init_reserved_iova_ranges(void)
+{
+	struct pci_dev *pdev = NULL;
+	struct iova *val;
+
+	init_iova_domain(&reserved_iova_ranges, PAGE_SIZE,
+			 IOVA_START_PFN, DMA_32BIT_PFN);
+
+	lockdep_set_class(&reserved_iova_ranges.iova_rbtree_lock,
+			  &reserved_rbtree_key);
+
+	/* MSI memory range */
+	val = reserve_iova(&reserved_iova_ranges,
+			   IOVA_PFN(MSI_RANGE_START), IOVA_PFN(MSI_RANGE_END));
+	if (!val) {
+		pr_err("Reserving MSI range failed\n");
+		return -ENOMEM;
+	}
+
+	/* HT memory range */
+	val = reserve_iova(&reserved_iova_ranges,
+			   IOVA_PFN(HT_RANGE_START), IOVA_PFN(HT_RANGE_END));
+	if (!val) {
+		pr_err("Reserving HT range failed\n");
+		return -ENOMEM;
+	}
+
+	/*
+	 * Memory used for PCI resources
+	 * FIXME: Check whether we can reserve the PCI-hole completly
+	 */
+	for_each_pci_dev(pdev) {
+		int i;
+
+		for (i = 0; i < PCI_NUM_RESOURCES; ++i) {
+			struct resource *r = &pdev->resource[i];
+
+			if (!(r->flags & IORESOURCE_MEM))
+				continue;
+
+			val = reserve_iova(&reserved_iova_ranges,
+					   IOVA_PFN(r->start),
+					   IOVA_PFN(r->end));
+			if (!val) {
+				pr_err("Reserve pci-resource range failed\n");
+				return -ENOMEM;
+			}
+		}
+	}
+
+	return 0;
+}
+
 int __init amd_iommu_init_api(void)
 {
 	int ret, err = 0;
@@ -2968,6 +3033,10 @@ int __init amd_iommu_init_api(void)
 	if (ret)
 		return ret;
 
+	ret = init_reserved_iova_ranges();
+	if (ret)
+		return ret;
+
 	err = bus_set_iommu(&pci_bus_type, &amd_iommu_ops);
 	if (err)
 		return err;
-- 
1.9.1


* [PATCH 05/20] iommu/amd: Implement apply_dm_region call-back
From: Joerg Roedel @ 2016-07-08 11:44 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

It is used to reserve the dm-regions in the iova-tree.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4b8b9ee..5ea5679 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3379,6 +3379,20 @@ static void amd_iommu_put_dm_regions(struct device *dev,
 		kfree(entry);
 }
 
+static void amd_iommu_apply_dm_region(struct device *dev,
+				      struct iommu_domain *domain,
+				      struct iommu_dm_region *region)
+{
+	struct protection_domain *pdomain = to_pdomain(domain);
+	struct dma_ops_domain *dma_dom = pdomain->priv;
+	unsigned long start, end;
+
+	start = IOVA_PFN(region->start);
+	end   = IOVA_PFN(region->start + region->length);
+
+	WARN_ON_ONCE(reserve_iova(&dma_dom->iovad, start, end) == NULL);
+}
+
 static const struct iommu_ops amd_iommu_ops = {
 	.capable = amd_iommu_capable,
 	.domain_alloc = amd_iommu_domain_alloc,
@@ -3394,6 +3408,7 @@ static const struct iommu_ops amd_iommu_ops = {
 	.device_group = amd_iommu_device_group,
 	.get_dm_regions = amd_iommu_get_dm_regions,
 	.put_dm_regions = amd_iommu_put_dm_regions,
+	.apply_dm_region = amd_iommu_apply_dm_region,
 	.pgsize_bitmap	= AMD_IOMMU_PGSIZES,
 };
 
-- 
1.9.1


* [PATCH 06/20] iommu/amd: Pass gfp-flags to iommu_map_page()
From: Joerg Roedel @ 2016-07-08 11:44 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Make this function ready to be used in the DMA-API path.
Reorder parameters a bit while at it.
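
The effect is that every caller can pass allocation flags that
match its context; the two call sites (this patch and patch 7)
end up as:

    /* iommu-api path (amd_iommu_map), may sleep */
    ret = iommu_map_page(domain, iova, paddr, page_size, prot, GFP_KERNEL);

    /* dma-ops fast path (__map_single, from patch 7 on), atomic context */
    ret = iommu_map_page(&dma_dom->domain, start, paddr,
                         PAGE_SIZE, prot, GFP_ATOMIC);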

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 5ea5679..e7c042b 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1415,8 +1415,9 @@ static u64 *fetch_pte(struct protection_domain *domain,
 static int iommu_map_page(struct protection_domain *dom,
 			  unsigned long bus_addr,
 			  unsigned long phys_addr,
+			  unsigned long page_size,
 			  int prot,
-			  unsigned long page_size)
+			  gfp_t gfp)
 {
 	u64 __pte, *pte;
 	int i, count;
@@ -1428,7 +1429,7 @@ static int iommu_map_page(struct protection_domain *dom,
 		return -EINVAL;
 
 	count = PAGE_SIZE_PTE_COUNT(page_size);
-	pte   = alloc_pte(dom, bus_addr, page_size, NULL, GFP_KERNEL);
+	pte   = alloc_pte(dom, bus_addr, page_size, NULL, gfp);
 
 	if (!pte)
 		return -ENOMEM;
@@ -3277,7 +3278,7 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
 		prot |= IOMMU_PROT_IW;
 
 	mutex_lock(&domain->api_lock);
-	ret = iommu_map_page(domain, iova, paddr, prot, page_size);
+	ret = iommu_map_page(domain, iova, paddr, page_size, prot, GFP_KERNEL);
 	mutex_unlock(&domain->api_lock);
 
 	return ret;
-- 
1.9.1


* [PATCH 07/20] iommu/amd: Remove special mapping code for dma_ops path
From: Joerg Roedel @ 2016-07-08 11:44 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Use the iommu-api map/unmap functions instead. This will be
required anyway when IOVA code is used for address
allocation.
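
The dma-ops path then only has to translate the DMA direction
into iommu-api protection flags; this is open-coded in
__map_single() below and factored out into a dir2prot() helper
later in the series (patch 15):

    int prot = 0;

    if (direction == DMA_TO_DEVICE)
        prot = IOMMU_PROT_IR;
    else if (direction == DMA_FROM_DEVICE)
        prot = IOMMU_PROT_IW;
    else if (direction == DMA_BIDIRECTIONAL)
        prot = IOMMU_PROT_IW | IOMMU_PROT_IR;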

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 107 ++++++----------------------------------------
 1 file changed, 14 insertions(+), 93 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index e7c042b..08aa46c 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2557,94 +2557,6 @@ static void update_domain(struct protection_domain *domain)
 }
 
 /*
- * This function fetches the PTE for a given address in the aperture
- */
-static u64* dma_ops_get_pte(struct dma_ops_domain *dom,
-			    unsigned long address)
-{
-	struct aperture_range *aperture;
-	u64 *pte, *pte_page;
-
-	aperture = dom->aperture[APERTURE_RANGE_INDEX(address)];
-	if (!aperture)
-		return NULL;
-
-	pte = aperture->pte_pages[APERTURE_PAGE_INDEX(address)];
-	if (!pte) {
-		pte = alloc_pte(&dom->domain, address, PAGE_SIZE, &pte_page,
-				GFP_ATOMIC);
-		aperture->pte_pages[APERTURE_PAGE_INDEX(address)] = pte_page;
-	} else
-		pte += PM_LEVEL_INDEX(0, address);
-
-	update_domain(&dom->domain);
-
-	return pte;
-}
-
-/*
- * This is the generic map function. It maps one 4kb page at paddr to
- * the given address in the DMA address space for the domain.
- */
-static dma_addr_t dma_ops_domain_map(struct dma_ops_domain *dom,
-				     unsigned long address,
-				     phys_addr_t paddr,
-				     int direction)
-{
-	u64 *pte, __pte;
-
-	WARN_ON(address > dom->aperture_size);
-
-	paddr &= PAGE_MASK;
-
-	pte  = dma_ops_get_pte(dom, address);
-	if (!pte)
-		return DMA_ERROR_CODE;
-
-	__pte = paddr | IOMMU_PTE_P | IOMMU_PTE_FC;
-
-	if (direction == DMA_TO_DEVICE)
-		__pte |= IOMMU_PTE_IR;
-	else if (direction == DMA_FROM_DEVICE)
-		__pte |= IOMMU_PTE_IW;
-	else if (direction == DMA_BIDIRECTIONAL)
-		__pte |= IOMMU_PTE_IR | IOMMU_PTE_IW;
-
-	WARN_ON_ONCE(*pte);
-
-	*pte = __pte;
-
-	return (dma_addr_t)address;
-}
-
-/*
- * The generic unmapping function for on page in the DMA address space.
- */
-static void dma_ops_domain_unmap(struct dma_ops_domain *dom,
-				 unsigned long address)
-{
-	struct aperture_range *aperture;
-	u64 *pte;
-
-	if (address >= dom->aperture_size)
-		return;
-
-	aperture = dom->aperture[APERTURE_RANGE_INDEX(address)];
-	if (!aperture)
-		return;
-
-	pte  = aperture->pte_pages[APERTURE_PAGE_INDEX(address)];
-	if (!pte)
-		return;
-
-	pte += PM_LEVEL_INDEX(0, address);
-
-	WARN_ON_ONCE(!*pte);
-
-	*pte = 0ULL;
-}
-
-/*
  * This function contains common code for mapping of a physically
  * contiguous memory region into DMA address space. It is used by all
  * mapping functions provided with this IOMMU driver.
@@ -2654,7 +2566,7 @@ static dma_addr_t __map_single(struct device *dev,
 			       struct dma_ops_domain *dma_dom,
 			       phys_addr_t paddr,
 			       size_t size,
-			       int dir,
+			       int direction,
 			       bool align,
 			       u64 dma_mask)
 {
@@ -2662,6 +2574,7 @@ static dma_addr_t __map_single(struct device *dev,
 	dma_addr_t address, start, ret;
 	unsigned int pages;
 	unsigned long align_mask = 0;
+	int prot = 0;
 	int i;
 
 	pages = iommu_num_pages(paddr, size, PAGE_SIZE);
@@ -2676,10 +2589,18 @@ static dma_addr_t __map_single(struct device *dev,
 	if (address == DMA_ERROR_CODE)
 		goto out;
 
+	if (direction == DMA_TO_DEVICE)
+		prot = IOMMU_PROT_IR;
+	else if (direction == DMA_FROM_DEVICE)
+		prot = IOMMU_PROT_IW;
+	else if (direction == DMA_BIDIRECTIONAL)
+		prot = IOMMU_PROT_IW | IOMMU_PROT_IR;
+
 	start = address;
 	for (i = 0; i < pages; ++i) {
-		ret = dma_ops_domain_map(dma_dom, start, paddr, dir);
-		if (ret == DMA_ERROR_CODE)
+		ret = iommu_map_page(&dma_dom->domain, start, paddr,
+				     PAGE_SIZE, prot, GFP_ATOMIC);
+		if (ret)
 			goto out_unmap;
 
 		paddr += PAGE_SIZE;
@@ -2699,7 +2620,7 @@ out_unmap:
 
 	for (--i; i >= 0; --i) {
 		start -= PAGE_SIZE;
-		dma_ops_domain_unmap(dma_dom, start);
+		iommu_unmap_page(&dma_dom->domain, start, PAGE_SIZE);
 	}
 
 	dma_ops_free_addresses(dma_dom, address, pages);
@@ -2730,7 +2651,7 @@ static void __unmap_single(struct dma_ops_domain *dma_dom,
 	start = dma_addr;
 
 	for (i = 0; i < pages; ++i) {
-		dma_ops_domain_unmap(dma_dom, start);
+		iommu_unmap_page(&dma_dom->domain, start, PAGE_SIZE);
 		start += PAGE_SIZE;
 	}
 
-- 
1.9.1


* [PATCH 08/20] iommu/amd: Make use of the generic IOVA allocator
From: Joerg Roedel @ 2016-07-08 11:44 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Remove the old address allocation code and make use of the
generic IOVA allocator that is also used by other dma-ops
implementations.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 183 +++++++---------------------------------------
 1 file changed, 26 insertions(+), 157 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 08aa46c..4276c88 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1646,167 +1646,32 @@ out_free:
 	return -ENOMEM;
 }
 
-static dma_addr_t dma_ops_aperture_alloc(struct dma_ops_domain *dom,
-					 struct aperture_range *range,
-					 unsigned long pages,
-					 unsigned long dma_mask,
-					 unsigned long boundary_size,
-					 unsigned long align_mask,
-					 bool trylock)
-{
-	unsigned long offset, limit, flags;
-	dma_addr_t address;
-	bool flush = false;
-
-	offset = range->offset >> PAGE_SHIFT;
-	limit  = iommu_device_max_index(APERTURE_RANGE_PAGES, offset,
-					dma_mask >> PAGE_SHIFT);
-
-	if (trylock) {
-		if (!spin_trylock_irqsave(&range->bitmap_lock, flags))
-			return -1;
-	} else {
-		spin_lock_irqsave(&range->bitmap_lock, flags);
-	}
-
-	address = iommu_area_alloc(range->bitmap, limit, range->next_bit,
-				   pages, offset, boundary_size, align_mask);
-	if (address == -1) {
-		/* Nothing found, retry one time */
-		address = iommu_area_alloc(range->bitmap, limit,
-					   0, pages, offset, boundary_size,
-					   align_mask);
-		flush = true;
-	}
-
-	if (address != -1)
-		range->next_bit = address + pages;
-
-	spin_unlock_irqrestore(&range->bitmap_lock, flags);
-
-	if (flush) {
-		domain_flush_tlb(&dom->domain);
-		domain_flush_complete(&dom->domain);
-	}
-
-	return address;
-}
-
-static unsigned long dma_ops_area_alloc(struct device *dev,
-					struct dma_ops_domain *dom,
-					unsigned int pages,
-					unsigned long align_mask,
-					u64 dma_mask)
+static unsigned long dma_ops_alloc_iova(struct device *dev,
+					struct dma_ops_domain *dma_dom,
+					unsigned int pages, u64 dma_mask)
 {
-	unsigned long boundary_size, mask;
-	unsigned long address = -1;
-	bool first = true;
-	u32 start, i;
-
-	preempt_disable();
-
-	mask = dma_get_seg_boundary(dev);
-
-again:
-	start = this_cpu_read(*dom->next_index);
-
-	/* Sanity check - is it really necessary? */
-	if (unlikely(start > APERTURE_MAX_RANGES)) {
-		start = 0;
-		this_cpu_write(*dom->next_index, 0);
-	}
-
-	boundary_size = mask + 1 ? ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT :
-				   1UL << (BITS_PER_LONG - PAGE_SHIFT);
+	unsigned long pfn = 0;
 
-	for (i = 0; i < APERTURE_MAX_RANGES; ++i) {
-		struct aperture_range *range;
-		int index;
-
-		index = (start + i) % APERTURE_MAX_RANGES;
+	pages = __roundup_pow_of_two(pages);
 
-		range = dom->aperture[index];
-
-		if (!range || range->offset >= dma_mask)
-			continue;
-
-		address = dma_ops_aperture_alloc(dom, range, pages,
-						 dma_mask, boundary_size,
-						 align_mask, first);
-		if (address != -1) {
-			address = range->offset + (address << PAGE_SHIFT);
-			this_cpu_write(*dom->next_index, index);
-			break;
-		}
-	}
-
-	if (address == -1 && first) {
-		first = false;
-		goto again;
-	}
+	if (dma_mask > DMA_BIT_MASK(32))
+		pfn = alloc_iova_fast(&dma_dom->iovad, pages,
+				      IOVA_PFN(DMA_BIT_MASK(32)));
 
-	preempt_enable();
+	if (!pfn)
+		pfn = alloc_iova_fast(&dma_dom->iovad, pages, IOVA_PFN(dma_mask));
 
-	return address;
+	return (pfn << PAGE_SHIFT);
 }
 
-static unsigned long dma_ops_alloc_addresses(struct device *dev,
-					     struct dma_ops_domain *dom,
-					     unsigned int pages,
-					     unsigned long align_mask,
-					     u64 dma_mask)
+static void dma_ops_free_iova(struct dma_ops_domain *dma_dom,
+			      unsigned long address,
+			      unsigned int pages)
 {
-	unsigned long address = -1;
-
-	while (address == -1) {
-		address = dma_ops_area_alloc(dev, dom, pages,
-					     align_mask, dma_mask);
-
-		if (address == -1 && alloc_new_range(dom, false, GFP_ATOMIC))
-			break;
-	}
-
-	if (unlikely(address == -1))
-		address = DMA_ERROR_CODE;
-
-	WARN_ON((address + (PAGE_SIZE*pages)) > dom->aperture_size);
-
-	return address;
-}
-
-/*
- * The address free function.
- *
- * called with domain->lock held
- */
-static void dma_ops_free_addresses(struct dma_ops_domain *dom,
-				   unsigned long address,
-				   unsigned int pages)
-{
-	unsigned i = address >> APERTURE_RANGE_SHIFT;
-	struct aperture_range *range = dom->aperture[i];
-	unsigned long flags;
-
-	BUG_ON(i >= APERTURE_MAX_RANGES || range == NULL);
-
-#ifdef CONFIG_IOMMU_STRESS
-	if (i < 4)
-		return;
-#endif
-
-	if (amd_iommu_unmap_flush) {
-		domain_flush_tlb(&dom->domain);
-		domain_flush_complete(&dom->domain);
-	}
-
-	address = (address % APERTURE_RANGE_SIZE) >> PAGE_SHIFT;
-
-	spin_lock_irqsave(&range->bitmap_lock, flags);
-	if (address + pages > range->next_bit)
-		range->next_bit = address + pages;
-	bitmap_clear(range->bitmap, address, pages);
-	spin_unlock_irqrestore(&range->bitmap_lock, flags);
+	pages = __roundup_pow_of_two(pages);
+	address >>= PAGE_SHIFT;
 
+	free_iova_fast(&dma_dom->iovad, address, pages);
 }
 
 /****************************************************************************
@@ -2583,9 +2448,7 @@ static dma_addr_t __map_single(struct device *dev,
 	if (align)
 		align_mask = (1UL << get_order(size)) - 1;
 
-	address = dma_ops_alloc_addresses(dev, dma_dom, pages, align_mask,
-					  dma_mask);
-
+	address = dma_ops_alloc_iova(dev, dma_dom, pages, dma_mask);
 	if (address == DMA_ERROR_CODE)
 		goto out;
 
@@ -2623,7 +2486,10 @@ out_unmap:
 		iommu_unmap_page(&dma_dom->domain, start, PAGE_SIZE);
 	}
 
-	dma_ops_free_addresses(dma_dom, address, pages);
+	domain_flush_tlb(&dma_dom->domain);
+	domain_flush_complete(&dma_dom->domain);
+
+	dma_ops_free_iova(dma_dom, address, pages);
 
 	return DMA_ERROR_CODE;
 }
@@ -2655,7 +2521,10 @@ static void __unmap_single(struct dma_ops_domain *dma_dom,
 		start += PAGE_SIZE;
 	}
 
-	dma_ops_free_addresses(dma_dom, dma_addr, pages);
+	domain_flush_tlb(&dma_dom->domain);
+	domain_flush_complete(&dma_dom->domain);
+
+	dma_ops_free_iova(dma_dom, dma_addr, pages);
 }
 
 /*
-- 
1.9.1


* [PATCH 09/20] iommu/amd: Remove other remains of old address allocator
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

There are other remains of the old allocator in the code.
Remove them all.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 305 +---------------------------------------------
 1 file changed, 5 insertions(+), 300 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4276c88..fbfe51b 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -133,43 +133,12 @@ static int protection_domain_init(struct protection_domain *domain);
 static void detach_device(struct device *dev);
 
 /*
- * For dynamic growth the aperture size is split into ranges of 128MB of
- * DMA address space each. This struct represents one such range.
- */
-struct aperture_range {
-
-	spinlock_t bitmap_lock;
-
-	/* address allocation bitmap */
-	unsigned long *bitmap;
-	unsigned long offset;
-	unsigned long next_bit;
-
-	/*
-	 * Array of PTE pages for the aperture. In this array we save all the
-	 * leaf pages of the domain page table used for the aperture. This way
-	 * we don't need to walk the page table to find a specific PTE. We can
-	 * just calculate its address in constant time.
-	 */
-	u64 *pte_pages[64];
-};
-
-/*
  * Data container for a dma_ops specific protection domain
  */
 struct dma_ops_domain {
 	/* generic protection domain information */
 	struct protection_domain domain;
 
-	/* size of the aperture for the mappings */
-	unsigned long aperture_size;
-
-	/* aperture index we start searching for free addresses */
-	u32 __percpu *next_index;
-
-	/* address space relevant data */
-	struct aperture_range *aperture[APERTURE_MAX_RANGES];
-
 	/* IOVA RB-Tree */
 	struct iova_domain iovad;
 };
@@ -409,43 +378,6 @@ static bool pdev_pri_erratum(struct pci_dev *pdev, u32 erratum)
 }
 
 /*
- * This function actually applies the mapping to the page table of the
- * dma_ops domain.
- */
-static void alloc_unity_mapping(struct dma_ops_domain *dma_dom,
-				struct unity_map_entry *e)
-{
-	u64 addr;
-
-	for (addr = e->address_start; addr < e->address_end;
-	     addr += PAGE_SIZE) {
-		if (addr < dma_dom->aperture_size)
-			__set_bit(addr >> PAGE_SHIFT,
-				  dma_dom->aperture[0]->bitmap);
-	}
-}
-
-/*
- * Inits the unity mappings required for a specific device
- */
-static void init_unity_mappings_for_device(struct device *dev,
-					   struct dma_ops_domain *dma_dom)
-{
-	struct unity_map_entry *e;
-	int devid;
-
-	devid = get_device_id(dev);
-	if (devid < 0)
-		return;
-
-	list_for_each_entry(e, &amd_iommu_unity_map, list) {
-		if (!(devid >= e->devid_start && devid <= e->devid_end))
-			continue;
-		alloc_unity_mapping(dma_dom, e);
-	}
-}
-
-/*
  * This function checks if the driver got a valid device from the caller to
  * avoid dereferencing invalid pointers.
  */
@@ -486,7 +418,6 @@ static void init_iommu_group(struct device *dev)
 
 	dma_domain = to_pdomain(domain)->priv;
 
-	init_unity_mappings_for_device(dev, dma_domain);
 out:
 	iommu_group_put(group);
 }
@@ -1493,158 +1424,10 @@ static unsigned long iommu_unmap_page(struct protection_domain *dom,
 /****************************************************************************
  *
  * The next functions belong to the address allocator for the dma_ops
- * interface functions. They work like the allocators in the other IOMMU
- * drivers. Its basically a bitmap which marks the allocated pages in
- * the aperture. Maybe it could be enhanced in the future to a more
- * efficient allocator.
+ * interface functions.
  *
  ****************************************************************************/
 
-/*
- * The address allocator core functions.
- *
- * called with domain->lock held
- */
-
-/*
- * Used to reserve address ranges in the aperture (e.g. for exclusion
- * ranges.
- */
-static void dma_ops_reserve_addresses(struct dma_ops_domain *dom,
-				      unsigned long start_page,
-				      unsigned int pages)
-{
-	unsigned int i, last_page = dom->aperture_size >> PAGE_SHIFT;
-
-	if (start_page + pages > last_page)
-		pages = last_page - start_page;
-
-	for (i = start_page; i < start_page + pages; ++i) {
-		int index = i / APERTURE_RANGE_PAGES;
-		int page  = i % APERTURE_RANGE_PAGES;
-		__set_bit(page, dom->aperture[index]->bitmap);
-	}
-}
-
-/*
- * This function is used to add a new aperture range to an existing
- * aperture in case of dma_ops domain allocation or address allocation
- * failure.
- */
-static int alloc_new_range(struct dma_ops_domain *dma_dom,
-			   bool populate, gfp_t gfp)
-{
-	int index = dma_dom->aperture_size >> APERTURE_RANGE_SHIFT;
-	unsigned long i, old_size, pte_pgsize;
-	struct aperture_range *range;
-	struct amd_iommu *iommu;
-	unsigned long flags;
-
-#ifdef CONFIG_IOMMU_STRESS
-	populate = false;
-#endif
-
-	if (index >= APERTURE_MAX_RANGES)
-		return -ENOMEM;
-
-	range = kzalloc(sizeof(struct aperture_range), gfp);
-	if (!range)
-		return -ENOMEM;
-
-	range->bitmap = (void *)get_zeroed_page(gfp);
-	if (!range->bitmap)
-		goto out_free;
-
-	range->offset = dma_dom->aperture_size;
-
-	spin_lock_init(&range->bitmap_lock);
-
-	if (populate) {
-		unsigned long address = dma_dom->aperture_size;
-		int i, num_ptes = APERTURE_RANGE_PAGES / 512;
-		u64 *pte, *pte_page;
-
-		for (i = 0; i < num_ptes; ++i) {
-			pte = alloc_pte(&dma_dom->domain, address, PAGE_SIZE,
-					&pte_page, gfp);
-			if (!pte)
-				goto out_free;
-
-			range->pte_pages[i] = pte_page;
-
-			address += APERTURE_RANGE_SIZE / 64;
-		}
-	}
-
-	spin_lock_irqsave(&dma_dom->domain.lock, flags);
-
-	/* First take the bitmap_lock and then publish the range */
-	spin_lock(&range->bitmap_lock);
-
-	old_size                 = dma_dom->aperture_size;
-	dma_dom->aperture[index] = range;
-	dma_dom->aperture_size  += APERTURE_RANGE_SIZE;
-
-	/* Reserve address range used for MSI messages */
-	if (old_size < MSI_ADDR_BASE_LO &&
-	    dma_dom->aperture_size > MSI_ADDR_BASE_LO) {
-		unsigned long spage;
-		int pages;
-
-		pages = iommu_num_pages(MSI_ADDR_BASE_LO, 0x10000, PAGE_SIZE);
-		spage = MSI_ADDR_BASE_LO >> PAGE_SHIFT;
-
-		dma_ops_reserve_addresses(dma_dom, spage, pages);
-	}
-
-	/* Initialize the exclusion range if necessary */
-	for_each_iommu(iommu) {
-		if (iommu->exclusion_start &&
-		    iommu->exclusion_start >= dma_dom->aperture[index]->offset
-		    && iommu->exclusion_start < dma_dom->aperture_size) {
-			unsigned long startpage;
-			int pages = iommu_num_pages(iommu->exclusion_start,
-						    iommu->exclusion_length,
-						    PAGE_SIZE);
-			startpage = iommu->exclusion_start >> PAGE_SHIFT;
-			dma_ops_reserve_addresses(dma_dom, startpage, pages);
-		}
-	}
-
-	/*
-	 * Check for areas already mapped as present in the new aperture
-	 * range and mark those pages as reserved in the allocator. Such
-	 * mappings may already exist as a result of requested unity
-	 * mappings for devices.
-	 */
-	for (i = dma_dom->aperture[index]->offset;
-	     i < dma_dom->aperture_size;
-	     i += pte_pgsize) {
-		u64 *pte = fetch_pte(&dma_dom->domain, i, &pte_pgsize);
-		if (!pte || !IOMMU_PTE_PRESENT(*pte))
-			continue;
-
-		dma_ops_reserve_addresses(dma_dom, i >> PAGE_SHIFT,
-					  pte_pgsize >> 12);
-	}
-
-	update_domain(&dma_dom->domain);
-
-	spin_unlock(&range->bitmap_lock);
-
-	spin_unlock_irqrestore(&dma_dom->domain.lock, flags);
-
-	return 0;
-
-out_free:
-	update_domain(&dma_dom->domain);
-
-	free_page((unsigned long)range->bitmap);
-
-	kfree(range);
-
-	return -ENOMEM;
-}
 
 static unsigned long dma_ops_alloc_iova(struct device *dev,
 					struct dma_ops_domain *dma_dom,
@@ -1845,46 +1628,18 @@ static void free_gcr3_table(struct protection_domain *domain)
  */
 static void dma_ops_domain_free(struct dma_ops_domain *dom)
 {
-	int i;
-
 	if (!dom)
 		return;
 
-	put_iova_domain(&dom->iovad);
-
-	free_percpu(dom->next_index);
-
 	del_domain_from_list(&dom->domain);
 
-	free_pagetable(&dom->domain);
+	put_iova_domain(&dom->iovad);
 
-	for (i = 0; i < APERTURE_MAX_RANGES; ++i) {
-		if (!dom->aperture[i])
-			continue;
-		free_page((unsigned long)dom->aperture[i]->bitmap);
-		kfree(dom->aperture[i]);
-	}
+	free_pagetable(&dom->domain);
 
 	kfree(dom);
 }
 
-static int dma_ops_domain_alloc_apertures(struct dma_ops_domain *dma_dom,
-					  int max_apertures)
-{
-	int ret, i, apertures;
-
-	apertures = dma_dom->aperture_size >> APERTURE_RANGE_SHIFT;
-	ret       = 0;
-
-	for (i = apertures; i < max_apertures; ++i) {
-		ret = alloc_new_range(dma_dom, false, GFP_KERNEL);
-		if (ret)
-			break;
-	}
-
-	return ret;
-}
-
 /*
  * Allocates a new protection domain usable for the dma_ops functions.
  * It also initializes the page table and the address allocator data
@@ -1893,7 +1648,6 @@ static int dma_ops_domain_alloc_apertures(struct dma_ops_domain *dma_dom,
 static struct dma_ops_domain *dma_ops_domain_alloc(void)
 {
 	struct dma_ops_domain *dma_dom;
-	int cpu;
 
 	dma_dom = kzalloc(sizeof(struct dma_ops_domain), GFP_KERNEL);
 	if (!dma_dom)
@@ -1902,10 +1656,6 @@ static struct dma_ops_domain *dma_ops_domain_alloc(void)
 	if (protection_domain_init(&dma_dom->domain))
 		goto free_dma_dom;
 
-	dma_dom->next_index = alloc_percpu(u32);
-	if (!dma_dom->next_index)
-		goto free_dma_dom;
-
 	dma_dom->domain.mode = PAGE_MODE_2_LEVEL;
 	dma_dom->domain.pt_root = (void *)get_zeroed_page(GFP_KERNEL);
 	dma_dom->domain.flags = PD_DMA_OPS_MASK;
@@ -1913,26 +1663,14 @@ static struct dma_ops_domain *dma_ops_domain_alloc(void)
 	if (!dma_dom->domain.pt_root)
 		goto free_dma_dom;
 
-	add_domain_to_list(&dma_dom->domain);
-
-	if (alloc_new_range(dma_dom, true, GFP_KERNEL))
-		goto free_dma_dom;
-
-	/*
-	 * mark the first page as allocated so we never return 0 as
-	 * a valid dma-address. So we can use 0 as error value
-	 */
-	dma_dom->aperture[0]->bitmap[0] = 1;
-
-	for_each_possible_cpu(cpu)
-		*per_cpu_ptr(dma_dom->next_index, cpu) = 0;
-
 	init_iova_domain(&dma_dom->iovad, PAGE_SIZE,
 			 IOVA_START_PFN, DMA_32BIT_PFN);
 
 	/* Initialize reserved ranges */
 	copy_reserved_iova(&reserved_iova_ranges, &dma_dom->iovad);
 
+	add_domain_to_list(&dma_dom->domain);
+
 	return dma_dom;
 
 free_dma_dom:
@@ -2507,10 +2245,6 @@ static void __unmap_single(struct dma_ops_domain *dma_dom,
 	dma_addr_t i, start;
 	unsigned int pages;
 
-	if ((dma_addr == DMA_ERROR_CODE) ||
-	    (dma_addr + size > dma_dom->aperture_size))
-		return;
-
 	flush_addr = dma_addr;
 	pages = iommu_num_pages(dma_addr, size, PAGE_SIZE);
 	dma_addr &= PAGE_MASK;
@@ -2724,34 +2458,6 @@ static int amd_iommu_dma_supported(struct device *dev, u64 mask)
 	return check_device(dev);
 }
 
-static int set_dma_mask(struct device *dev, u64 mask)
-{
-	struct protection_domain *domain;
-	int max_apertures = 1;
-
-	domain = get_domain(dev);
-	if (IS_ERR(domain))
-		return PTR_ERR(domain);
-
-	if (mask == DMA_BIT_MASK(64))
-		max_apertures = 8;
-	else if (mask > DMA_BIT_MASK(32))
-		max_apertures = 4;
-
-	/*
-	 * To prevent lock contention it doesn't make sense to allocate more
-	 * apertures than online cpus
-	 */
-	if (max_apertures > num_online_cpus())
-		max_apertures = num_online_cpus();
-
-	if (dma_ops_domain_alloc_apertures(domain->priv, max_apertures))
-		dev_err(dev, "Can't allocate %d iommu apertures\n",
-			max_apertures);
-
-	return 0;
-}
-
 static struct dma_map_ops amd_iommu_dma_ops = {
 	.alloc		= alloc_coherent,
 	.free		= free_coherent,
@@ -2760,7 +2466,6 @@ static struct dma_map_ops amd_iommu_dma_ops = {
 	.map_sg		= map_sg,
 	.unmap_sg	= unmap_sg,
 	.dma_supported	= amd_iommu_dma_supported,
-	.set_dma_mask	= set_dma_mask,
 };
 
 static int init_reserved_iova_ranges(void)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 10/20] iommu/amd: Remove align-parameter from __map_single()
@ 2016-07-08 11:45   ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

This parameter is not required anymore because the
IOVA allocations are always aligned to their size.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index fbfe51b..b91843c 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2170,22 +2170,17 @@ static dma_addr_t __map_single(struct device *dev,
 			       phys_addr_t paddr,
 			       size_t size,
 			       int direction,
-			       bool align,
 			       u64 dma_mask)
 {
 	dma_addr_t offset = paddr & ~PAGE_MASK;
 	dma_addr_t address, start, ret;
 	unsigned int pages;
-	unsigned long align_mask = 0;
 	int prot = 0;
 	int i;
 
 	pages = iommu_num_pages(paddr, size, PAGE_SIZE);
 	paddr &= PAGE_MASK;
 
-	if (align)
-		align_mask = (1UL << get_order(size)) - 1;
-
 	address = dma_ops_alloc_iova(dev, dma_dom, pages, dma_mask);
 	if (address == DMA_ERROR_CODE)
 		goto out;
@@ -2281,8 +2276,7 @@ static dma_addr_t map_page(struct device *dev, struct page *page,
 
 	dma_mask = *dev->dma_mask;
 
-	return __map_single(dev, domain->priv, paddr, size, dir, false,
-			    dma_mask);
+	return __map_single(dev, domain->priv, paddr, size, dir, dma_mask);
 }
 
 /*
@@ -2325,8 +2319,7 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
 		paddr = sg_phys(s);
 
 		s->dma_address = __map_single(dev, domain->priv,
-					      paddr, s->length, dir, false,
-					      dma_mask);
+					      paddr, s->length, dir, dma_mask);
 
 		if (s->dma_address) {
 			s->dma_length = s->length;
@@ -2410,7 +2403,7 @@ static void *alloc_coherent(struct device *dev, size_t size,
 		dma_mask = *dev->dma_mask;
 
 	*dma_addr = __map_single(dev, domain->priv, page_to_phys(page),
-				 size, DMA_BIDIRECTIONAL, true, dma_mask);
+				 size, DMA_BIDIRECTIONAL, dma_mask);
 
 	if (*dma_addr == DMA_ERROR_CODE)
 		goto out_free;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread
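
An aside on why the align parameter is redundant: the old code aligned
the returned address to 1 << get_order(size) pages, while the generic
allocator (per the commit message) hands out ranges aligned to their
own size after the page count has been rounded up to a power of two,
and those two alignments are the same thing. A quick userspace sketch
of that equivalence; the helpers below are simplified stand-ins for
the kernel's get_order() and roundup_pow_of_two(), not driver code:

	#include <assert.h>
	#include <stdio.h>

	#define PAGE_SHIFT 12
	#define PAGE_SIZE  (1UL << PAGE_SHIFT)

	/* stand-in for roundup_pow_of_two() */
	static unsigned long rup_pow2(unsigned long x)
	{
		unsigned long r = 1;

		while (r < x)
			r <<= 1;
		return r;
	}

	/* stand-in for get_order(): smallest order with (PAGE_SIZE << order) >= size */
	static unsigned long order_of(unsigned long size)
	{
		unsigned long order = 0;

		while ((PAGE_SIZE << order) < size)
			order++;
		return order;
	}

	int main(void)
	{
		unsigned long sizes[] = { 0x1000, 0x3000, 0x10000, 0x12345 };
		int i;

		for (i = 0; i < 4; i++) {
			unsigned long pages = (sizes[i] + PAGE_SIZE - 1) >> PAGE_SHIFT;

			/* old __map_single(): alignment derived from get_order(size) */
			unsigned long old_align = 1UL << order_of(sizes[i]);
			/* new allocator: page count rounded up to a power of two */
			unsigned long new_align = rup_pow2(pages);

			printf("size=%#lx old=%lu new=%lu pages\n",
			       sizes[i], old_align, new_align);
			assert(old_align == new_align);
		}
		return 0;
	}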

* [PATCH 11/20] iommu/amd: Set up data structures for flush queue
@ 2016-07-08 11:45   ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

The flush queue is the equivalent of the deferred flushing in
the Intel VT-d driver. This patch sets up the data structures
needed for it.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b91843c..4e27c57 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -88,6 +88,22 @@ LIST_HEAD(ioapic_map);
 LIST_HEAD(hpet_map);
 LIST_HEAD(acpihid_map);
 
+#define FLUSH_QUEUE_SIZE 256
+
+struct flush_queue_entry {
+	unsigned long iova_pfn;
+	unsigned long pages;
+	struct dma_ops_domain *dma_dom;
+};
+
+struct flush_queue {
+	spinlock_t lock;
+	unsigned next;
+	struct flush_queue_entry *entries;
+};
+
+DEFINE_PER_CPU(struct flush_queue, flush_queue);
+
 /*
  * Domain for untranslated devices - only allocated
  * if iommu=pt passed on kernel cmd line.
@@ -2516,7 +2532,7 @@ static int init_reserved_iova_ranges(void)
 
 int __init amd_iommu_init_api(void)
 {
-	int ret, err = 0;
+	int ret, cpu, err = 0;
 
 	ret = iova_cache_get();
 	if (ret)
@@ -2526,6 +2542,18 @@ int __init amd_iommu_init_api(void)
 	if (ret)
 		return ret;
 
+	for_each_possible_cpu(cpu) {
+		struct flush_queue *queue = per_cpu_ptr(&flush_queue, cpu);
+
+		queue->entries = kzalloc(FLUSH_QUEUE_SIZE *
+					 sizeof(*queue->entries),
+					 GFP_KERNEL);
+		if (!queue->entries)
+			goto out_put_iova;
+
+		spin_lock_init(&queue->lock);
+	}
+
 	err = bus_set_iommu(&pci_bus_type, &amd_iommu_ops);
 	if (err)
 		return err;
@@ -2535,6 +2563,15 @@ int __init amd_iommu_init_api(void)
 		return err;
 #endif
 	return 0;
+
+out_put_iova:
+	for_each_possible_cpu(cpu) {
+		struct flush_queue *queue = per_cpu_ptr(&flush_queue, cpu);
+
+		kfree(queue->entries);
+	}
+
+	return -ENOMEM;
 }
 
 int __init amd_iommu_init_dma_ops(void)
@@ -2557,6 +2594,7 @@ int __init amd_iommu_init_dma_ops(void)
 		pr_info("AMD-Vi: Lazy IO/TLB flushing enabled\n");
 
 	return 0;
+
 }
 
 /*****************************************************************************
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread
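
A note on the design choice: the queue is per-CPU so that the unmap
fast path only ever touches the local CPU's entries, but each queue
still carries its own spinlock because later patches walk all queues
from foreign CPUs (the flush timer in patch 14 and the domain-free
path in patch 19). A condensed preview of the two access patterns,
simplified from queue_add() and queue_flush_timeout() later in this
series, with error handling and the actual flushing omitted:

	/* Fast path: current CPU's queue only */
	static void defer_iova_free_sketch(struct dma_ops_domain *dma_dom,
					   unsigned long iova_pfn,
					   unsigned long pages)
	{
		struct flush_queue *queue;
		unsigned long flags;

		queue = get_cpu_ptr(&flush_queue);	/* disables preemption */
		spin_lock_irqsave(&queue->lock, flags);

		if (queue->next < FLUSH_QUEUE_SIZE) {
			struct flush_queue_entry *entry;

			entry = &queue->entries[queue->next++];
			entry->iova_pfn = iova_pfn;
			entry->pages    = pages;
			entry->dma_dom  = dma_dom;
		}

		spin_unlock_irqrestore(&queue->lock, flags);
		put_cpu_ptr(&flush_queue);
	}

	/* Slow path: drain every CPU's queue, possibly from another CPU */
	static void drain_all_queues_sketch(void)
	{
		int cpu;

		for_each_possible_cpu(cpu) {
			struct flush_queue *queue = per_cpu_ptr(&flush_queue, cpu);
			unsigned long flags;

			spin_lock_irqsave(&queue->lock, flags);
			/* ... flush IOTLBs, free_iova_fast() the entries ... */
			queue->next = 0;
			spin_unlock_irqrestore(&queue->lock, flags);
		}
	}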

* [PATCH 12/20] iommu/amd: Allow NULL pointer parameter for domain_flush_complete()
@ 2016-07-08 11:45   ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

If domain == NULL is passed to the function, it will queue a
completion-wait command on all IOMMUs in the system.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4e27c57..7b8e1f8 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1185,7 +1185,7 @@ static void domain_flush_complete(struct protection_domain *domain)
 	int i;
 
 	for (i = 0; i < amd_iommus_present; ++i) {
-		if (!domain->dev_iommu[i])
+		if (domain && !domain->dev_iommu[i])
 			continue;
 
 		/*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread
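
A short usage sketch of the relaxed contract, as the flush-queue code
in the following patches uses it:

	/* Wait for completion on the IOMMUs this domain is attached to */
	domain_flush_complete(domain);

	/* Wait for completion on every IOMMU in the system */
	domain_flush_complete(NULL);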

* [PATCH 13/20] iommu/amd: Implement flush queue
@ 2016-07-08 11:45   ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

With the flush queue the IOMMU TLBs are no longer flushed at
every dma-ops unmap operation. The unmapped ranges are queued
and flushed together once the queue is full. This makes unmap
operations a lot faster (on average) and restores the
performance of the old address allocator.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 71 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 67 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 7b8e1f8..a3e09cd 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2128,6 +2128,66 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev)
  *
  *****************************************************************************/
 
+static void __queue_flush(struct flush_queue *queue)
+{
+	struct protection_domain *domain;
+	unsigned long flags;
+	int idx;
+
+	/* First flush TLB of all known domains */
+	spin_lock_irqsave(&amd_iommu_pd_lock, flags);
+	list_for_each_entry(domain, &amd_iommu_pd_list, list)
+		domain_flush_tlb(domain);
+	spin_unlock_irqrestore(&amd_iommu_pd_lock, flags);
+
+	/* Wait until flushes have completed */
+	domain_flush_complete(NULL);
+
+	for (idx = 0; idx < queue->next; ++idx) {
+		struct flush_queue_entry *entry;
+
+		entry = queue->entries + idx;
+
+		free_iova_fast(&entry->dma_dom->iovad,
+				entry->iova_pfn,
+				entry->pages);
+
+		/* Not really necessary, just to make sure we catch any bugs */
+		entry->dma_dom = NULL;
+	}
+
+	queue->next = 0;
+}
+
+static void queue_add(struct dma_ops_domain *dma_dom,
+		      unsigned long address, unsigned long pages)
+{
+	struct flush_queue_entry *entry;
+	struct flush_queue *queue;
+	unsigned long flags;
+	int idx;
+
+	pages     = __roundup_pow_of_two(pages);
+	address >>= PAGE_SHIFT;
+
+	queue = get_cpu_ptr(&flush_queue);
+	spin_lock_irqsave(&queue->lock, flags);
+
+	if (queue->next == FLUSH_QUEUE_SIZE)
+		__queue_flush(queue);
+
+	idx   = queue->next++;
+	entry = queue->entries + idx;
+
+	entry->iova_pfn = address;
+	entry->pages    = pages;
+	entry->dma_dom  = dma_dom;
+
+	spin_unlock_irqrestore(&queue->lock, flags);
+	put_cpu_ptr(&flush_queue);
+}
+
+
 /*
  * In the dma_ops path we only have the struct device. This function
  * finds the corresponding IOMMU, the protection domain and the
@@ -2266,10 +2326,13 @@ static void __unmap_single(struct dma_ops_domain *dma_dom,
 		start += PAGE_SIZE;
 	}
 
-	domain_flush_tlb(&dma_dom->domain);
-	domain_flush_complete(&dma_dom->domain);
-
-	dma_ops_free_iova(dma_dom, dma_addr, pages);
+	if (amd_iommu_unmap_flush) {
+		dma_ops_free_iova(dma_dom, dma_addr, pages);
+		domain_flush_tlb(&dma_dom->domain);
+		domain_flush_complete(&dma_dom->domain);
+	} else {
+		queue_add(dma_dom, dma_addr, pages);
+	}
 }
 
 /*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread
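
The performance argument boils down to batching: instead of one IOTLB
flush per unmap, the lazy path pays roughly one flush per
FLUSH_QUEUE_SIZE (256) unmaps on each CPU, and only defers the cheap
free_iova_fast() calls. A tiny standalone model of that amortization
(hypothetical counts, purely illustrative; not driver code):

	#include <stdio.h>

	#define FLUSH_QUEUE_SIZE 256

	int main(void)
	{
		unsigned long unmaps = 1000000, i;
		unsigned long strict = 0, lazy = 0, queued = 0;

		for (i = 0; i < unmaps; i++) {
			strict++;		/* old path: flush on every unmap */

			if (++queued == FLUSH_QUEUE_SIZE) {
				lazy++;		/* new path: flush when the queue fills */
				queued = 0;
			}
		}
		if (queued)		/* the timer in the next patch covers the rest */
			lazy++;

		printf("strict: %lu flushes, lazy: %lu flushes\n", strict, lazy);
		return 0;
	}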

* [PATCH 14/20] iommu/amd: Implement timeout to flush unmap queues
@ 2016-07-08 11:45   ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

In case the queue doesn't fill up, flush the TLB within
roughly 10ms of the unmap to make sure stale IOTLB entries
are cleaned up.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index a3e09cd..1bb003e 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -104,6 +104,9 @@ struct flush_queue {
 
 DEFINE_PER_CPU(struct flush_queue, flush_queue);
 
+static atomic_t queue_timer_on;
+static struct timer_list queue_timer;
+
 /*
  * Domain for untranslated devices - only allocated
  * if iommu=pt passed on kernel cmd line.
@@ -2159,6 +2162,24 @@ static void __queue_flush(struct flush_queue *queue)
 	queue->next = 0;
 }
 
+void queue_flush_timeout(unsigned long unsused)
+{
+	int cpu;
+
+	atomic_set(&queue_timer_on, 0);
+
+	for_each_possible_cpu(cpu) {
+		struct flush_queue *queue;
+		unsigned long flags;
+
+		queue = per_cpu_ptr(&flush_queue, cpu);
+		spin_lock_irqsave(&queue->lock, flags);
+		if (queue->next > 0)
+			__queue_flush(queue);
+		spin_unlock_irqrestore(&queue->lock, flags);
+	}
+}
+
 static void queue_add(struct dma_ops_domain *dma_dom,
 		      unsigned long address, unsigned long pages)
 {
@@ -2184,6 +2205,10 @@ static void queue_add(struct dma_ops_domain *dma_dom,
 	entry->dma_dom  = dma_dom;
 
 	spin_unlock_irqrestore(&queue->lock, flags);
+
+	if (atomic_cmpxchg(&queue_timer_on, 0, 1) == 0)
+		mod_timer(&queue_timer, jiffies + msecs_to_jiffies(10));
+
 	put_cpu_ptr(&flush_queue);
 }
 
@@ -2639,6 +2664,9 @@ out_put_iova:
 
 int __init amd_iommu_init_dma_ops(void)
 {
+	setup_timer(&queue_timer, queue_flush_timeout, 0);
+	atomic_set(&queue_timer_on, 0);
+
 	swiotlb        = iommu_pass_through ? 1 : 0;
 	iommu_detected = 1;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread
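
Two details worth noting in the timer handling: the timer is armed only
on a 0 -> 1 transition of queue_timer_on, so a burst of unmaps does not
call mod_timer() over and over, and the callback clears the flag before
draining, presumably so that an unmap racing with the drain re-arms the
timer for anything it queues afterwards. Condensed from the hunks above
(not new logic):

	/* queue_add(): arm the timer only when it is not already pending */
	if (atomic_cmpxchg(&queue_timer_on, 0, 1) == 0)
		mod_timer(&queue_timer, jiffies + msecs_to_jiffies(10));

	/* queue_flush_timeout(): allow re-arming first, then drain all queues */
	atomic_set(&queue_timer_on, 0);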

* [PATCH 15/20] iommu/amd: Introduce dir2prot() helper
@ 2016-07-08 11:45   ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

This function converts a dma_data_direction value to IOMMU
protection flags. The conversion is needed in multiple places
in the code, so factoring it out saves some code.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 1bb003e..78b278b 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2260,6 +2260,17 @@ static void update_domain(struct protection_domain *domain)
 	domain->updated = false;
 }
 
+static int dir2prot(enum dma_data_direction direction)
+{
+	if (direction == DMA_TO_DEVICE)
+		return IOMMU_PROT_IR;
+	else if (direction == DMA_FROM_DEVICE)
+		return IOMMU_PROT_IW;
+	else if (direction == DMA_BIDIRECTIONAL)
+		return IOMMU_PROT_IW | IOMMU_PROT_IR;
+	else
+		return 0;
+}
 /*
  * This function contains common code for mapping of a physically
  * contiguous memory region into DMA address space. It is used by all
@@ -2270,7 +2281,7 @@ static dma_addr_t __map_single(struct device *dev,
 			       struct dma_ops_domain *dma_dom,
 			       phys_addr_t paddr,
 			       size_t size,
-			       int direction,
+			       enum dma_data_direction direction,
 			       u64 dma_mask)
 {
 	dma_addr_t offset = paddr & ~PAGE_MASK;
@@ -2286,12 +2297,7 @@ static dma_addr_t __map_single(struct device *dev,
 	if (address == DMA_ERROR_CODE)
 		goto out;
 
-	if (direction == DMA_TO_DEVICE)
-		prot = IOMMU_PROT_IR;
-	else if (direction == DMA_FROM_DEVICE)
-		prot = IOMMU_PROT_IW;
-	else if (direction == DMA_BIDIRECTIONAL)
-		prot = IOMMU_PROT_IW | IOMMU_PROT_IR;
+	prot = dir2prot(direction);
 
 	start = address;
 	for (i = 0; i < pages; ++i) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread
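
For quick reference, the mapping spelled out as calls (the values
follow directly from the function body above; DMA_TO_DEVICE means the
device reads the buffer, DMA_FROM_DEVICE means the device writes it):

	prot = dir2prot(DMA_TO_DEVICE);      /* IOMMU_PROT_IR               */
	prot = dir2prot(DMA_FROM_DEVICE);    /* IOMMU_PROT_IW               */
	prot = dir2prot(DMA_BIDIRECTIONAL);  /* IOMMU_PROT_IR|IOMMU_PROT_IW */
	prot = dir2prot(DMA_NONE);           /* 0, no access                */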

* [PATCH 16/20] iommu/amd: Optimize map_sg and unmap_sg
  2016-07-08 11:44 ` Joerg Roedel
                   ` (15 preceding siblings ...)
  (?)
@ 2016-07-08 11:45 ` Joerg Roedel
  2016-07-12 11:33   ` Robin Murphy
  -1 siblings, 1 reply; 55+ messages in thread
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Optimize these functions so that they need only one call
into the address allocator. This also saves a couple of
IOTLB flushes in the unmap_sg path.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 77 ++++++++++++++++++++++++++++++++---------------
 1 file changed, 52 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 78b278b..e5f8e7f 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2409,45 +2409,70 @@ static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
  * lists).
  */
 static int map_sg(struct device *dev, struct scatterlist *sglist,
-		  int nelems, enum dma_data_direction dir,
+		  int nelems, enum dma_data_direction direction,
 		  struct dma_attrs *attrs)
 {
+	int mapped_pages = 0, npages = 0, prot = 0, i;
+	unsigned long start_addr, address;
 	struct protection_domain *domain;
-	int i;
+	struct dma_ops_domain *dma_dom;
 	struct scatterlist *s;
-	phys_addr_t paddr;
-	int mapped_elems = 0;
 	u64 dma_mask;
 
 	domain = get_domain(dev);
 	if (IS_ERR(domain))
 		return 0;
 
+	dma_dom  = domain->priv;
 	dma_mask = *dev->dma_mask;
 
+	for_each_sg(sglist, s, nelems, i)
+		npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
+
+	address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
+	if (address == DMA_ERROR_CODE)
+		goto out_err;
+
+	start_addr = address;
+	prot       = dir2prot(direction);
+
 	for_each_sg(sglist, s, nelems, i) {
-		paddr = sg_phys(s);
+		int j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
+
+		for (j = 0; j < pages; ++j) {
+			unsigned long bus_addr, phys_addr;
+			int ret;
 
-		s->dma_address = __map_single(dev, domain->priv,
-					      paddr, s->length, dir, dma_mask);
+			bus_addr  = address + (j << PAGE_SHIFT);
+			phys_addr = (sg_phys(s) & PAGE_MASK) + (j << PAGE_SHIFT);
+			ret = iommu_map_page(domain, bus_addr, phys_addr, PAGE_SIZE, prot, GFP_ATOMIC);
+			if (ret)
+				goto out_unmap;
+
+			mapped_pages += 1;
+		}
 
-		if (s->dma_address) {
-			s->dma_length = s->length;
-			mapped_elems++;
-		} else
-			goto unmap;
+		s->dma_address = address + s->offset;
+		s->dma_length  = s->length;
+		address += pages << PAGE_SHIFT;
 	}
 
-	return mapped_elems;
+	return nelems;
 
-unmap:
-	for_each_sg(sglist, s, mapped_elems, i) {
-		if (s->dma_address)
-			__unmap_single(domain->priv, s->dma_address,
-				       s->dma_length, dir);
-		s->dma_address = s->dma_length = 0;
+
+out_unmap:
+	pr_err("%s: IOMMU mapping error in map_sg (io-pages: %d)\n",
+	       dev_name(dev), npages);
+
+	for (i = 0; i < mapped_pages; ++i) {
+		iommu_unmap_page(domain,
+				 start_addr + (i << PAGE_SHIFT),
+				 PAGE_SIZE);
 	}
 
+	free_iova_fast(&dma_dom->iovad, start_addr, npages);
+
+out_err:
 	return 0;
 }
 
@@ -2460,18 +2485,20 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
 		     struct dma_attrs *attrs)
 {
 	struct protection_domain *domain;
+	unsigned long startaddr;
 	struct scatterlist *s;
-	int i;
+	int i,npages = 0;
 
 	domain = get_domain(dev);
 	if (IS_ERR(domain))
 		return;
 
-	for_each_sg(sglist, s, nelems, i) {
-		__unmap_single(domain->priv, s->dma_address,
-			       s->dma_length, dir);
-		s->dma_address = s->dma_length = 0;
-	}
+	for_each_sg(sglist, s, nelems, i)
+		npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
+
+	startaddr = sg_dma_address(sglist) & PAGE_MASK;
+
+	__unmap_single(domain->priv, startaddr, npages << PAGE_SHIFT, dir);
 }
 
 /*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread
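
The structural change in map_sg(): size the whole scatterlist first,
grab a single IOVA range, then lay the segments out back to back
inside it, so the allocator (and, on unmap, the IOTLB) is touched once
per list rather than once per segment. A stripped-down sketch of that
shape, using only helpers visible in the diff, with the per-page
iommu_map_page() loop and all error handling elided:

	static int map_sg_shape_sketch(struct device *dev,
				       struct dma_ops_domain *dma_dom,
				       struct scatterlist *sglist, int nelems,
				       u64 dma_mask)
	{
		unsigned long address, npages = 0;
		struct scatterlist *s;
		int i;

		/* Pass 1: one allocation for the whole list */
		for_each_sg(sglist, s, nelems, i)
			npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);

		address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);

		/* Pass 2: map each segment at consecutive bus addresses */
		for_each_sg(sglist, s, nelems, i) {
			unsigned long pages;

			pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);

			/* ... iommu_map_page() each page, as in the real patch ... */

			s->dma_address = address + s->offset;
			s->dma_length  = s->length;
			address       += pages << PAGE_SHIFT;
		}

		return nelems;
	}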

* [PATCH 17/20] iommu/amd: Use dev_data->domain in get_domain()
@ 2016-07-08 11:45   ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Using the cached value is much more efficient than calling
into the IOMMU core code.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index e5f8e7f..4b17148 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2223,16 +2223,11 @@ static void queue_add(struct dma_ops_domain *dma_dom,
 static struct protection_domain *get_domain(struct device *dev)
 {
 	struct protection_domain *domain;
-	struct iommu_domain *io_domain;
 
 	if (!check_device(dev))
 		return ERR_PTR(-EINVAL);
 
-	io_domain = iommu_get_domain_for_dev(dev);
-	if (!io_domain)
-		return NULL;
-
-	domain = to_pdomain(io_domain);
+	domain = get_dev_data(dev)->domain;
 	if (!dma_ops_domain(domain))
 		return ERR_PTR(-EBUSY);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 18/20] iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back
  2016-07-08 11:44 ` Joerg Roedel
                   ` (17 preceding siblings ...)
  (?)
@ 2016-07-08 11:45 ` Joerg Roedel
  -1 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

This domain type is not yet handled in the
iommu_ops->domain_free() call-back. Fix that.

Fixes: 0bb6e243d7fb ('iommu/amd: Support IOMMU_DOMAIN_DMA type allocation')
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4b17148..dc3dbb4 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2836,9 +2836,7 @@ static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
 static void amd_iommu_domain_free(struct iommu_domain *dom)
 {
 	struct protection_domain *domain;
-
-	if (!dom)
-		return;
+	struct dma_ops_domain *dma_dom;
 
 	domain = to_pdomain(dom);
 
@@ -2847,13 +2845,24 @@ static void amd_iommu_domain_free(struct iommu_domain *dom)
 
 	BUG_ON(domain->dev_cnt != 0);
 
-	if (domain->mode != PAGE_MODE_NONE)
-		free_pagetable(domain);
+	if (!dom)
+		return;
 
-	if (domain->flags & PD_IOMMUV2_MASK)
-		free_gcr3_table(domain);
+	switch (dom->type) {
+	case IOMMU_DOMAIN_DMA:
+		dma_dom = domain->priv;
+		dma_ops_domain_free(dma_dom);
+		break;
+	default:
+		if (domain->mode != PAGE_MODE_NONE)
+			free_pagetable(domain);
 
-	protection_domain_free(domain);
+		if (domain->flags & PD_IOMMUV2_MASK)
+			free_gcr3_table(domain);
+
+		protection_domain_free(domain);
+		break;
+	}
 }
 
 static void amd_iommu_detach_device(struct iommu_domain *dom,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 19/20] iommu/amd: Flush iova queue before releasing dma_ops_domain
  2016-07-08 11:44 ` Joerg Roedel
                   ` (18 preceding siblings ...)
  (?)
@ 2016-07-08 11:45 ` Joerg Roedel
  -1 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Before a dma_ops_domain can be freed, we need to make sure
it is no longer referenced by the flush queue, so empty the
queue before freeing the domain.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index dc3dbb4..4052997 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2162,12 +2162,10 @@ static void __queue_flush(struct flush_queue *queue)
 	queue->next = 0;
 }
 
-void queue_flush_timeout(unsigned long unsused)
+static void queue_flush_all(void)
 {
 	int cpu;
 
-	atomic_set(&queue_timer_on, 0);
-
 	for_each_possible_cpu(cpu) {
 		struct flush_queue *queue;
 		unsigned long flags;
@@ -2180,6 +2178,12 @@ void queue_flush_timeout(unsigned long unsused)
 	}
 }
 
+static void queue_flush_timeout(unsigned long unsused)
+{
+	atomic_set(&queue_timer_on, 0);
+	queue_flush_all();
+}
+
 static void queue_add(struct dma_ops_domain *dma_dom,
 		      unsigned long address, unsigned long pages)
 {
@@ -2850,6 +2854,13 @@ static void amd_iommu_domain_free(struct iommu_domain *dom)
 
 	switch (dom->type) {
 	case IOMMU_DOMAIN_DMA:
+		/*
+		 * First make sure the domain is no longer referenced from the
+		 * flush queue
+		 */
+		queue_flush_all();
+
+		/* Now release the domain */
 		dma_dom = domain->priv;
 		dma_ops_domain_free(dma_dom);
 		break;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 20/20] iommu/amd: Use container_of to get dma_ops_domain
  2016-07-08 11:44 ` Joerg Roedel
                   ` (19 preceding siblings ...)
  (?)
@ 2016-07-08 11:45 ` Joerg Roedel
  -1 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-08 11:45 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Suravee Suthikulpanit, Vincent.Wan, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

This is better than storing an extra pointer in struct
protection_domain, because this pointer can now be removed
from the struct.
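
For reference, a rough sketch of the embedding that container_of() relies
on here; only the two members touched by this series are shown, the real
struct has more fields:

	struct dma_ops_domain {
		/* generic protection domain, embedded rather than pointed to */
		struct protection_domain domain;
		/* per-domain state for the generic IOVA allocator */
		struct iova_domain iovad;
		/* (further members omitted) */
	};

Given just the embedded protection_domain, to_dma_ops_domain() gets back
to the containing dma_ops_domain with pointer arithmetic, which is what
makes the old ->priv back-pointer redundant.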

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c       | 38 +++++++++++++++++++++++++++-----------
 drivers/iommu/amd_iommu_types.h |  1 -
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4052997..945f0c5 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -230,6 +230,12 @@ static struct protection_domain *to_pdomain(struct iommu_domain *dom)
 	return container_of(dom, struct protection_domain, domain);
 }
 
+static struct dma_ops_domain* to_dma_ops_domain(struct protection_domain *domain)
+{
+	BUG_ON(domain->flags != PD_DMA_OPS_MASK);
+	return container_of(domain, struct dma_ops_domain, domain);
+}
+
 static struct iommu_dev_data *alloc_dev_data(u16 devid)
 {
 	struct iommu_dev_data *dev_data;
@@ -435,7 +441,7 @@ static void init_iommu_group(struct device *dev)
 	if (!domain)
 		goto out;
 
-	dma_domain = to_pdomain(domain)->priv;
+	dma_domain = to_dma_ops_domain(to_pdomain(domain));
 
 out:
 	iommu_group_put(group);
@@ -1678,7 +1684,6 @@ static struct dma_ops_domain *dma_ops_domain_alloc(void)
 	dma_dom->domain.mode = PAGE_MODE_2_LEVEL;
 	dma_dom->domain.pt_root = (void *)get_zeroed_page(GFP_KERNEL);
 	dma_dom->domain.flags = PD_DMA_OPS_MASK;
-	dma_dom->domain.priv = dma_dom;
 	if (!dma_dom->domain.pt_root)
 		goto free_dma_dom;
 
@@ -2375,6 +2380,7 @@ static dma_addr_t map_page(struct device *dev, struct page *page,
 {
 	phys_addr_t paddr = page_to_phys(page) + offset;
 	struct protection_domain *domain;
+	struct dma_ops_domain *dma_dom;
 	u64 dma_mask;
 
 	domain = get_domain(dev);
@@ -2384,8 +2390,9 @@ static dma_addr_t map_page(struct device *dev, struct page *page,
 		return DMA_ERROR_CODE;
 
 	dma_mask = *dev->dma_mask;
+	dma_dom = to_dma_ops_domain(domain);
 
-	return __map_single(dev, domain->priv, paddr, size, dir, dma_mask);
+	return __map_single(dev, dma_dom, paddr, size, dir, dma_mask);
 }
 
 /*
@@ -2395,12 +2402,15 @@ static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
 		       enum dma_data_direction dir, struct dma_attrs *attrs)
 {
 	struct protection_domain *domain;
+	struct dma_ops_domain *dma_dom;
 
 	domain = get_domain(dev);
 	if (IS_ERR(domain))
 		return;
 
-	__unmap_single(domain->priv, dma_addr, size, dir);
+	dma_dom = to_dma_ops_domain(domain);
+
+	__unmap_single(dma_dom, dma_addr, size, dir);
 }
 
 /*
@@ -2422,7 +2432,7 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
 	if (IS_ERR(domain))
 		return 0;
 
-	dma_dom  = domain->priv;
+	dma_dom  = to_dma_ops_domain(domain);
 	dma_mask = *dev->dma_mask;
 
 	for_each_sg(sglist, s, nelems, i)
@@ -2484,6 +2494,7 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
 		     struct dma_attrs *attrs)
 {
 	struct protection_domain *domain;
+	struct dma_ops_domain *dma_dom;
 	unsigned long startaddr;
 	struct scatterlist *s;
 	int i,npages = 0;
@@ -2496,8 +2507,9 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
 		npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
 
 	startaddr = sg_dma_address(sglist) & PAGE_MASK;
+	dma_dom   = to_dma_ops_domain(domain);
 
-	__unmap_single(domain->priv, startaddr, npages << PAGE_SHIFT, dir);
+	__unmap_single(dma_dom, startaddr, npages << PAGE_SHIFT, dir);
 }
 
 /*
@@ -2509,6 +2521,7 @@ static void *alloc_coherent(struct device *dev, size_t size,
 {
 	u64 dma_mask = dev->coherent_dma_mask;
 	struct protection_domain *domain;
+	struct dma_ops_domain *dma_dom;
 	struct page *page;
 
 	domain = get_domain(dev);
@@ -2519,6 +2532,7 @@ static void *alloc_coherent(struct device *dev, size_t size,
 	} else if (IS_ERR(domain))
 		return NULL;
 
+	dma_dom   = to_dma_ops_domain(domain);
 	size	  = PAGE_ALIGN(size);
 	dma_mask  = dev->coherent_dma_mask;
 	flag     &= ~(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32);
@@ -2538,7 +2552,7 @@ static void *alloc_coherent(struct device *dev, size_t size,
 	if (!dma_mask)
 		dma_mask = *dev->dma_mask;
 
-	*dma_addr = __map_single(dev, domain->priv, page_to_phys(page),
+	*dma_addr = __map_single(dev, dma_dom, page_to_phys(page),
 				 size, DMA_BIDIRECTIONAL, dma_mask);
 
 	if (*dma_addr == DMA_ERROR_CODE)
@@ -2562,6 +2576,7 @@ static void free_coherent(struct device *dev, size_t size,
 			  struct dma_attrs *attrs)
 {
 	struct protection_domain *domain;
+	struct dma_ops_domain *dma_dom;
 	struct page *page;
 
 	page = virt_to_page(virt_addr);
@@ -2571,7 +2586,9 @@ static void free_coherent(struct device *dev, size_t size,
 	if (IS_ERR(domain))
 		goto free_mem;
 
-	__unmap_single(domain->priv, dma_addr, size, DMA_BIDIRECTIONAL);
+	dma_dom = to_dma_ops_domain(domain);
+
+	__unmap_single(dma_dom, dma_addr, size, DMA_BIDIRECTIONAL);
 
 free_mem:
 	if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
@@ -2861,7 +2878,7 @@ static void amd_iommu_domain_free(struct iommu_domain *dom)
 		queue_flush_all();
 
 		/* Now release the domain */
-		dma_dom = domain->priv;
+		dma_dom = to_dma_ops_domain(domain);
 		dma_ops_domain_free(dma_dom);
 		break;
 	default:
@@ -3049,8 +3066,7 @@ static void amd_iommu_apply_dm_region(struct device *dev,
 				      struct iommu_domain *domain,
 				      struct iommu_dm_region *region)
 {
-	struct protection_domain *pdomain = to_pdomain(domain);
-	struct dma_ops_domain *dma_dom = pdomain->priv;
+	struct dma_ops_domain *dma_dom = to_dma_ops_domain(to_pdomain(domain));
 	unsigned long start, end;
 
 	start = IOVA_PFN(region->start);
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 590956a..caf5e38 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -421,7 +421,6 @@ struct protection_domain {
 	bool updated;		/* complete domain flush required */
 	unsigned dev_cnt;	/* devices assigned to this domain */
 	unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
-	void *priv;             /* private data */
 };
 
 /*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/20] iommu/amd: Use generic IOVA allocator
  2016-07-08 11:44 ` Joerg Roedel
                   ` (20 preceding siblings ...)
  (?)
@ 2016-07-12  9:03 ` Wan Zongshun
  2016-07-12 10:55   ` Joerg Roedel
  -1 siblings, 1 reply; 55+ messages in thread
From: Wan Zongshun @ 2016-07-12  9:03 UTC (permalink / raw)
  To: Joerg Roedel, iommu; +Cc: Vincent.Wan, linux-kernel



On 2016-07-08 19:44, Joerg Roedel wrote:
> Hi,
>
> here is a patch-set to make the AMD IOMMU driver use the
> generic IOVA allocator, which is already used in the Intel
> VT-d driver and a few other places.
>
> The main reason for the conversion is to make the driver
> benefit from the recent scalability improvements to the IOVA
> code. Some of these improvements happened in the Intel VT-d
> driver, these are not re-used but, for now, re-implemented.
>
> This leaves more room for merging things together into
> common code for the future.
>
> The allocator that was previously used will be removed with
> these patches.
>
> Please review.

Joerg,

Currently, those patches do not work on my eCarrizo board.
When I merged your patches, boot failed, and no info was printed at all.
I set iommu=pt and it also does not work; with iommu=soft, boot is ok.

When I removed those patches, the kernel boots ok.

This eCarrizo board's uart does not work, so I cannot get useful
information from the kernel over the uart console. I also set 'text' in
the boot options, but still cannot get any log.


Vincent.

>
> Thanks,
>
> 	Joerg
>
> Joerg Roedel (20):
>    iommu: Add apply_dm_region call-back to iommu-ops
>    iommu/amd: Select IOMMU_IOVA for AMD IOMMU
>    iommu/amd: Allocate iova_domain for dma_ops_domain
>    iommu/amd: Create a list of reserved iova addresses
>    iommu/amd: Implement apply_dm_region call-back
>    iommu/amd: Pass gfp-flags to iommu_map_page()
>    iommu/amd: Remove special mapping code for dma_ops path
>    iommu/amd: Make use of the generic IOVA allocator
>    iommu/amd: Remove other remains of old address allocator
>    iommu/amd: Remove align-parameter from __map_single()
>    iommu/amd: Set up data structures for flush queue
>    iommu/amd: Allow NULL pointer parameter for domain_flush_complete()
>    iommu/amd: Implement flush queue
>    iommu/amd: Implement timeout to flush unmap queues
>    iommu/amd: Introduce dir2prot() helper
>    iommu/amd: Optimize map_sg and unmap_sg
>    iommu/amd: Use dev_data->domain in get_domain()
>    iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back
>    iommu/amd: Flush iova queue before releasing dma_ops_domain
>    iommu/amd: Use container_of to get dma_ops_domain
>
>   drivers/iommu/Kconfig           |   1 +
>   drivers/iommu/amd_iommu.c       | 976 ++++++++++++++++------------------------
>   drivers/iommu/amd_iommu_types.h |   1 -
>   drivers/iommu/iommu.c           |   3 +
>   include/linux/iommu.h           |   3 +
>   5 files changed, 387 insertions(+), 597 deletions(-)
>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/20] iommu/amd: Use generic IOVA allocator
  2016-07-12  9:03 ` [PATCH 00/20] iommu/amd: Use generic IOVA allocator Wan Zongshun
@ 2016-07-12 10:55   ` Joerg Roedel
  2016-07-13  9:44       ` Wan Zongshun
  0 siblings, 1 reply; 55+ messages in thread
From: Joerg Roedel @ 2016-07-12 10:55 UTC (permalink / raw)
  To: Wan Zongshun; +Cc: iommu, Vincent.Wan, linux-kernel

Hey Vincent,

On Tue, Jul 12, 2016 at 05:03:08PM +0800, Wan Zongshun wrote:
> Currently, those patches do not work on my eCarrizo board.
> When I merged your patches, boot failed, and no info was printed at all.
> I set iommu=pt and it also does not work; with iommu=soft, boot is ok.
> 
> When I removed those patches, the kernel boots ok.
> 
> This eCarrizo board's uart does not work, so I cannot get useful
> information from the kernel over the uart console. I also set 'text' in
> the boot options, but still cannot get any log.

Thanks for your testing. The issue turned out to be an older bug, which
just got uncovered by these patches. I updated the branch at

	git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git amd-iommu-iova

This branch boots now on my Kaveri and Carrizo system. Can you please
give it a test too?

Thanks,

	Joerg

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 07/20] iommu/amd: Remove special mapping code for dma_ops path
@ 2016-07-12 10:55     ` Robin Murphy
  0 siblings, 0 replies; 55+ messages in thread
From: Robin Murphy @ 2016-07-12 10:55 UTC (permalink / raw)
  To: Joerg Roedel, iommu; +Cc: Vincent.Wan, Joerg Roedel, linux-kernel

Hi Joerg,

On 08/07/16 12:44, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> Use the iommu-api map/unmap functions instead. This will be
> required anyway when IOVA code is used for address
> allocation.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  drivers/iommu/amd_iommu.c | 107 ++++++----------------------------------------
>  1 file changed, 14 insertions(+), 93 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index e7c042b..08aa46c 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2557,94 +2557,6 @@ static void update_domain(struct protection_domain *domain)
>  }
>  
>  /*
> - * This function fetches the PTE for a given address in the aperture
> - */
> -static u64* dma_ops_get_pte(struct dma_ops_domain *dom,
> -			    unsigned long address)
> -{
> -	struct aperture_range *aperture;
> -	u64 *pte, *pte_page;
> -
> -	aperture = dom->aperture[APERTURE_RANGE_INDEX(address)];
> -	if (!aperture)
> -		return NULL;
> -
> -	pte = aperture->pte_pages[APERTURE_PAGE_INDEX(address)];
> -	if (!pte) {
> -		pte = alloc_pte(&dom->domain, address, PAGE_SIZE, &pte_page,
> -				GFP_ATOMIC);
> -		aperture->pte_pages[APERTURE_PAGE_INDEX(address)] = pte_page;
> -	} else
> -		pte += PM_LEVEL_INDEX(0, address);
> -
> -	update_domain(&dom->domain);
> -
> -	return pte;
> -}
> -
> -/*
> - * This is the generic map function. It maps one 4kb page at paddr to
> - * the given address in the DMA address space for the domain.
> - */
> -static dma_addr_t dma_ops_domain_map(struct dma_ops_domain *dom,
> -				     unsigned long address,
> -				     phys_addr_t paddr,
> -				     int direction)
> -{
> -	u64 *pte, __pte;
> -
> -	WARN_ON(address > dom->aperture_size);
> -
> -	paddr &= PAGE_MASK;
> -
> -	pte  = dma_ops_get_pte(dom, address);
> -	if (!pte)
> -		return DMA_ERROR_CODE;
> -
> -	__pte = paddr | IOMMU_PTE_P | IOMMU_PTE_FC;
> -
> -	if (direction == DMA_TO_DEVICE)
> -		__pte |= IOMMU_PTE_IR;
> -	else if (direction == DMA_FROM_DEVICE)
> -		__pte |= IOMMU_PTE_IW;
> -	else if (direction == DMA_BIDIRECTIONAL)
> -		__pte |= IOMMU_PTE_IR | IOMMU_PTE_IW;
> -
> -	WARN_ON_ONCE(*pte);
> -
> -	*pte = __pte;
> -
> -	return (dma_addr_t)address;
> -}
> -
> -/*
> - * The generic unmapping function for on page in the DMA address space.
> - */
> -static void dma_ops_domain_unmap(struct dma_ops_domain *dom,
> -				 unsigned long address)
> -{
> -	struct aperture_range *aperture;
> -	u64 *pte;
> -
> -	if (address >= dom->aperture_size)
> -		return;
> -
> -	aperture = dom->aperture[APERTURE_RANGE_INDEX(address)];
> -	if (!aperture)
> -		return;
> -
> -	pte  = aperture->pte_pages[APERTURE_PAGE_INDEX(address)];
> -	if (!pte)
> -		return;
> -
> -	pte += PM_LEVEL_INDEX(0, address);
> -
> -	WARN_ON_ONCE(!*pte);
> -
> -	*pte = 0ULL;
> -}
> -
> -/*
>   * This function contains common code for mapping of a physically
>   * contiguous memory region into DMA address space. It is used by all
>   * mapping functions provided with this IOMMU driver.
> @@ -2654,7 +2566,7 @@ static dma_addr_t __map_single(struct device *dev,
>  			       struct dma_ops_domain *dma_dom,
>  			       phys_addr_t paddr,
>  			       size_t size,
> -			       int dir,
> +			       int direction,
>  			       bool align,
>  			       u64 dma_mask)
>  {
> @@ -2662,6 +2574,7 @@ static dma_addr_t __map_single(struct device *dev,
>  	dma_addr_t address, start, ret;
>  	unsigned int pages;
>  	unsigned long align_mask = 0;
> +	int prot = 0;
>  	int i;
>  
>  	pages = iommu_num_pages(paddr, size, PAGE_SIZE);
> @@ -2676,10 +2589,18 @@ static dma_addr_t __map_single(struct device *dev,
>  	if (address == DMA_ERROR_CODE)
>  		goto out;
>  
> +	if (direction == DMA_TO_DEVICE)
> +		prot = IOMMU_PROT_IR;
> +	else if (direction == DMA_FROM_DEVICE)
> +		prot = IOMMU_PROT_IW;
> +	else if (direction == DMA_BIDIRECTIONAL)
> +		prot = IOMMU_PROT_IW | IOMMU_PROT_IR;
> +
>  	start = address;
>  	for (i = 0; i < pages; ++i) {
> -		ret = dma_ops_domain_map(dma_dom, start, paddr, dir);
> -		if (ret == DMA_ERROR_CODE)
> +		ret = iommu_map_page(&dma_dom->domain, start, paddr,
> +				     PAGE_SIZE, prot, GFP_ATOMIC);

I see that amd_iommu_map/unmap() takes a lock around calling
iommu_map/unmap_page(), but we don't appear to do that here. That seems
to suggest that either one is unsafe or the other is unnecessary.

Robin.

> +		if (ret)
>  			goto out_unmap;
>  
>  		paddr += PAGE_SIZE;
> @@ -2699,7 +2620,7 @@ out_unmap:
>  
>  	for (--i; i >= 0; --i) {
>  		start -= PAGE_SIZE;
> -		dma_ops_domain_unmap(dma_dom, start);
> +		iommu_unmap_page(&dma_dom->domain, start, PAGE_SIZE);
>  	}
>  
>  	dma_ops_free_addresses(dma_dom, address, pages);
> @@ -2730,7 +2651,7 @@ static void __unmap_single(struct dma_ops_domain *dma_dom,
>  	start = dma_addr;
>  
>  	for (i = 0; i < pages; ++i) {
> -		dma_ops_domain_unmap(dma_dom, start);
> +		iommu_unmap_page(&dma_dom->domain, start, PAGE_SIZE);
>  		start += PAGE_SIZE;
>  	}
>  
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 07/20] iommu/amd: Remove special mapping code for dma_ops path
@ 2016-07-12 10:55     ` Robin Murphy
  0 siblings, 0 replies; 55+ messages in thread
From: Robin Murphy @ 2016-07-12 10:55 UTC (permalink / raw)
  To: Joerg Roedel, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Vincent.Wan-5C7GfCeVMHo, Joerg Roedel,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hi Joerg,

On 08/07/16 12:44, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
> 
> Use the iommu-api map/unmap functions instead. This will be
> required anyway when IOVA code is used for address
> allocation.
> 
> Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
> ---
>  drivers/iommu/amd_iommu.c | 107 ++++++----------------------------------------
>  1 file changed, 14 insertions(+), 93 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index e7c042b..08aa46c 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2557,94 +2557,6 @@ static void update_domain(struct protection_domain *domain)
>  }
>  
>  /*
> - * This function fetches the PTE for a given address in the aperture
> - */
> -static u64* dma_ops_get_pte(struct dma_ops_domain *dom,
> -			    unsigned long address)
> -{
> -	struct aperture_range *aperture;
> -	u64 *pte, *pte_page;
> -
> -	aperture = dom->aperture[APERTURE_RANGE_INDEX(address)];
> -	if (!aperture)
> -		return NULL;
> -
> -	pte = aperture->pte_pages[APERTURE_PAGE_INDEX(address)];
> -	if (!pte) {
> -		pte = alloc_pte(&dom->domain, address, PAGE_SIZE, &pte_page,
> -				GFP_ATOMIC);
> -		aperture->pte_pages[APERTURE_PAGE_INDEX(address)] = pte_page;
> -	} else
> -		pte += PM_LEVEL_INDEX(0, address);
> -
> -	update_domain(&dom->domain);
> -
> -	return pte;
> -}
> -
> -/*
> - * This is the generic map function. It maps one 4kb page at paddr to
> - * the given address in the DMA address space for the domain.
> - */
> -static dma_addr_t dma_ops_domain_map(struct dma_ops_domain *dom,
> -				     unsigned long address,
> -				     phys_addr_t paddr,
> -				     int direction)
> -{
> -	u64 *pte, __pte;
> -
> -	WARN_ON(address > dom->aperture_size);
> -
> -	paddr &= PAGE_MASK;
> -
> -	pte  = dma_ops_get_pte(dom, address);
> -	if (!pte)
> -		return DMA_ERROR_CODE;
> -
> -	__pte = paddr | IOMMU_PTE_P | IOMMU_PTE_FC;
> -
> -	if (direction == DMA_TO_DEVICE)
> -		__pte |= IOMMU_PTE_IR;
> -	else if (direction == DMA_FROM_DEVICE)
> -		__pte |= IOMMU_PTE_IW;
> -	else if (direction == DMA_BIDIRECTIONAL)
> -		__pte |= IOMMU_PTE_IR | IOMMU_PTE_IW;
> -
> -	WARN_ON_ONCE(*pte);
> -
> -	*pte = __pte;
> -
> -	return (dma_addr_t)address;
> -}
> -
> -/*
> - * The generic unmapping function for on page in the DMA address space.
> - */
> -static void dma_ops_domain_unmap(struct dma_ops_domain *dom,
> -				 unsigned long address)
> -{
> -	struct aperture_range *aperture;
> -	u64 *pte;
> -
> -	if (address >= dom->aperture_size)
> -		return;
> -
> -	aperture = dom->aperture[APERTURE_RANGE_INDEX(address)];
> -	if (!aperture)
> -		return;
> -
> -	pte  = aperture->pte_pages[APERTURE_PAGE_INDEX(address)];
> -	if (!pte)
> -		return;
> -
> -	pte += PM_LEVEL_INDEX(0, address);
> -
> -	WARN_ON_ONCE(!*pte);
> -
> -	*pte = 0ULL;
> -}
> -
> -/*
>   * This function contains common code for mapping of a physically
>   * contiguous memory region into DMA address space. It is used by all
>   * mapping functions provided with this IOMMU driver.
> @@ -2654,7 +2566,7 @@ static dma_addr_t __map_single(struct device *dev,
>  			       struct dma_ops_domain *dma_dom,
>  			       phys_addr_t paddr,
>  			       size_t size,
> -			       int dir,
> +			       int direction,
>  			       bool align,
>  			       u64 dma_mask)
>  {
> @@ -2662,6 +2574,7 @@ static dma_addr_t __map_single(struct device *dev,
>  	dma_addr_t address, start, ret;
>  	unsigned int pages;
>  	unsigned long align_mask = 0;
> +	int prot = 0;
>  	int i;
>  
>  	pages = iommu_num_pages(paddr, size, PAGE_SIZE);
> @@ -2676,10 +2589,18 @@ static dma_addr_t __map_single(struct device *dev,
>  	if (address == DMA_ERROR_CODE)
>  		goto out;
>  
> +	if (direction == DMA_TO_DEVICE)
> +		prot = IOMMU_PROT_IR;
> +	else if (direction == DMA_FROM_DEVICE)
> +		prot = IOMMU_PROT_IW;
> +	else if (direction == DMA_BIDIRECTIONAL)
> +		prot = IOMMU_PROT_IW | IOMMU_PROT_IR;
> +
>  	start = address;
>  	for (i = 0; i < pages; ++i) {
> -		ret = dma_ops_domain_map(dma_dom, start, paddr, dir);
> -		if (ret == DMA_ERROR_CODE)
> +		ret = iommu_map_page(&dma_dom->domain, start, paddr,
> +				     PAGE_SIZE, prot, GFP_ATOMIC);

I see that amd_iommu_map/unmap() takes a lock around calling
iommu_map/unmap_page(), but we don't appear to do that here. That seems
to suggest that either one is unsafe or the other is unnecessary.

Robin.

> +		if (ret)
>  			goto out_unmap;
>  
>  		paddr += PAGE_SIZE;
> @@ -2699,7 +2620,7 @@ out_unmap:
>  
>  	for (--i; i >= 0; --i) {
>  		start -= PAGE_SIZE;
> -		dma_ops_domain_unmap(dma_dom, start);
> +		iommu_unmap_page(&dma_dom->domain, start, PAGE_SIZE);
>  	}
>  
>  	dma_ops_free_addresses(dma_dom, address, pages);
> @@ -2730,7 +2651,7 @@ static void __unmap_single(struct dma_ops_domain *dma_dom,
>  	start = dma_addr;
>  
>  	for (i = 0; i < pages; ++i) {
> -		dma_ops_domain_unmap(dma_dom, start);
> +		iommu_unmap_page(&dma_dom->domain, start, PAGE_SIZE);
>  		start += PAGE_SIZE;
>  	}
>  
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 07/20] iommu/amd: Remove special mapping code for dma_ops path
  2016-07-12 10:55     ` Robin Murphy
  (?)
@ 2016-07-12 11:08     ` Joerg Roedel
  2016-07-12 11:42       ` Robin Murphy
  -1 siblings, 1 reply; 55+ messages in thread
From: Joerg Roedel @ 2016-07-12 11:08 UTC (permalink / raw)
  To: Robin Murphy; +Cc: iommu, Vincent.Wan, Joerg Roedel, linux-kernel

Hi Robin,

On Tue, Jul 12, 2016 at 11:55:39AM +0100, Robin Murphy wrote:
> >  	start = address;
> >  	for (i = 0; i < pages; ++i) {
> > -		ret = dma_ops_domain_map(dma_dom, start, paddr, dir);
> > -		if (ret == DMA_ERROR_CODE)
> > +		ret = iommu_map_page(&dma_dom->domain, start, paddr,
> > +				     PAGE_SIZE, prot, GFP_ATOMIC);
> 
> I see that amd_iommu_map/unmap() takes a lock around calling
> iommu_map/unmap_page(), but we don't appear to do that here. That seems
> to suggest that either one is unsafe or the other is unnecessary.

At this point no locking is required, because in this code path we know
that we own the memory range and that nobody else is mapping that range.

In the IOMMU-API path we can't make that assumption, so locking is
required there. Both code-path use different types of domains, so there
is also no chance that a domain is used in both code-paths (except when
a dma-ops domain is set up).


	Joerg

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 16/20] iommu/amd: Optimize map_sg and unmap_sg
  2016-07-08 11:45 ` [PATCH 16/20] iommu/amd: Optimize map_sg and unmap_sg Joerg Roedel
@ 2016-07-12 11:33   ` Robin Murphy
  2016-07-12 13:30       ` Joerg Roedel
  0 siblings, 1 reply; 55+ messages in thread
From: Robin Murphy @ 2016-07-12 11:33 UTC (permalink / raw)
  To: Joerg Roedel, iommu; +Cc: Vincent.Wan, Joerg Roedel, linux-kernel

On 08/07/16 12:45, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> Optimize these functions so that they need only one call
> into the address allocator. This also saves a couple of
> io-tlb flushes in the unmap_sg path.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  drivers/iommu/amd_iommu.c | 77 ++++++++++++++++++++++++++++++++---------------
>  1 file changed, 52 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 78b278b..e5f8e7f 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2409,45 +2409,70 @@ static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
>   * lists).
>   */
>  static int map_sg(struct device *dev, struct scatterlist *sglist,
> -		  int nelems, enum dma_data_direction dir,
> +		  int nelems, enum dma_data_direction direction,
>  		  struct dma_attrs *attrs)
>  {
> +	int mapped_pages = 0, npages = 0, prot = 0, i;
> +	unsigned long start_addr, address;
>  	struct protection_domain *domain;
> -	int i;
> +	struct dma_ops_domain *dma_dom;
>  	struct scatterlist *s;
> -	phys_addr_t paddr;
> -	int mapped_elems = 0;
>  	u64 dma_mask;
>  
>  	domain = get_domain(dev);
>  	if (IS_ERR(domain))
>  		return 0;
>  
> +	dma_dom  = domain->priv;
>  	dma_mask = *dev->dma_mask;
>  
> +	for_each_sg(sglist, s, nelems, i)
> +		npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);

This fails to account for the segment boundary mask[1]. Given a typical
sglist from the block layer where the boundary mask is 64K, the first
segment is 8k long, and subsequent segments are 64K long, those
subsequent segments will end up with misaligned addresses which certain
hardware may object to.
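
(To make that concrete with 4K pages: packed back to back, the 8k segment
occupies IOVA offsets 0x0000-0x1fff, so the first 64K segment starts at
0x2000 and runs to 0x11fff, straddling the 64K boundary at 0x10000 - which
is exactly what a device advertising a 64K segment boundary mask says it
cannot handle.)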

> +	address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);

Since a typical dma_map_sg() call is likely to involve >128K worth of
data, I wonder if it's worth going directly to a slow-path IOVA
allocation...

> +	if (address == DMA_ERROR_CODE)
> +		goto out_err;
> +
> +	start_addr = address;
> +	prot       = dir2prot(direction);
> +
>  	for_each_sg(sglist, s, nelems, i) {
> -		paddr = sg_phys(s);
> +		int j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
> +
> +		for (j = 0; j < pages; ++j) {
> +			unsigned long bus_addr, phys_addr;
> +			int ret;
>  
> -		s->dma_address = __map_single(dev, domain->priv,
> -					      paddr, s->length, dir, dma_mask);
> +			bus_addr  = address + (j << PAGE_SHIFT);
> +			phys_addr = (sg_phys(s) & PAGE_MASK) + (j << PAGE_SHIFT);
> +			ret = iommu_map_page(domain, bus_addr, phys_addr, PAGE_SIZE, prot, GFP_ATOMIC);
> +			if (ret)
> +				goto out_unmap;
> +
> +			mapped_pages += 1;
> +		}
>  
> -		if (s->dma_address) {
> -			s->dma_length = s->length;
> -			mapped_elems++;
> -		} else
> -			goto unmap;
> +		s->dma_address = address + s->offset;
> +		s->dma_length  = s->length;
> +		address += pages << PAGE_SHIFT;
>  	}
>  
> -	return mapped_elems;
> +	return nelems;
>  
> -unmap:
> -	for_each_sg(sglist, s, mapped_elems, i) {
> -		if (s->dma_address)
> -			__unmap_single(domain->priv, s->dma_address,
> -				       s->dma_length, dir);
> -		s->dma_address = s->dma_length = 0;
> +
> +out_unmap:
> +	pr_err("%s: IOMMU mapping error in map_sg (io-pages: %d)\n",
> +	       dev_name(dev), npages);
> +
> +	for (i = 0; i < mapped_pages; ++i) {
> +		iommu_unmap_page(domain,
> +				 start_addr + (i << PAGE_SHIFT),
> +				 PAGE_SIZE);
>  	}
>  
> +	free_iova_fast(&dma_dom->iovad, start_addr, npages);
> +
> +out_err:
>  	return 0;
>  }
>  
> @@ -2460,18 +2485,20 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
>  		     struct dma_attrs *attrs)
>  {
>  	struct protection_domain *domain;
> +	unsigned long startaddr;
>  	struct scatterlist *s;
> -	int i;
> +	int i,npages = 0;
>  
>  	domain = get_domain(dev);
>  	if (IS_ERR(domain))
>  		return;
>  
> -	for_each_sg(sglist, s, nelems, i) {
> -		__unmap_single(domain->priv, s->dma_address,
> -			       s->dma_length, dir);
> -		s->dma_address = s->dma_length = 0;
> -	}
> +	for_each_sg(sglist, s, nelems, i)
> +		npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);

...which would also then allow this to be further simplified down to the
find_iova() trick we use in iommu-dma.

Robin.

[1]:http://article.gmane.org/gmane.linux.kernel.iommu/10553 - almost the
1-year anniversary of you making much the same comment to me :D

> +
> +	startaddr = sg_dma_address(sglist) & PAGE_MASK;
> +
> +	__unmap_single(domain->priv, startaddr, npages << PAGE_SHIFT, dir);
>  }
>  
>  /*
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 07/20] iommu/amd: Remove special mapping code for dma_ops path
  2016-07-12 11:08     ` Joerg Roedel
@ 2016-07-12 11:42       ` Robin Murphy
  2016-07-12 11:48           ` Joerg Roedel
  0 siblings, 1 reply; 55+ messages in thread
From: Robin Murphy @ 2016-07-12 11:42 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, Vincent.Wan, Joerg Roedel, linux-kernel

On 12/07/16 12:08, Joerg Roedel wrote:
> Hi Robin,
> 
> On Tue, Jul 12, 2016 at 11:55:39AM +0100, Robin Murphy wrote:
>>>  	start = address;
>>>  	for (i = 0; i < pages; ++i) {
>>> -		ret = dma_ops_domain_map(dma_dom, start, paddr, dir);
>>> -		if (ret == DMA_ERROR_CODE)
>>> +		ret = iommu_map_page(&dma_dom->domain, start, paddr,
>>> +				     PAGE_SIZE, prot, GFP_ATOMIC);
>>
>> I see that amd_iommu_map/unmap() takes a lock around calling
>> iommu_map/unmap_page(), but we don't appear to do that here. That seems
>> to suggest that either one is unsafe or the other is unnecessary.
> 
> At this point no locking is required, because in this code path we know
> that we own the memory range and that nobody else is mapping that range.
> 
> In the IOMMU-API path we can't make that assumption, so locking is
> required there. Both code-path use different types of domains, so there
> is also no chance that a domain is used in both code-paths (except when
> a dma-ops domain is set up).

Ah, that's the angle I was missing, yes. So if, say, someone is mapping
a page at IOVA 0x0000 while someone else is mapping a page at 0x1000,
there could still be a race between both callers writing the non-leaf
PTEs, but it's benign since they'd be writing identical entries anyway.
Seems reasonable to me (I assume in a similar map vs. unmap race, the
unmapper would just be removing the leaf entry, rather than bothering to
check for empty tables and tear down intermediate levels, so the same
still applies).

Robin.

> 
> 
> 	Joerg
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 07/20] iommu/amd: Remove special mapping code for dma_ops path
@ 2016-07-12 11:48           ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-12 11:48 UTC (permalink / raw)
  To: Robin Murphy; +Cc: Joerg Roedel, iommu, Vincent.Wan, linux-kernel

Hi Robin,

On Tue, Jul 12, 2016 at 12:42:59PM +0100, Robin Murphy wrote:
> Ah, that's the angle I was missing, yes. So if, say, someone is mapping
> a page at IOVA 0x0000 while someone else is mapping a page at 0x1000,
> there could still be a race between both callers writing the non-leaf
> PTEs, but it's benign since they'd be writing identical entries anyway.
> Seems reasonable to me (I assume in a similar map vs. unmap race, the
> unmapper would just be removing the leaf entry, rather than bothering to
> check for empty tables and tear down intermediate levels, so the same
> still applies).

The non-leaf PTE setup code checks for races with cmpxchg, so we are on
the safe side there. Two threads would write different entries there,
because they are allocating different sub-pages, but as I said, this is
checked for using cmpxchg. On the PTE level this problem does not exist.
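
The pattern is roughly the following (a simplified sketch of what happens
for a missing intermediate level, not the literal alloc_pte() code, and
the names are only approximations):

	u64 *page  = (u64 *)get_zeroed_page(GFP_ATOMIC);
	u64 __npte = PM_LEVEL_PDE(level, virt_to_phys(page));

	/* only one of the racing threads installs the new level */
	if (cmpxchg64(pte, 0ULL, __npte) != 0ULL) {
		/* lost the race - free our page and walk the winner's table */
		free_page((unsigned long)page);
	}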


	Joerg

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 07/20] iommu/amd: Remove special mapping code for dma_ops path
@ 2016-07-12 11:48           ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-12 11:48 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Vincent.Wan-5C7GfCeVMHo,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hi Robin,

On Tue, Jul 12, 2016 at 12:42:59PM +0100, Robin Murphy wrote:
> Ah, that's the angle I was missing, yes. So if, say, someone is mapping
> a page at IOVA 0x0000 while someone else is mapping a page at 0x1000,
> there could still be a race between both callers writing the non-leaf
> PTEs, but it's benign since they'd be writing identical entries anyway.
> Seems reasonable to me (I assume in a similar map vs. unmap race, the
> unmapper would just be removing the leaf entry, rather than bothering to
> check for empty tables and tear down intermediate levels, so the same
> still applies).

The non-leaf PTE setup code checks for races with cmpxchg, so we are on
the safe side there. Two threads would write different entries there,
because they are allocating different sub-pages, but as I said, this is
checked for using cmpxchg. On the PTE level this problem does not exist.


	Joerg

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 16/20 v2] iommu/amd: Optimize map_sg and unmap_sg
@ 2016-07-12 13:30       ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-12 13:30 UTC (permalink / raw)
  To: Robin Murphy; +Cc: iommu, Vincent.Wan, Joerg Roedel, linux-kernel

On Tue, Jul 12, 2016 at 12:33:43PM +0100, Robin Murphy wrote:
> > +	for_each_sg(sglist, s, nelems, i)
> > +		npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
> 
> This fails to account for the segment boundary mask[1]. Given a typical
> sglist from the block layer where the boundary mask is 64K, the first
> segment is 8k long, and subsequent segments are 64K long, those
> subsequent segments will end up with misaligned addresses which certain
> hardware may object to.

Yeah, right. It doesn't matter much on x86, as the smallest
boundary-mask I have seen is 4G, but to be correct it should be
accounted for. How does the attached patch look?

> 
> > +	address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
> 
> Since a typical dma_map_sg() call is likely to involve >128K worth of
> data, I wonder if it's worth going directly to a slow-path IOVA
> allocation...

Well, the allocator is the bottle-neck, so I try not to call it for
every sg-element. The global locks have been removed, but more
allocations/deallocations also mean that the per-cpu free-lists fill up
quicker and that we have to flush the IOTLBs more often, which costs
performance.

> [1]:http://article.gmane.org/gmane.linux.kernel.iommu/10553 - almost the
> 1-year anniversary of you making much the same comment to me :D

Touché ;-)

Here is the updated patch:

From 88e1cc6c8e854a2bf55f972ddc5082a44760abe2 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel@suse.de>
Date: Wed, 6 Jul 2016 17:20:54 +0200
Subject: [PATCH 1/5] iommu/amd: Optimize map_sg and unmap_sg

Optimize these functions so that they need only one call
into the address alloctor. This also saves a couple of
io-tlb flushes in the unmap_sg path.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 112 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 86 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 2cd382e..203c50c 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2395,50 +2395,110 @@ static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
 	__unmap_single(domain->priv, dma_addr, size, dir);
 }
 
+static int sg_num_pages(struct device *dev,
+			struct scatterlist *sglist,
+			int nelems)
+{
+	unsigned long mask, boundary_size;
+	struct scatterlist *s;
+	int i, npages = 0;
+
+	mask          = dma_get_seg_boundary(dev);
+	boundary_size = mask + 1 ? ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT :
+				   1UL << (BITS_PER_LONG - PAGE_SHIFT);
+
+	for_each_sg(sglist, s, nelems, i) {
+		int p, n;
+
+		s->dma_address = npages << PAGE_SHIFT;
+		p = npages % boundary_size;
+		n = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
+		if (p + n > boundary_size)
+			npages += boundary_size - p;
+		npages += n;
+	}
+
+	return npages;
+}
+
 /*
  * The exported map_sg function for dma_ops (handles scatter-gather
  * lists).
  */
 static int map_sg(struct device *dev, struct scatterlist *sglist,
-		  int nelems, enum dma_data_direction dir,
+		  int nelems, enum dma_data_direction direction,
 		  struct dma_attrs *attrs)
 {
+	int mapped_pages = 0, npages = 0, prot = 0, i;
 	struct protection_domain *domain;
-	int i;
+	struct dma_ops_domain *dma_dom;
 	struct scatterlist *s;
-	phys_addr_t paddr;
-	int mapped_elems = 0;
+	unsigned long address;
 	u64 dma_mask;
 
 	domain = get_domain(dev);
 	if (IS_ERR(domain))
 		return 0;
 
+	dma_dom  = domain->priv;
 	dma_mask = *dev->dma_mask;
 
+	npages = sg_num_pages(dev, sglist, nelems);
+
+	address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
+	if (address == DMA_ERROR_CODE)
+		goto out_err;
+
+	prot = dir2prot(direction);
+
+	/* Map all sg entries */
 	for_each_sg(sglist, s, nelems, i) {
-		paddr = sg_phys(s);
+		int j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
+
+		for (j = 0; j < pages; ++j) {
+			unsigned long bus_addr, phys_addr;
+			int ret;
 
-		s->dma_address = __map_single(dev, domain->priv,
-					      paddr, s->length, dir, dma_mask);
+			bus_addr  = address + s->dma_address + (j << PAGE_SHIFT);
+			phys_addr = (sg_phys(s) & PAGE_MASK) + (j << PAGE_SHIFT);
+			ret = iommu_map_page(domain, bus_addr, phys_addr, PAGE_SIZE, prot, GFP_ATOMIC);
+			if (ret)
+				goto out_unmap;
 
-		if (s->dma_address) {
-			s->dma_length = s->length;
-			mapped_elems++;
-		} else
-			goto unmap;
+			mapped_pages += 1;
+		}
 	}
 
-	return mapped_elems;
+	/* Everything is mapped - write the right values into s->dma_address */
+	for_each_sg(sglist, s, nelems, i) {
+		s->dma_address += address + s->offset;
+		s->dma_length   = s->length;
+	}
+
+	return nelems;
+
+out_unmap:
+	pr_err("%s: IOMMU mapping error in map_sg (io-pages: %d)\n",
+	       dev_name(dev), npages);
+
+	for_each_sg(sglist, s, nelems, i) {
+		int j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
+
+		for (j = 0; j < pages; ++j) {
+			unsigned long bus_addr;
 
-unmap:
-	for_each_sg(sglist, s, mapped_elems, i) {
-		if (s->dma_address)
-			__unmap_single(domain->priv, s->dma_address,
-				       s->dma_length, dir);
-		s->dma_address = s->dma_length = 0;
+			bus_addr  = address + s->dma_address + (j << PAGE_SHIFT);
+			iommu_unmap_page(domain, bus_addr, PAGE_SIZE);
+
+			if (--mapped_pages)
+				goto out_free_iova;
+		}
 	}
 
+out_free_iova:
+	free_iova_fast(&dma_dom->iovad, address, npages);
+
+out_err:
 	return 0;
 }
 
@@ -2451,18 +2511,18 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
 		     struct dma_attrs *attrs)
 {
 	struct protection_domain *domain;
-	struct scatterlist *s;
-	int i;
+	unsigned long startaddr;
+	int npages = 2;
 
 	domain = get_domain(dev);
 	if (IS_ERR(domain))
 		return;
 
-	for_each_sg(sglist, s, nelems, i) {
-		__unmap_single(domain->priv, s->dma_address,
-			       s->dma_length, dir);
-		s->dma_address = s->dma_length = 0;
-	}
+	startaddr = sg_dma_address(sglist) & PAGE_MASK;
+	dma_dom   = domain->priv;
+	npages    = sg_num_pages(dev, sglist, nelems);
+
+	__unmap_single(domain->priv, startaddr, npages << PAGE_SHIFT, dir);
 }
 
 /*
-- 
2.6.6

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 16/20 v2] iommu/amd: Optimize map_sg and unmap_sg
@ 2016-07-12 13:30       ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-12 13:30 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Vincent.Wan-5C7GfCeVMHo,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Joerg Roedel,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Jul 12, 2016 at 12:33:43PM +0100, Robin Murphy wrote:
> > +	for_each_sg(sglist, s, nelems, i)
> > +		npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
> 
> This fails to account for the segment boundary mask[1]. Given a typical
> sglist from the block layer where the boundary mask is 64K, the first
> segment is 8k long, and subsequent segments are 64K long, those
> subsequent segments will end up with misaligned addresses which certain
> hardware may object to.

Yeah, right. It doesn't matter much on x86, as the smallest
boundary-mask I have seen is 4G, but to be correct it should be
accounted for. How does the attached patch look?

> 
> > +	address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
> 
> Since a typical dma_map_sg() call is likely to involve >128K worth of
> data, I wonder if it's worth going directly to a slow-path IOVA
> allocation...

Well, the allocator is the bottle-neck, so I try not to call it for
every sg-element. The global locks have been removed, but more
allocations/deallocations also mean that the per-cpu free-lists fill up
quicker and that we have to flush the IOTLBs more often, which costs
performance.

> [1]:http://article.gmane.org/gmane.linux.kernel.iommu/10553 - almost the
> 1-year anniversary of you making much the same comment to me :D

Touché ;-)

Here is the updated patch:

From 88e1cc6c8e854a2bf55f972ddc5082a44760abe2 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
Date: Wed, 6 Jul 2016 17:20:54 +0200
Subject: [PATCH 1/5] iommu/amd: Optimize map_sg and unmap_sg

Optimize these functions so that they need only one call
into the address allocator. This also saves a couple of
io-tlb flushes in the unmap_sg path.

Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
---
 drivers/iommu/amd_iommu.c | 112 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 86 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 2cd382e..203c50c 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2395,50 +2395,110 @@ static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
 	__unmap_single(domain->priv, dma_addr, size, dir);
 }
 
+static int sg_num_pages(struct device *dev,
+			struct scatterlist *sglist,
+			int nelems)
+{
+	unsigned long mask, boundary_size;
+	struct scatterlist *s;
+	int i, npages = 0;
+
+	mask          = dma_get_seg_boundary(dev);
+	boundary_size = mask + 1 ? ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT :
+				   1UL << (BITS_PER_LONG - PAGE_SHIFT);
+
+	for_each_sg(sglist, s, nelems, i) {
+		int p, n;
+
+		s->dma_address = npages << PAGE_SHIFT;
+		p = npages % boundary_size;
+		n = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
+		if (p + n > boundary_size)
+			npages += boundary_size - p;
+		npages += n;
+	}
+
+	return npages;
+}
+
 /*
  * The exported map_sg function for dma_ops (handles scatter-gather
  * lists).
  */
 static int map_sg(struct device *dev, struct scatterlist *sglist,
-		  int nelems, enum dma_data_direction dir,
+		  int nelems, enum dma_data_direction direction,
 		  struct dma_attrs *attrs)
 {
+	int mapped_pages = 0, npages = 0, prot = 0, i;
 	struct protection_domain *domain;
-	int i;
+	struct dma_ops_domain *dma_dom;
 	struct scatterlist *s;
-	phys_addr_t paddr;
-	int mapped_elems = 0;
+	unsigned long address;
 	u64 dma_mask;
 
 	domain = get_domain(dev);
 	if (IS_ERR(domain))
 		return 0;
 
+	dma_dom  = domain->priv;
 	dma_mask = *dev->dma_mask;
 
+	npages = sg_num_pages(dev, sglist, nelems);
+
+	address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
+	if (address == DMA_ERROR_CODE)
+		goto out_err;
+
+	prot = dir2prot(direction);
+
+	/* Map all sg entries */
 	for_each_sg(sglist, s, nelems, i) {
-		paddr = sg_phys(s);
+		int j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
+
+		for (j = 0; j < pages; ++j) {
+			unsigned long bus_addr, phys_addr;
+			int ret;
 
-		s->dma_address = __map_single(dev, domain->priv,
-					      paddr, s->length, dir, dma_mask);
+			bus_addr  = address + s->dma_address + (j << PAGE_SHIFT);
+			phys_addr = (sg_phys(s) & PAGE_MASK) + (j << PAGE_SHIFT);
+			ret = iommu_map_page(domain, bus_addr, phys_addr, PAGE_SIZE, prot, GFP_ATOMIC);
+			if (ret)
+				goto out_unmap;
 
-		if (s->dma_address) {
-			s->dma_length = s->length;
-			mapped_elems++;
-		} else
-			goto unmap;
+			mapped_pages += 1;
+		}
 	}
 
-	return mapped_elems;
+	/* Everything is mapped - write the right values into s->dma_address */
+	for_each_sg(sglist, s, nelems, i) {
+		s->dma_address += address + s->offset;
+		s->dma_length   = s->length;
+	}
+
+	return nelems;
+
+out_unmap:
+	pr_err("%s: IOMMU mapping error in map_sg (io-pages: %d)\n",
+	       dev_name(dev), npages);
+
+	for_each_sg(sglist, s, nelems, i) {
+		int j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
+
+		for (j = 0; j < pages; ++j) {
+			unsigned long bus_addr;
 
-unmap:
-	for_each_sg(sglist, s, mapped_elems, i) {
-		if (s->dma_address)
-			__unmap_single(domain->priv, s->dma_address,
-				       s->dma_length, dir);
-		s->dma_address = s->dma_length = 0;
+			bus_addr  = address + s->dma_address + (j << PAGE_SHIFT);
+			iommu_unmap_page(domain, bus_addr, PAGE_SIZE);
+
+			if (--mapped_pages)
+				goto out_free_iova;
+		}
 	}
 
+out_free_iova:
+	free_iova_fast(&dma_dom->iovad, address, npages);
+
+out_err:
 	return 0;
 }
 
@@ -2451,18 +2511,18 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
 		     struct dma_attrs *attrs)
 {
 	struct protection_domain *domain;
-	struct scatterlist *s;
-	int i;
+	unsigned long startaddr;
+	int npages = 2;
 
 	domain = get_domain(dev);
 	if (IS_ERR(domain))
 		return;
 
-	for_each_sg(sglist, s, nelems, i) {
-		__unmap_single(domain->priv, s->dma_address,
-			       s->dma_length, dir);
-		s->dma_address = s->dma_length = 0;
-	}
+	startaddr = sg_dma_address(sglist) & PAGE_MASK;
+	dma_dom   = domain->priv;
+	npages    = sg_num_pages(dev, sglist, nelems);
+
+	__unmap_single(domain->priv, startaddr, npages << PAGE_SHIFT, dir);
 }
 
 /*
-- 
2.6.6

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 16/20 v2] iommu/amd: Optimize map_sg and unmap_sg
  2016-07-12 13:30       ` Joerg Roedel
  (?)
@ 2016-07-12 15:34       ` Robin Murphy
  2016-07-13 10:27           ` Joerg Roedel
  -1 siblings, 1 reply; 55+ messages in thread
From: Robin Murphy @ 2016-07-12 15:34 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, Vincent.Wan, Joerg Roedel, linux-kernel

On 12/07/16 14:30, Joerg Roedel wrote:
> On Tue, Jul 12, 2016 at 12:33:43PM +0100, Robin Murphy wrote:
>>> +	for_each_sg(sglist, s, nelems, i)
>>> +		npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
>>
>> This fails to account for the segment boundary mask[1]. Given a typical
>> sglist from the block layer where the boundary mask is 64K, the first
>> segment is 8k long, and subsequent segments are 64K long, those
>> subsequent segments will end up with misaligned addresses which certain
>> hardware may object to.
> 
> Yeah, right. It doesn't matter much on x86, as the smallest
> boundary-mask I have seen is 4G, but to be correct it should be
> accounted for. How does the attached patch look?

The boundary masks for block devices are tricky to track down through so
many layers of indirection in the common frameworks, but there are a lot
of 64K ones there. After some more impromptu digging into the subject
I've finally satisfied my curiosity - it seems this restriction stems
from the ATA DMA PRD table format, so it could perhaps still be a real
concern for anyone using some crusty old PCI IDE card in their modern
system.

>>
>>> +	address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
>>
>> Since a typical dma_map_sg() call is likely to involve >128K worth of
>> data, I wonder if it's worth going directly to a slow-path IOVA
>> allocation...
> 
> Well, the allocator is the bottle-neck, so I try not to call it for
> every sg-element. The global locks have been removed, but more
> allocations/deallocations also mean that the per-cpu free-lists fill up
> quicker and that we have to flush the IOTLBs more often, which costs
> performance.

Indeed, I wasn't suggesting making more than one call, just that
alloc_iova_fast() is quite likely to have to fall back to alloc_iova()
here, so there may be some mileage in going directly to the latter, with
the benefit of then being able to rely on find_iova() later (since you
know for sure you allocated out of the tree rather than the caches). My
hunch is that dma_map_sg() tends to be called for bulk data transfer
(block devices, DRM, etc.) so is probably a less contended path compared
to the network layer hammering dma_map_single().
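
For reference, the shape of that trick on the unmap side, paraphrased
from iommu-dma rather than copied, so the details are approximate; iovad,
domain and dma_addr come from the caller:

	unsigned long shift = iova_shift(iovad);
	struct iova *iova = find_iova(iovad, dma_addr >> shift);
	size_t size;

	if (WARN_ON(!iova))
		return;

	/* the rbtree node remembers how much was allocated */
	size = iova_size(iova) << shift;
	iommu_unmap(domain, (dma_addr_t)iova->pfn_lo << shift, size);
	__free_iova(iovad, iova);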

> 
>> [1]:http://article.gmane.org/gmane.linux.kernel.iommu/10553 - almost the
>> 1-year anniversary of you making much the same comment to me :D
> 
> Touché ;-)
> 
> Here is the updated patch:
> 
> From 88e1cc6c8e854a2bf55f972ddc5082a44760abe2 Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <jroedel@suse.de>
> Date: Wed, 6 Jul 2016 17:20:54 +0200
> Subject: [PATCH 1/5] iommu/amd: Optimize map_sg and unmap_sg
> 
> Optimize these functions so that they need only one call
> into the address alloctor. This also saves a couple of
> io-tlb flushes in the unmap_sg path.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  drivers/iommu/amd_iommu.c | 112 +++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 86 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 2cd382e..203c50c 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2395,50 +2395,110 @@ static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
>  	__unmap_single(domain->priv, dma_addr, size, dir);
>  }
>  
> +static int sg_num_pages(struct device *dev,
> +			struct scatterlist *sglist,
> +			int nelems)
> +{
> +	unsigned long mask, boundary_size;
> +	struct scatterlist *s;
> +	int i, npages = 0;
> +
> +	mask          = dma_get_seg_boundary(dev);
> +	boundary_size = mask + 1 ? ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT :
> +				   1UL << (BITS_PER_LONG - PAGE_SHIFT);

(mask >> PAGE_SHIFT) + 1 ?
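
(For the usual power-of-two-minus-one masks the two forms agree - e.g. a
64K mask gives 16 pages either way - and for mask == ~0UL the shifted
form already evaluates to 1UL << (BITS_PER_LONG - PAGE_SHIFT), so the
special case can go away.)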

> +
> +	for_each_sg(sglist, s, nelems, i) {
> +		int p, n;
> +
> +		s->dma_address = npages << PAGE_SHIFT;
> +		p = npages % boundary_size;
> +		n = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
> +		if (p + n > boundary_size)
> +			npages += boundary_size - p;
> +		npages += n;
> +	}
> +
> +	return npages;
> +}

Otherwise, this seems OK to me (as far as you value the judgement of
someone who took at least 3 tries to come up with a correct
implementation themselves...)

> +
>  /*
>   * The exported map_sg function for dma_ops (handles scatter-gather
>   * lists).
>   */
>  static int map_sg(struct device *dev, struct scatterlist *sglist,
> -		  int nelems, enum dma_data_direction dir,
> +		  int nelems, enum dma_data_direction direction,
>  		  struct dma_attrs *attrs)
>  {
> +	int mapped_pages = 0, npages = 0, prot = 0, i;
>  	struct protection_domain *domain;
> -	int i;
> +	struct dma_ops_domain *dma_dom;
>  	struct scatterlist *s;
> -	phys_addr_t paddr;
> -	int mapped_elems = 0;
> +	unsigned long address;
>  	u64 dma_mask;
>  
>  	domain = get_domain(dev);
>  	if (IS_ERR(domain))
>  		return 0;
>  
> +	dma_dom  = domain->priv;
>  	dma_mask = *dev->dma_mask;
>  
> +	npages = sg_num_pages(dev, sglist, nelems);
> +
> +	address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
> +	if (address == DMA_ERROR_CODE)
> +		goto out_err;
> +
> +	prot = dir2prot(direction);
> +
> +	/* Map all sg entries */
>  	for_each_sg(sglist, s, nelems, i) {
> -		paddr = sg_phys(s);
> +		int j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
> +
> +		for (j = 0; j < pages; ++j) {
> +			unsigned long bus_addr, phys_addr;
> +			int ret;
>  
> -		s->dma_address = __map_single(dev, domain->priv,
> -					      paddr, s->length, dir, dma_mask);
> +			bus_addr  = address + s->dma_address + (j << PAGE_SHIFT);
> +			phys_addr = (sg_phys(s) & PAGE_MASK) + (j << PAGE_SHIFT);
> +			ret = iommu_map_page(domain, bus_addr, phys_addr, PAGE_SIZE, prot, GFP_ATOMIC);
> +			if (ret)
> +				goto out_unmap;
>  
> -		if (s->dma_address) {
> -			s->dma_length = s->length;
> -			mapped_elems++;
> -		} else
> -			goto unmap;
> +			mapped_pages += 1;
> +		}
>  	}
>  
> -	return mapped_elems;
> +	/* Everything is mapped - write the right values into s->dma_address */
> +	for_each_sg(sglist, s, nelems, i) {
> +		s->dma_address += address + s->offset;
> +		s->dma_length   = s->length;
> +	}
> +
> +	return nelems;
> +
> +out_unmap:
> +	pr_err("%s: IOMMU mapping error in map_sg (io-pages: %d)\n",
> +	       dev_name(dev), npages);
> +
> +	for_each_sg(sglist, s, nelems, i) {
> +		int j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
> +
> +		for (j = 0; j < pages; ++j) {
> +			unsigned long bus_addr;
>  
> -unmap:
> -	for_each_sg(sglist, s, mapped_elems, i) {
> -		if (s->dma_address)
> -			__unmap_single(domain->priv, s->dma_address,
> -				       s->dma_length, dir);
> -		s->dma_address = s->dma_length = 0;
> +			bus_addr  = address + s->dma_address + (j << PAGE_SHIFT);
> +			iommu_unmap_page(domain, bus_addr, PAGE_SIZE);
> +
> +			if (--mapped_pages)
> +				goto out_free_iova;
> +		}
>  	}
>  
> +out_free_iova:
> +	free_iova_fast(&dma_dom->iovad, address, npages);
> +
> +out_err:
>  	return 0;
>  }
>  
> @@ -2451,18 +2511,18 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
>  		     struct dma_attrs *attrs)
>  {
>  	struct protection_domain *domain;
> -	struct scatterlist *s;
> -	int i;
> +	unsigned long startaddr;
> +	int npages = 2;
>  
>  	domain = get_domain(dev);
>  	if (IS_ERR(domain))
>  		return;
>  
> -	for_each_sg(sglist, s, nelems, i) {
> -		__unmap_single(domain->priv, s->dma_address,
> -			       s->dma_length, dir);
> -		s->dma_address = s->dma_length = 0;
> -	}
> +	startaddr = sg_dma_address(sglist) & PAGE_MASK;
> +	dma_dom   = domain->priv;

This line seems to be here by mistake (I guess patch 20 removes it
again, but it's a brief bisection-breaker).

Robin.

> +	npages    = sg_num_pages(dev, sglist, nelems);
> +
> +	__unmap_single(domain->priv, startaddr, npages << PAGE_SHIFT, dir);
>  }
>  
>  /*
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/20] iommu/amd: Use generic IOVA allocator
@ 2016-07-13  9:44       ` Wan Zongshun
  0 siblings, 0 replies; 55+ messages in thread
From: Wan Zongshun @ 2016-07-13  9:44 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Vincent.Wan, iommu, linux-kernel



On 2016-07-12 18:55, Joerg Roedel wrote:
> Hey Vincent,
>
> On Tue, Jul 12, 2016 at 05:03:08PM +0800, Wan Zongshun wrote:
>> Currently, those patches do not work on my eCarrizo board.
>> When I merged your patches, boot failed, and no info was printed at all.
>> I set iommu=pt and it also does not work; with iommu=soft, boot is ok.
>>
>> When I removed those patches, the kernel boots ok.
>>
>> This eCarrizo board's uart does not work, so I cannot get useful
>> information from the kernel over the uart console. I also set 'text' in
>> the boot options, but still cannot get any log.
>
> Thanks for your testing. The issue turned out to be an older bug, which
> just got uncovered by these patches. I updated the branch at
>
> 	git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git amd-iommu-iova
>
> This branch boots now on my Kaveri and Carrizo system. Can you please
> give it a test too?

Joerg, I have tested the patches; they work now.

>
> Thanks,
>
> 	Joerg
>
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
>
>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/20] iommu/amd: Use generic IOVA allocator
@ 2016-07-13  9:44       ` Wan Zongshun
  0 siblings, 0 replies; 55+ messages in thread
From: Wan Zongshun @ 2016-07-13  9:44 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Vincent.Wan-5C7GfCeVMHo,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA



On 2016-07-12 18:55, Joerg Roedel wrote:
> Hey Vincent,
>
> On Tue, Jul 12, 2016 at 05:03:08PM +0800, Wan Zongshun wrote:
>> Currently, those patches do not work on my eCarrizo board.
>> When I merged your patches, boot failed, and no info was printed at all.
>> I set iommu=pt and it also does not work; with iommu=soft, boot is ok.
>>
>> When I removed those patches, the kernel boots ok.
>>
>> This eCarrizo board's uart does not work, so I cannot get useful
>> information from the kernel over the uart console. I also set 'text' in
>> the boot options, but still cannot get any log.
>
> Thanks for your testing. The issue turned out to be an older bug, which
> just got uncovered by these patches. I updated the branch at
>
> 	git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git amd-iommu-iova
>
> This branch boots now on my Kaveri and Carrizo system. Can you please
> give it a test too?

Joerg, I have tested the patches, it works now.

>
> Thanks,
>
> 	Joerg
>
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
>
>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/20] iommu/amd: Use generic IOVA allocator
@ 2016-07-13  9:51         ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-13  9:51 UTC (permalink / raw)
  To: Wan Zongshun; +Cc: Vincent.Wan, iommu, linux-kernel

On Wed, Jul 13, 2016 at 05:44:24PM +0800, Wan Zongshun wrote:
> 
> 
> On 2016-07-12 18:55, Joerg Roedel wrote:
> >This branch boots now on my Kaveri and Carrizo system. Can you please
> >give it a test too?
> 
> Joerg, I have tested the patches; they work now.

Great, thanks a lot for your testing.



	Joerg

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 16/20 v2] iommu/amd: Optimize map_sg and unmap_sg
@ 2016-07-13 10:27           ` Joerg Roedel
  0 siblings, 0 replies; 55+ messages in thread
From: Joerg Roedel @ 2016-07-13 10:27 UTC (permalink / raw)
  To: Robin Murphy; +Cc: Joerg Roedel, iommu, Vincent.Wan, linux-kernel

On Tue, Jul 12, 2016 at 04:34:16PM +0100, Robin Murphy wrote:
> The boundary masks for block devices are tricky to track down through so
> many layers of indirection in the common frameworks, but there are a lot
> of 64K ones there. After some more impromptu digging into the subject
> I've finally satisfied my curiosity - it seems this restriction stems
> from the ATA DMA PRD table format, so it could perhaps still be a real
> concern for anyone using some crusty old PCI IDE card in their modern
> system.

The boundary mask is a capability of the underlying PCI device, no? The
ATA layer, or whatever stack sits above it, should have no influence on it.
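
For context, roughly what the helper being called there looks like (this
is paraphrased from memory of the dma-mapping header, so treat the exact
bodies as an assumption rather than a quote):

	static inline unsigned long dma_get_seg_boundary(struct device *dev)
	{
		/* Per-device software parameter, with a 32-bit default */
		return dev->dma_parms ?
			dev->dma_parms->segment_boundary_mask : DMA_BIT_MASK(32);
	}

	static inline int dma_set_seg_boundary(struct device *dev,
					       unsigned long mask)
	{
		if (dev->dma_parms) {
			dev->dma_parms->segment_boundary_mask = mask;
			return 0;
		}
		return -EIO;
	}

So, mechanically, the value is whatever some layer last passed to
dma_set_seg_boundary() for that device.
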
> 
> Indeed, I wasn't suggesting making more than one call, just that
> alloc_iova_fast() is quite likely to have to fall back to alloc_iova()
> here, so there may be some mileage in going directly to the latter, with
> the benefit of then being able to rely on find_iova() later (since you
> know for sure you allocated out of the tree rather than the caches). My
> hunch is that dma_map_sg() tends to be called for bulk data transfer
> (block devices, DRM, etc.) so is probably a less contended path compared
> to the network layer hammering dma_map_single().

Using different functions for allocation would also require special
handling in the queued-freeing code, as I would then have to track each
allocation to know whether to free it with the _fast variant or not.
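
Just to illustrate the kind of bookkeeping that would require (hypothetical
names, not code from this series): every deferred-free entry would have to
remember which allocator it came from, e.g.:

	struct flush_queue_entry {
		unsigned long iova_pfn;
		unsigned long pages;
		bool          from_rcache;	/* allocated with alloc_iova_fast()? */
	};

	static void queue_release(struct iova_domain *iovad,
				  struct flush_queue_entry *entry)
	{
		if (entry->from_rcache)
			free_iova_fast(iovad, entry->iova_pfn, entry->pages);
		else
			free_iova(iovad, entry->iova_pfn);
	}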

> > +	mask          = dma_get_seg_boundary(dev);
> > +	boundary_size = mask + 1 ? ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT :
> > +				   1UL << (BITS_PER_LONG - PAGE_SHIFT);
> 
> (mask >> PAGE_SHIFT) + 1 ?

Should make no difference unless some of the low PAGE_SHIFT bits of the
mask are 0 (which shouldn't happen).
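
A quick example with 4k pages (PAGE_SHIFT == 12) to spell that out:

	mask = 0xffff (64k boundary):
	  ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT  ->  0x10000 >> 12 = 16
	  (mask >> PAGE_SHIFT) + 1                  ->  15 + 1        = 16

	mask = ~0UL (no boundary):
	  mask + 1 wraps to 0, so the current code falls back to
	  1UL << (BITS_PER_LONG - PAGE_SHIFT), while
	  (mask >> PAGE_SHIFT) + 1 arrives at the same value without the branch.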



	Joerg

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2016-07-13 10:27 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-08 11:44 [PATCH 00/20] iommu/amd: Use generic IOVA allocator Joerg Roedel
2016-07-08 11:44 ` Joerg Roedel
2016-07-08 11:44 ` [PATCH 01/20] iommu: Add apply_dm_region call-back to iommu-ops Joerg Roedel
2016-07-08 11:44   ` Joerg Roedel
2016-07-08 11:44 ` [PATCH 02/20] iommu/amd: Select IOMMU_IOVA for AMD IOMMU Joerg Roedel
2016-07-08 11:44   ` Joerg Roedel
2016-07-08 11:44 ` [PATCH 03/20] iommu/amd: Allocate iova_domain for dma_ops_domain Joerg Roedel
2016-07-08 11:44   ` Joerg Roedel
2016-07-08 11:44 ` [PATCH 04/20] iommu/amd: Create a list of reserved iova addresses Joerg Roedel
2016-07-08 11:44   ` Joerg Roedel
2016-07-08 11:44 ` [PATCH 05/20] iommu/amd: Implement apply_dm_region call-back Joerg Roedel
2016-07-08 11:44   ` Joerg Roedel
2016-07-08 11:44 ` [PATCH 06/20] iommu/amd: Pass gfp-flags to iommu_map_page() Joerg Roedel
2016-07-08 11:44 ` [PATCH 07/20] iommu/amd: Remove special mapping code for dma_ops path Joerg Roedel
2016-07-08 11:44   ` Joerg Roedel
2016-07-12 10:55   ` Robin Murphy
2016-07-12 10:55     ` Robin Murphy
2016-07-12 11:08     ` Joerg Roedel
2016-07-12 11:42       ` Robin Murphy
2016-07-12 11:48         ` Joerg Roedel
2016-07-12 11:48           ` Joerg Roedel
2016-07-08 11:44 ` [PATCH 08/20] iommu/amd: Make use of the generic IOVA allocator Joerg Roedel
2016-07-08 11:44   ` Joerg Roedel
2016-07-08 11:45 ` [PATCH 09/20] iommu/amd: Remove other remains of old address allocator Joerg Roedel
2016-07-08 11:45   ` Joerg Roedel
2016-07-08 11:45 ` [PATCH 10/20] iommu/amd: Remove align-parameter from __map_single() Joerg Roedel
2016-07-08 11:45   ` Joerg Roedel
2016-07-08 11:45 ` [PATCH 11/20] iommu/amd: Set up data structures for flush queue Joerg Roedel
2016-07-08 11:45   ` Joerg Roedel
2016-07-08 11:45 ` [PATCH 12/20] iommu/amd: Allow NULL pointer parameter for domain_flush_complete() Joerg Roedel
2016-07-08 11:45   ` Joerg Roedel
2016-07-08 11:45 ` [PATCH 13/20] iommu/amd: Implement flush queue Joerg Roedel
2016-07-08 11:45   ` Joerg Roedel
2016-07-08 11:45 ` [PATCH 14/20] iommu/amd: Implement timeout to flush unmap queues Joerg Roedel
2016-07-08 11:45   ` Joerg Roedel
2016-07-08 11:45 ` [PATCH 15/20] iommu/amd: Introduce dir2prot() helper Joerg Roedel
2016-07-08 11:45   ` Joerg Roedel
2016-07-08 11:45 ` [PATCH 16/20] iommu/amd: Optimize map_sg and unmap_sg Joerg Roedel
2016-07-12 11:33   ` Robin Murphy
2016-07-12 13:30     ` [PATCH 16/20 v2] " Joerg Roedel
2016-07-12 13:30       ` Joerg Roedel
2016-07-12 15:34       ` Robin Murphy
2016-07-13 10:27         ` Joerg Roedel
2016-07-13 10:27           ` Joerg Roedel
2016-07-08 11:45 ` [PATCH 17/20] iommu/amd: Use dev_data->domain in get_domain() Joerg Roedel
2016-07-08 11:45   ` Joerg Roedel
2016-07-08 11:45 ` [PATCH 18/20] iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back Joerg Roedel
2016-07-08 11:45 ` [PATCH 19/20] iommu/amd: Flush iova queue before releasing dma_ops_domain Joerg Roedel
2016-07-08 11:45 ` [PATCH 20/20] iommu/amd: Use container_of to get dma_ops_domain Joerg Roedel
2016-07-12  9:03 ` [PATCH 00/20] iommu/amd: Use generic IOVA allocator Wan Zongshun
2016-07-12 10:55   ` Joerg Roedel
2016-07-13  9:44     ` Wan Zongshun
2016-07-13  9:44       ` Wan Zongshun
2016-07-13  9:51       ` Joerg Roedel
2016-07-13  9:51         ` Joerg Roedel
