* [PATCH 0/5] Introduce per-domain page sizes
From: Robin Murphy @ 2016-04-07 17:42 UTC
  To: joro@8bytes.org, will.deacon@arm.com
  Cc: laurent.pinchart+renesas@ideasonboard.com, dianders@chromium.org,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org,
	treding@nvidia.com, brian.starkey@arm.com

Hi all,

Since this area seems to be in vogue at the moment, here's what I was
working on when the related patches[1][2] popped up; it turns out to be
more or less the intersection of both. As I recycled some of Will's old
series as a starting point, I've retained the cleanup patches from it
with their original acks - I hope that's OK.

Fortunately, this already looks rather like parts of Joerg's plan[3],
so I hope it's a suitable first step. Below is a quick hacked-up
example of the kind of caller-controlled special use-case alluded to,
using the SMMU/HDLCD combo on Juno. For a 'real' implementation of
this we'd want the group-based domain allocation call, so that the
driver could hand the device to it and get its own non-default DMA
ops domain to play with.

Robin.

[1]: http://thread.gmane.org/gmane.linux.kernel.iommu/12774
[2]: http://thread.gmane.org/gmane.linux.kernel.iommu/12901
[3]: http://article.gmane.org/gmane.linux.kernel.iommu/12937

Robin Murphy (4):
  iommu: of: enforce const-ness of struct iommu_ops
  iommu: Allow selecting page sizes per domain
  iommu/dma: Finish optimising higher-order allocations
  iommu/arm-smmu: Use per-domain page sizes.

Will Deacon (1):
  iommu: remove unused priv field from struct iommu_ops

 arch/arm/include/asm/dma-mapping.h   |  2 +-
 arch/arm/mm/dma-mapping.c            |  6 +++---
 arch/arm64/include/asm/dma-mapping.h |  2 +-
 arch/arm64/mm/dma-mapping.c          |  8 ++++----
 drivers/iommu/arm-smmu-v3.c          | 19 +++++++++---------
 drivers/iommu/arm-smmu.c             | 26 +++++++++++++-----------
 drivers/iommu/dma-iommu.c            | 39 +++++++++++++++++++++++++++---------
 drivers/iommu/iommu.c                | 22 +++++++++++---------
 drivers/iommu/mtk_iommu.c            |  2 +-
 drivers/iommu/of_iommu.c             | 14 ++++++-------
 drivers/of/device.c                  |  2 +-
 drivers/vfio/vfio_iommu_type1.c      |  2 +-
 include/linux/dma-iommu.h            |  4 ++--
 include/linux/dma-mapping.h          |  2 +-
 include/linux/iommu.h                |  5 ++---
 include/linux/of_iommu.h             |  8 ++++----
 16 files changed, 93 insertions(+), 70 deletions(-)

--->8---
diff --git a/drivers/gpu/drm/arm/hdlcd_drv.c b/drivers/gpu/drm/arm/hdlcd_drv.c
index 56b829f..0da0f4b 100644
--- a/drivers/gpu/drm/arm/hdlcd_drv.c
+++ b/drivers/gpu/drm/arm/hdlcd_drv.c
@@ -13,6 +13,7 @@
 #include <linux/spinlock.h>
 #include <linux/clk.h>
 #include <linux/component.h>
+#include <linux/iommu.h>
 #include <linux/list.h>
 #include <linux/of_graph.h>
 #include <linux/of_reserved_mem.h>
@@ -34,6 +35,7 @@ static int hdlcd_load(struct drm_device *drm, unsigned long flags)
 {
 	struct hdlcd_drm_private *hdlcd = drm->dev_private;
 	struct platform_device *pdev = to_platform_device(drm->dev);
+	struct iommu_domain *dom;
 	struct resource *res;
 	u32 version;
 	int ret;
@@ -79,6 +81,21 @@ static int hdlcd_load(struct drm_device *drm, unsigned long flags)
 	if (ret)
 		goto setup_fail;
 
+	/*
+	 * EXAMPLE: Let's say that if we're using an SMMU, we'd rather waste
+	 * a little memory by forcing DMA allocation and mapping to section
+	 * granularity so the whole buffer fits in the TLBs, than waste power
+	 * by having the SMMU constantly walking page tables all the time we're
+	 * scanning out. In this case we know our default domain isn't shared
+	 * with any other devices, so we can cheat and mangle that directly.
+	 */
+	dom = iommu_get_domain_for_dev(drm->dev);
+	if (dom) {
+		dom->pgsize_bitmap &= ~(SZ_1M - 1);
+		if (!dom->pgsize_bitmap)
+			goto setup_fail;
+	}
+
 	ret = hdlcd_setup_crtc(drm);
 	if (ret < 0) {
 		DRM_ERROR("failed to create crtc\n");
-- 
2.7.3.dirty

* [PATCH 1/5] iommu: remove unused priv field from struct iommu_ops
From: Robin Murphy @ 2016-04-07 17:42 UTC
  To: joro@8bytes.org, will.deacon@arm.com
  Cc: laurent.pinchart+renesas@ideasonboard.com, dianders@chromium.org,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org,
	treding@nvidia.com, brian.starkey@arm.com

From: Will Deacon <will.deacon@arm.com>

The priv field of iommu_ops is a hangover from the of_dma_configure
series and isn't actually used. Remove it before it has a chance to
spread.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 include/linux/iommu.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 62a5eae..45b055d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -157,7 +157,6 @@ struct iommu_dm_region {
  * @domain_get_windows: Return the number of windows for a domain
  * @of_xlate: add OF master IDs to iommu grouping
  * @pgsize_bitmap: bitmap of supported page sizes
- * @priv: per-instance data private to the iommu driver
  */
 struct iommu_ops {
 	bool (*capable)(enum iommu_cap);
@@ -199,7 +198,6 @@ struct iommu_ops {
 	int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
 
 	unsigned long pgsize_bitmap;
-	void *priv;
 };
 
 #define IOMMU_GROUP_NOTIFY_ADD_DEVICE		1 /* Device added */
-- 
2.7.3.dirty

* [PATCH 2/5] iommu: of: enforce const-ness of struct iommu_ops
From: Robin Murphy @ 2016-04-07 17:42 UTC
  To: joro@8bytes.org, will.deacon@arm.com
  Cc: laurent.pinchart+renesas@ideasonboard.com, dianders@chromium.org,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org,
	treding@nvidia.com, brian.starkey@arm.com

From: Robin Murphy <Robin.Murphy@arm.com>

As a set of driver-provided callbacks and static data, there is no
compelling reason for struct iommu_ops to be mutable in core code, so
enforce const-ness throughout.
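
As an illustrative sketch of what this permits (hypothetical driver;
the my_iommu_* names are invented for the example, and np is assumed
to be a struct device_node pointer in scope): with const enforced
end-to-end, an ops table can live in .rodata and still pass through
the existing registration path untouched.

	/* Driver-private ops table, now placed in read-only data */
	static const struct iommu_ops my_iommu_ops = {
		.capable	= my_iommu_capable,
		.map		= my_iommu_map,
		.unmap		= my_iommu_unmap,
		.pgsize_bitmap	= SZ_4K | SZ_2M | SZ_1G,
	};

	of_iommu_set_ops(np, &my_iommu_ops);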

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 arch/arm/include/asm/dma-mapping.h   |  2 +-
 arch/arm/mm/dma-mapping.c            |  6 +++---
 arch/arm64/include/asm/dma-mapping.h |  2 +-
 arch/arm64/mm/dma-mapping.c          |  4 ++--
 drivers/iommu/of_iommu.c             | 14 +++++++-------
 drivers/of/device.c                  |  2 +-
 include/linux/dma-mapping.h          |  2 +-
 include/linux/of_iommu.h             |  8 ++++----
 8 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 6ad1ced..02283eb 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -118,7 +118,7 @@ static inline unsigned long dma_max_pfn(struct device *dev)
 
 #define arch_setup_dma_ops arch_setup_dma_ops
 extern void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-			       struct iommu_ops *iommu, bool coherent);
+			       const struct iommu_ops *iommu, bool coherent);
 
 #define arch_teardown_dma_ops arch_teardown_dma_ops
 extern void arch_teardown_dma_ops(struct device *dev);
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index deac58d..617b0cf 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -2214,7 +2214,7 @@ static struct dma_map_ops *arm_get_iommu_dma_map_ops(bool coherent)
 }
 
 static bool arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
-				    struct iommu_ops *iommu)
+				    const struct iommu_ops *iommu)
 {
 	struct dma_iommu_mapping *mapping;
 
@@ -2252,7 +2252,7 @@ static void arm_teardown_iommu_dma_ops(struct device *dev)
 #else
 
 static bool arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
-				    struct iommu_ops *iommu)
+				    const struct iommu_ops *iommu)
 {
 	return false;
 }
@@ -2269,7 +2269,7 @@ static struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
 }
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-			struct iommu_ops *iommu, bool coherent)
+			const struct iommu_ops *iommu, bool coherent)
 {
 	struct dma_map_ops *dma_ops;
 
diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index ba437f0..7dbea6c 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -48,7 +48,7 @@ static inline struct dma_map_ops *get_dma_ops(struct device *dev)
 }
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-			struct iommu_ops *iommu, bool coherent);
+			const struct iommu_ops *iommu, bool coherent);
 #define arch_setup_dma_ops	arch_setup_dma_ops
 
 #ifdef CONFIG_IOMMU_DMA
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index a6e757c..5d36907 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -979,13 +979,13 @@ void arch_teardown_dma_ops(struct device *dev)
 #else
 
 static void __iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-				  struct iommu_ops *iommu)
+				  const struct iommu_ops *iommu)
 { }
 
 #endif  /* CONFIG_IOMMU_DMA */
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-			struct iommu_ops *iommu, bool coherent)
+			const struct iommu_ops *iommu, bool coherent)
 {
 	if (!dev->archdata.dma_ops)
 		dev->archdata.dma_ops = &swiotlb_dma_ops;
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 5fea665..af499ae 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -98,12 +98,12 @@ EXPORT_SYMBOL_GPL(of_get_dma_window);
 struct of_iommu_node {
 	struct list_head list;
 	struct device_node *np;
-	struct iommu_ops *ops;
+	const struct iommu_ops *ops;
 };
 static LIST_HEAD(of_iommu_list);
 static DEFINE_SPINLOCK(of_iommu_lock);
 
-void of_iommu_set_ops(struct device_node *np, struct iommu_ops *ops)
+void of_iommu_set_ops(struct device_node *np, const struct iommu_ops *ops)
 {
 	struct of_iommu_node *iommu = kzalloc(sizeof(*iommu), GFP_KERNEL);
 
@@ -119,10 +119,10 @@ void of_iommu_set_ops(struct device_node *np, struct iommu_ops *ops)
 	spin_unlock(&of_iommu_lock);
 }
 
-struct iommu_ops *of_iommu_get_ops(struct device_node *np)
+const struct iommu_ops *of_iommu_get_ops(struct device_node *np)
 {
 	struct of_iommu_node *node;
-	struct iommu_ops *ops = NULL;
+	const struct iommu_ops *ops = NULL;
 
 	spin_lock(&of_iommu_lock);
 	list_for_each_entry(node, &of_iommu_list, list)
@@ -134,12 +134,12 @@ struct iommu_ops *of_iommu_get_ops(struct device_node *np)
 	return ops;
 }
 
-struct iommu_ops *of_iommu_configure(struct device *dev,
-				     struct device_node *master_np)
+const struct iommu_ops *of_iommu_configure(struct device *dev,
+					   struct device_node *master_np)
 {
 	struct of_phandle_args iommu_spec;
 	struct device_node *np;
-	struct iommu_ops *ops = NULL;
+	const struct iommu_ops *ops = NULL;
 	int idx = 0;
 
 	/*
diff --git a/drivers/of/device.c b/drivers/of/device.c
index e5f47ce..fd5cfad 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -88,7 +88,7 @@ void of_dma_configure(struct device *dev, struct device_node *np)
 	int ret;
 	bool coherent;
 	unsigned long offset;
-	struct iommu_ops *iommu;
+	const struct iommu_ops *iommu;
 
 	/*
 	 * Set default coherent_dma_mask to 32 bit.  Drivers are expected to
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 9ea9aba..71c1b21 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -514,7 +514,7 @@ extern u64 dma_get_required_mask(struct device *dev);
 
 #ifndef arch_setup_dma_ops
 static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base,
-				      u64 size, struct iommu_ops *iommu,
+				      u64 size, const struct iommu_ops *iommu,
 				      bool coherent) { }
 #endif
 
diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
index ffbe470..bd02b44 100644
--- a/include/linux/of_iommu.h
+++ b/include/linux/of_iommu.h
@@ -12,7 +12,7 @@ extern int of_get_dma_window(struct device_node *dn, const char *prefix,
 			     size_t *size);
 
 extern void of_iommu_init(void);
-extern struct iommu_ops *of_iommu_configure(struct device *dev,
+extern const struct iommu_ops *of_iommu_configure(struct device *dev,
 					struct device_node *master_np);
 
 #else
@@ -25,7 +25,7 @@ static inline int of_get_dma_window(struct device_node *dn, const char *prefix,
 }
 
 static inline void of_iommu_init(void) { }
-static inline struct iommu_ops *of_iommu_configure(struct device *dev,
+static inline const struct iommu_ops *of_iommu_configure(struct device *dev,
 					 struct device_node *master_np)
 {
 	return NULL;
@@ -33,8 +33,8 @@ static inline struct iommu_ops *of_iommu_configure(struct device *dev,
 
 #endif	/* CONFIG_OF_IOMMU */
 
-void of_iommu_set_ops(struct device_node *np, struct iommu_ops *ops);
-struct iommu_ops *of_iommu_get_ops(struct device_node *np);
+void of_iommu_set_ops(struct device_node *np, const struct iommu_ops *ops);
+const struct iommu_ops *of_iommu_get_ops(struct device_node *np);
 
 extern struct of_device_id __iommu_of_table;
 
-- 
2.7.3.dirty

* [PATCH 3/5] iommu: Allow selecting page sizes per domain
From: Robin Murphy @ 2016-04-07 17:42 UTC
  To: joro@8bytes.org, will.deacon@arm.com
  Cc: laurent.pinchart+renesas@ideasonboard.com, dianders@chromium.org,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org,
	treding@nvidia.com, brian.starkey@arm.com

Many IOMMUs support multiple page table formats, meaning that any given
domain may only support a subset of the hardware page sizes presented in
iommu_ops->pgsize_bitmap. There are also certain use-cases where the
creator of a domain may want to control which page sizes are used, for
example to force the use of hugepage mappings to reduce pagetable walk
depth.

To this end, add a per-domain pgsize_bitmap to represent the subset of
page sizes actually in use, to make it possible for domains with
different requirements to coexist.
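
A hedged sketch of the intended usage (mirroring the HDLCD hack in the
cover letter; error handling omitted): a domain's creator can now mask
the bitmap before mapping anything, without affecting other domains.

	struct iommu_domain *dom = iommu_get_domain_for_dev(dev);

	/* Drop all sizes below 2MB, provided a larger one survives */
	if (dom && (dom->pgsize_bitmap & ~(SZ_2M - 1)))
		dom->pgsize_bitmap &= ~(SZ_2M - 1);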

Signed-off-by: Will Deacon <will.deacon@arm.com>
[rm: hijacked and rebased original patch with new commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c       |  2 +-
 drivers/iommu/iommu.c           | 22 ++++++++++++----------
 drivers/iommu/mtk_iommu.c       |  2 +-
 drivers/vfio/vfio_iommu_type1.c |  2 +-
 include/linux/iommu.h           |  3 ++-
 5 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 58f2fe6..6edc852 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -94,7 +94,7 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size
 		return -ENODEV;
 
 	/* Use the smallest supported page size for IOVA granularity */
-	order = __ffs(domain->ops->pgsize_bitmap);
+	order = __ffs(domain->pgsize_bitmap);
 	base_pfn = max_t(unsigned long, 1, base >> order);
 	end_pfn = (base + size - 1) >> order;
 
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b9df141..ab4d014 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -337,9 +337,9 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
 	if (!domain || domain->type != IOMMU_DOMAIN_DMA)
 		return 0;
 
-	BUG_ON(!domain->ops->pgsize_bitmap);
+	BUG_ON(!domain->pgsize_bitmap);
 
-	pg_size = 1UL << __ffs(domain->ops->pgsize_bitmap);
+	pg_size = 1UL << __ffs(domain->pgsize_bitmap);
 	INIT_LIST_HEAD(&mappings);
 
 	iommu_get_dm_regions(dev, &mappings);
@@ -1073,6 +1073,8 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
 
 	domain->ops  = bus->iommu_ops;
 	domain->type = type;
+	/* Assume all sizes by default; the driver may override this later */
+	domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
 
 	return domain;
 }
@@ -1297,7 +1299,7 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
 	pgsize = (1UL << (pgsize_idx + 1)) - 1;
 
 	/* throw away page sizes not supported by the hardware */
-	pgsize &= domain->ops->pgsize_bitmap;
+	pgsize &= domain->pgsize_bitmap;
 
 	/* make sure we're still sane */
 	BUG_ON(!pgsize);
@@ -1319,14 +1321,14 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
 	int ret = 0;
 
 	if (unlikely(domain->ops->map == NULL ||
-		     domain->ops->pgsize_bitmap == 0UL))
+		     domain->pgsize_bitmap == 0UL))
 		return -ENODEV;
 
 	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
 		return -EINVAL;
 
 	/* find out the minimum page size supported */
-	min_pagesz = 1 << __ffs(domain->ops->pgsize_bitmap);
+	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
 
 	/*
 	 * both the virtual address and the physical one, as well as
@@ -1373,14 +1375,14 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
 	unsigned long orig_iova = iova;
 
 	if (unlikely(domain->ops->unmap == NULL ||
-		     domain->ops->pgsize_bitmap == 0UL))
+		     domain->pgsize_bitmap == 0UL))
 		return -ENODEV;
 
 	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
 		return -EINVAL;
 
 	/* find out the minimum page size supported */
-	min_pagesz = 1 << __ffs(domain->ops->pgsize_bitmap);
+	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
 
 	/*
 	 * The virtual address, as well as the size of the mapping, must be
@@ -1426,10 +1428,10 @@ size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 	unsigned int i, min_pagesz;
 	int ret;
 
-	if (unlikely(domain->ops->pgsize_bitmap == 0UL))
+	if (unlikely(domain->pgsize_bitmap == 0UL))
 		return 0;
 
-	min_pagesz = 1 << __ffs(domain->ops->pgsize_bitmap);
+	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
 
 	for_each_sg(sg, s, nents, i) {
 		phys_addr_t phys = page_to_phys(sg_page(s)) + s->offset;
@@ -1510,7 +1512,7 @@ int iommu_domain_get_attr(struct iommu_domain *domain,
 		break;
 	case DOMAIN_ATTR_PAGING:
 		paging  = data;
-		*paging = (domain->ops->pgsize_bitmap != 0UL);
+		*paging = (domain->pgsize_bitmap != 0UL);
 		break;
 	case DOMAIN_ATTR_WINDOWS:
 		count = data;
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index db74553..c3043d8 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -269,7 +269,7 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_data *data)
 	}
 
 	/* Update our support page sizes bitmap */
-	mtk_iommu_ops.pgsize_bitmap = dom->cfg.pgsize_bitmap;
+	dom->domain.pgsize_bitmap = dom->cfg.pgsize_bitmap;
 
 	writel(data->m4u_dom->cfg.arm_v7s_cfg.ttbr[0],
 	       data->base + REG_MMU_PT_BASE_ADDR);
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 75b24e9..15a6582 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -407,7 +407,7 @@ static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu)
 
 	mutex_lock(&iommu->lock);
 	list_for_each_entry(domain, &iommu->domain_list, next)
-		bitmap &= domain->domain->ops->pgsize_bitmap;
+		bitmap &= domain->domain->pgsize_bitmap;
 	mutex_unlock(&iommu->lock);
 
 	/*
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 45b055d..664683a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -79,6 +79,7 @@ struct iommu_domain_geometry {
 struct iommu_domain {
 	unsigned type;
 	const struct iommu_ops *ops;
+	unsigned long pgsize_bitmap;	/* Bitmap of page sizes in use */
 	iommu_fault_handler_t handler;
 	void *handler_token;
 	struct iommu_domain_geometry geometry;
@@ -156,7 +157,7 @@ struct iommu_dm_region {
  * @domain_set_windows: Set the number of windows for a domain
  * @domain_get_windows: Return the number of windows for a domain
  * @of_xlate: add OF master IDs to iommu grouping
- * @pgsize_bitmap: bitmap of supported page sizes
+ * @pgsize_bitmap: bitmap of all possible supported page sizes
  */
 struct iommu_ops {
 	bool (*capable)(enum iommu_cap);
-- 
2.7.3.dirty

* [PATCH 4/5] iommu/dma: Finish optimising higher-order allocations
From: Robin Murphy @ 2016-04-07 17:42 UTC
  To: joro@8bytes.org, will.deacon@arm.com
  Cc: laurent.pinchart+renesas@ideasonboard.com, dianders@chromium.org,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org,
	treding@nvidia.com, brian.starkey@arm.com

Now that we know exactly which page sizes our caller wants to use in the
given domain, we can restrict higher-order allocation attempts to just
those sizes, if any, and avoid wasting any time or effort on other sizes
which offer no benefit. In the same vein, this also lets us accommodate
a minimum order greater than 0 for special cases.
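
As a worked example (assuming PAGE_SHIFT == 12): a domain bitmap of
SZ_4K | SZ_2M shifts down to an order bitmap with only bits 0 and 9
set, so the allocator attempts order-9 (2MB) blocks and then falls
straight back to order-0 pages, skipping orders 1-8 entirely.

	unsigned long pgsizes = SZ_4K | SZ_2M;		/* 0x201000 */
	unsigned long orders = pgsizes >> PAGE_SHIFT;	/* 0x201    */

	/* __fls(orders) == 9 -> try 2MB; __ffs(orders) == 0 -> 4KB */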

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 arch/arm64/mm/dma-mapping.c |  4 ++--
 drivers/iommu/dma-iommu.c   | 37 ++++++++++++++++++++++++++++---------
 include/linux/dma-iommu.h   |  4 ++--
 3 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 5d36907..41d19a0 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -562,8 +562,8 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 		struct page **pages;
 		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
 
-		pages = iommu_dma_alloc(dev, iosize, gfp, ioprot, handle,
-					flush_page);
+		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
+					handle, flush_page);
 		if (!pages)
 			return NULL;
 
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 6edc852..6dc8dfc 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -190,11 +190,16 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
 	kvfree(pages);
 }
 
-static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
+static struct page **__iommu_dma_alloc_pages(unsigned int count,
+		unsigned long pgsize_orders, gfp_t gfp)
 {
 	struct page **pages;
 	unsigned int i = 0, array_size = count * sizeof(*pages);
-	unsigned int order = MAX_ORDER;
+	unsigned int min_order = __ffs(pgsize_orders);
+
+	pgsize_orders &= (2U << MAX_ORDER) - 1;
+	if (!pgsize_orders)
+		return NULL;
 
 	if (array_size <= PAGE_SIZE)
 		pages = kzalloc(array_size, GFP_KERNEL);
@@ -208,6 +213,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 
 	while (count) {
 		struct page *page = NULL;
+		unsigned int order;
 		int j;
 
 		/*
@@ -215,8 +221,9 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 		 * than a necessity, hence using __GFP_NORETRY until
 		 * falling back to single-page allocations.
 		 */
-		for (order = min_t(unsigned int, order, __fls(count));
-		     order > 0; order--) {
+		for (pgsize_orders &= (2U << __fls(count)) - 1;
+		     (order = __fls(pgsize_orders)) > min_order;
+		     pgsize_orders &= (1U << order) - 1) {
 			page = alloc_pages(gfp | __GFP_NORETRY, order);
 			if (!page)
 				continue;
@@ -230,7 +237,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 			}
 		}
 		if (!page)
-			page = alloc_page(gfp);
+			page = alloc_pages(gfp, order);
 		if (!page) {
 			__iommu_dma_free_pages(pages, i);
 			return NULL;
@@ -267,6 +274,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  *	 attached to an iommu_dma_domain
  * @size: Size of buffer in bytes
  * @gfp: Allocation flags
+ * @attrs: DMA attributes for this allocation
  * @prot: IOMMU mapping flags
  * @handle: Out argument for allocated DMA handle
  * @flush_page: Arch callback which must ensure PAGE_SIZE bytes from the
@@ -278,8 +286,8 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * Return: Array of struct page pointers describing the buffer,
  *	   or NULL on failure.
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, int prot, dma_addr_t *handle,
+struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
+		struct dma_attrs *attrs, int prot, dma_addr_t *handle,
 		void (*flush_page)(struct device *, const void *, phys_addr_t))
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
@@ -288,11 +296,22 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size,
 	struct page **pages;
 	struct sg_table sgt;
 	dma_addr_t dma_addr;
-	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	unsigned int count, min_pgsize, pgsizes = domain->pgsize_bitmap;
 
 	*handle = DMA_ERROR_CODE;
 
-	pages = __iommu_dma_alloc_pages(count, gfp);
+	if (pgsizes & (PAGE_SIZE - 1)) {
+		pgsizes &= PAGE_MASK;
+		pgsizes |= PAGE_SIZE;
+	}
+
+	min_pgsize = pgsizes ^ (pgsizes & (pgsizes - 1));
+	if (dma_get_attr(DMA_ATTR_ALLOC_SINGLE_PAGES, attrs))
+		pgsizes = min_pgsize;
+
+	size = ALIGN(size, min_pgsize);
+	count = size >> PAGE_SHIFT;
+	pages = __iommu_dma_alloc_pages(count, pgsizes >> PAGE_SHIFT, gfp);
 	if (!pages)
 		return NULL;
 
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index fc48103..8443bbb 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -38,8 +38,8 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent);
  * These implement the bulk of the relevant DMA mapping callbacks, but require
  * the arch code to take care of attributes and cache maintenance
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, int prot, dma_addr_t *handle,
+struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
+		struct dma_attrs *attrs, int prot, dma_addr_t *handle,
 		void (*flush_page)(struct device *, const void *, phys_addr_t));
 void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
 		dma_addr_t *handle);
-- 
2.7.3.dirty

* [PATCH 5/5] iommu/arm-smmu: Use per-domain page sizes.
From: Robin Murphy @ 2016-04-07 17:42 UTC
  To: joro@8bytes.org, will.deacon@arm.com
  Cc: laurent.pinchart+renesas@ideasonboard.com, dianders@chromium.org,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org,
	treding@nvidia.com, brian.starkey@arm.com

Now that we can accurately reflect the context format we choose for each
domain, do that instead of imposing the global lowest-common-denominator
restriction and potentially ending up with nothing. We currently have a
strict 1:1 correspondence between domains and context banks, so we don't
need to entertain the possibility of multiple formats _within_ a domain.
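
Note the sentinel in the probe hunks below: arm_smmu_ops.pgsize_bitmap
is assumed to start life as -1UL (set in the ops definition elsewhere
in this patch), so the first instance probed replaces it outright and
later instances OR their sizes in, leaving the ops-level bitmap as the
union that new domains inherit by default. In condensed form:

	if (arm_smmu_ops.pgsize_bitmap == -1UL)		/* first SMMU */
		arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap;
	else						/* subsequent */
		arm_smmu_ops.pgsize_bitmap |= smmu->pgsize_bitmap;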

Signed-off-by: Will Deacon <will.deacon@arm.com>
[rm: split from original patch, added SMMUv3]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/arm-smmu-v3.c | 19 ++++++++++---------
 drivers/iommu/arm-smmu.c    | 26 ++++++++++++++------------
 2 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4ff73ff..ebab33e 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -590,6 +590,7 @@ struct arm_smmu_device {
 
 	unsigned long			ias; /* IPA */
 	unsigned long			oas; /* PA */
+	unsigned long			pgsize_bitmap;
 
 #define ARM_SMMU_MAX_ASIDS		(1 << 16)
 	unsigned int			asid_bits;
@@ -1516,8 +1517,6 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
 	return 0;
 }
 
-static struct iommu_ops arm_smmu_ops;
-
 static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 {
 	int ret;
@@ -1555,7 +1554,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 	}
 
 	pgtbl_cfg = (struct io_pgtable_cfg) {
-		.pgsize_bitmap	= arm_smmu_ops.pgsize_bitmap,
+		.pgsize_bitmap	= smmu->pgsize_bitmap,
 		.ias		= ias,
 		.oas		= oas,
 		.tlb		= &arm_smmu_gather_ops,
@@ -1566,7 +1565,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 	if (!pgtbl_ops)
 		return -ENOMEM;
 
-	arm_smmu_ops.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
 	smmu_domain->pgtbl_ops = pgtbl_ops;
 
 	ret = finalise_stage_fn(smmu_domain, &pgtbl_cfg);
@@ -2410,7 +2409,6 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
 {
 	u32 reg;
 	bool coherent;
-	unsigned long pgsize_bitmap = 0;
 
 	/* IDR0 */
 	reg = readl_relaxed(smmu->base + ARM_SMMU_IDR0);
@@ -2541,13 +2539,16 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
 
 	/* Page sizes */
 	if (reg & IDR5_GRAN64K)
-		pgsize_bitmap |= SZ_64K | SZ_512M;
+		smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
 	if (reg & IDR5_GRAN16K)
-		pgsize_bitmap |= SZ_16K | SZ_32M;
+		smmu->pgsize_bitmap |= SZ_16K | SZ_32M;
 	if (reg & IDR5_GRAN4K)
-		pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;
+		smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;
 
-	arm_smmu_ops.pgsize_bitmap &= pgsize_bitmap;
+	if (arm_smmu_ops.pgsize_bitmap == -1UL)
+		arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap;
+	else
+		arm_smmu_ops.pgsize_bitmap |= smmu->pgsize_bitmap;
 
 	/* Output address size */
 	switch (reg & IDR5_OAS_MASK << IDR5_OAS_SHIFT) {
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 2409e3b..e9535f0 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -322,6 +322,7 @@ struct arm_smmu_device {
 	unsigned long			va_size;
 	unsigned long			ipa_size;
 	unsigned long			pa_size;
+	unsigned long			pgsize_bitmap;
 
 	u32				num_global_irqs;
 	u32				num_context_irqs;
@@ -357,8 +358,6 @@ struct arm_smmu_domain {
 	struct iommu_domain		domain;
 };
 
-static struct iommu_ops arm_smmu_ops;
-
 static DEFINE_SPINLOCK(arm_smmu_devices_lock);
 static LIST_HEAD(arm_smmu_devices);
 
@@ -894,7 +893,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	}
 
 	pgtbl_cfg = (struct io_pgtable_cfg) {
-		.pgsize_bitmap	= arm_smmu_ops.pgsize_bitmap,
+		.pgsize_bitmap	= smmu->pgsize_bitmap,
 		.ias		= ias,
 		.oas		= oas,
 		.tlb		= &arm_smmu_gather_ops,
@@ -908,8 +907,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		goto out_clear_smmu;
 	}
 
-	/* Update our support page sizes to reflect the page table format */
-	arm_smmu_ops.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	/* Update the domain's page sizes to reflect the page table format */
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
 
 	/* Initialise the context bank with our page table cfg */
 	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
@@ -1690,24 +1689,27 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 
 	if (smmu->version == ARM_SMMU_V1) {
 		smmu->va_size = smmu->ipa_size;
-		size = SZ_4K | SZ_2M | SZ_1G;
+		smmu->pgsize_bitmap = SZ_4K | SZ_2M | SZ_1G;
 	} else {
 		size = (id >> ID2_UBS_SHIFT) & ID2_UBS_MASK;
 		smmu->va_size = arm_smmu_id_size_to_bits(size);
 #ifndef CONFIG_64BIT
 		smmu->va_size = min(32UL, smmu->va_size);
 #endif
-		size = 0;
 		if (id & ID2_PTFS_4K)
-			size |= SZ_4K | SZ_2M | SZ_1G;
+			smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;
 		if (id & ID2_PTFS_16K)
-			size |= SZ_16K | SZ_32M;
+			smmu->pgsize_bitmap |= SZ_16K | SZ_32M;
 		if (id & ID2_PTFS_64K)
-			size |= SZ_64K | SZ_512M;
+			smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
 	}
 
-	arm_smmu_ops.pgsize_bitmap &= size;
-	dev_notice(smmu->dev, "\tSupported page sizes: 0x%08lx\n", size);
+	if (arm_smmu_ops.pgsize_bitmap == -1UL)
+		arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap;
+	else
+		arm_smmu_ops.pgsize_bitmap |= smmu->pgsize_bitmap;
+	dev_notice(smmu->dev, "\tSupported page sizes: 0x%08lx\n",
+		   smmu->pgsize_bitmap);
 
 	if (smmu->features & ARM_SMMU_FEAT_TRANS_S1)
 		dev_notice(smmu->dev, "\tStage-1: %lu-bit VA -> %lu-bit IPA\n",
-- 
2.7.3.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread
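
An aside on where the per-domain bitmap pays off: the map path picks each
mapping's page size from the bitmap of the domain it is actually operating
on. Below is a minimal editorial sketch of that selection using the
kernel's bit helpers (GENMASK, __fls, __ffs); the helper name and exact
shape are illustrative, not the core code's real iommu_pgsize().

/*
 * Editorial sketch: choose the largest page size the domain's bitmap
 * allows for this (iova, paddr, size) triple. Assumes the CPU page
 * size is always present in pgsize_bitmap, so 'fit' is never zero.
 */
static size_t pick_pgsize(struct iommu_domain *domain,
			  unsigned long iova, phys_addr_t paddr, size_t size)
{
	/* only sizes no larger than the remaining length */
	unsigned long fit = domain->pgsize_bitmap & GENMASK(__fls(size), 0);

	/* only sizes compatible with the alignment of both addresses */
	if (iova | paddr)
		fit &= GENMASK(__ffs(iova | paddr), 0);

	return 1UL << __fls(fit);
}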

* Re: [PATCH 4/5] iommu/dma: Finish optimising higher-order allocations
  2016-04-07 17:42     ` Robin Murphy
@ 2016-04-08  5:32         ` Yong Wu
  -1 siblings, 0 replies; 36+ messages in thread
From: Yong Wu @ 2016-04-08  5:32 UTC (permalink / raw)
  To: Robin Murphy
  Cc: laurent.pinchart+renesas-ryLnwIuWjnjg/C1BVhZhaw,
	will.deacon-5wv7dgnIgG8, dianders-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	treding-DDmLM1+adcrQT0dZR+AlfA, brian.starkey-5wv7dgnIgG8,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Thu, 2016-04-07 at 18:42 +0100, Robin Murphy wrote:
>  		/*
> @@ -215,8 +221,9 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>  		 * than a necessity, hence using __GFP_NORETRY until
>  		 * falling back to single-page allocations.
>  		 */
> -		for (order = min_t(unsigned int, order, __fls(count));
> -		     order > 0; order--) {
> +		for (pgsize_orders &= (2U << __fls(count)) - 1;
> +		     (order = __fls(pgsize_orders)) > min_order;
> +		     pgsize_orders &= (1U << order) - 1) {
>  			page = alloc_pages(gfp | __GFP_NORETRY, order);
>  			if (!page)
>  				continue;
> @@ -230,7 +237,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>  			}
>  		}
>  		if (!page)
> -			page = alloc_page(gfp);
> +			page = alloc_pages(gfp, order);

A small question: do we need to split it too if order != 0 here?


>  		if (!page) {
>  			__iommu_dma_free_pages(pages, i);
>  			return NULL;
[...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/5] iommu/dma: Finish optimising higher-order allocations
  2016-04-08  5:32         ` Yong Wu
@ 2016-04-08 16:33           ` Robin Murphy
  -1 siblings, 0 replies; 36+ messages in thread
From: Robin Murphy @ 2016-04-08 16:33 UTC (permalink / raw)
  To: Yong Wu
  Cc: laurent.pinchart+renesas-ryLnwIuWjnjg/C1BVhZhaw,
	will.deacon-5wv7dgnIgG8, dianders-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	treding-DDmLM1+adcrQT0dZR+AlfA, brian.starkey-5wv7dgnIgG8,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On 08/04/16 06:32, Yong Wu wrote:
> On Thu, 2016-04-07 at 18:42 +0100, Robin Murphy wrote:
>>   		/*
>> @@ -215,8 +221,9 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>>   		 * than a necessity, hence using __GFP_NORETRY until
>>   		 * falling back to single-page allocations.
>>   		 */
>> -		for (order = min_t(unsigned int, order, __fls(count));
>> -		     order > 0; order--) {
>> +		for (pgsize_orders &= (2U << __fls(count)) - 1;
>> +		     (order = __fls(pgsize_orders)) > min_order;
>> +		     pgsize_orders &= (1U << order) - 1) {
>>   			page = alloc_pages(gfp | __GFP_NORETRY, order);
>>   			if (!page)
>>   				continue;
>> @@ -230,7 +237,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>>   			}
>>   		}
>>   		if (!page)
>> -			page = alloc_page(gfp);
>> +			page = alloc_pages(gfp, order);
>
> A small question: Do we need split it too if order != 0 here?

Ah, good point, somehow I missed that. It didn't stop my framebuffer 
console working kernel-side, but indeed I can't mmap it due to the 
un-split pages. I'll take that as an excuse to have a go at refactoring 
the whole thing to maybe not reach 5 levels of indentation.

Thanks,
Robin.

>
>
>>   		if (!page) {
>>   			__iommu_dma_free_pages(pages, i);
>>   			return NULL;
> [...]
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>

^ permalink raw reply	[flat|nested] 36+ messages in thread
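
To spell out the mmap failure Robin mentions: the CPU-side path hands the
buffer to userspace one order-0 page at a time, and vm_insert_page() will
refuse the tail pages of an unsplit higher-order allocation. A rough,
illustrative sketch of such a consumer (the real arch code differs in
detail):

/*
 * Editorial sketch: map an array of order-0 pages into a VMA. This is
 * why __iommu_dma_alloc_pages() must split what it allocates - each
 * element of pages[] needs its own refcount for vm_insert_page().
 */
static int sketch_mmap(struct vm_area_struct *vma, struct page **pages,
		       unsigned long count)
{
	unsigned long uaddr = vma->vm_start;
	unsigned long i;
	int ret = 0;

	for (i = 0; i < count && uaddr < vma->vm_end; i++) {
		ret = vm_insert_page(vma, uaddr, pages[i]);
		if (ret)
			break;
		uaddr += PAGE_SIZE;
	}
	return ret;
}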

* [PATCH v2] iommu/dma: Finish optimising higher-order allocations
  2016-04-07 17:42     ` Robin Murphy
@ 2016-04-13 16:29         ` Robin Murphy
  -1 siblings, 0 replies; 36+ messages in thread
From: Robin Murphy @ 2016-04-13 16:29 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	dianders-F7+t8E8rja9g9hUCZPvPmw

Now that we know exactly which page sizes our caller wants to use in the
given domain, we can restrict higher-order allocation attempts to just
those sizes, if any, and avoid wasting any time or effort on other sizes
which offer no benefit. In the same vein, this also lets us accommodate
a minimum order greater than 0 for special cases.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---

Just throwing this out as a quick solo update as I'm still expecting
discussion on the rest of the series.

Since v1:
- Rearrange things so that all allocations happen within the loop,
  and always get split as appropriate.
- Rename a bunch of things to make more sense.
- Got rid of a fair amount of the gratuitous bit-twiddling.

 arch/arm64/mm/dma-mapping.c |  4 +--
 drivers/iommu/dma-iommu.c   | 60 +++++++++++++++++++++++++++++----------------
 include/linux/dma-iommu.h   |  4 +--
 3 files changed, 43 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 5d36907..41d19a0 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -562,8 +562,8 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 		struct page **pages;
 		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
 
-		pages = iommu_dma_alloc(dev, iosize, gfp, ioprot, handle,
-					flush_page);
+		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
+					handle, flush_page);
 		if (!pages)
 			return NULL;
 
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 6edc852..c67713f 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -190,11 +190,15 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
 	kvfree(pages);
 }
 
-static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
+static struct page **__iommu_dma_alloc_pages(unsigned int count,
+		unsigned long order_mask, gfp_t gfp)
 {
 	struct page **pages;
 	unsigned int i = 0, array_size = count * sizeof(*pages);
-	unsigned int order = MAX_ORDER;
+
+	order_mask &= (2U << MAX_ORDER) - 1;
+	if (!order_mask)
+		return NULL;
 
 	if (array_size <= PAGE_SIZE)
 		pages = kzalloc(array_size, GFP_KERNEL);
@@ -208,36 +212,38 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 
 	while (count) {
 		struct page *page = NULL;
-		int j;
+		unsigned int order_size;
 
 		/*
 		 * Higher-order allocations are a convenience rather
 		 * than a necessity, hence using __GFP_NORETRY until
-		 * falling back to single-page allocations.
+		 * falling back to minimum-order allocations.
 		 */
-		for (order = min_t(unsigned int, order, __fls(count));
-		     order > 0; order--) {
-			page = alloc_pages(gfp | __GFP_NORETRY, order);
+		for (order_mask &= (2U << __fls(count)) - 1;
+		     order_mask; order_mask &= ~order_size) {
+			unsigned int order = __fls(order_mask);
+
+			order_size = 1U << order;
+			page = alloc_pages((order_mask - order_size) ?
+					   gfp | __GFP_NORETRY : gfp, order);
 			if (!page)
 				continue;
-			if (PageCompound(page)) {
-				if (!split_huge_page(page))
-					break;
-				__free_pages(page, order);
-			} else {
+			if (!order)
+				break;
+			if (!PageCompound(page)) {
 				split_page(page, order);
 				break;
+			} else if (!split_huge_page(page)) {
+				break;
 			}
+			__free_pages(page, order);
 		}
-		if (!page)
-			page = alloc_page(gfp);
 		if (!page) {
 			__iommu_dma_free_pages(pages, i);
 			return NULL;
 		}
-		j = 1 << order;
-		count -= j;
-		while (j--)
+		count -= order_size;
+		while (order_size--)
 			pages[i++] = page++;
 	}
 	return pages;
@@ -267,6 +273,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  *	 attached to an iommu_dma_domain
  * @size: Size of buffer in bytes
  * @gfp: Allocation flags
+ * @attrs: DMA attributes for this allocation
  * @prot: IOMMU mapping flags
  * @handle: Out argument for allocated DMA handle
  * @flush_page: Arch callback which must ensure PAGE_SIZE bytes from the
@@ -278,8 +285,8 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * Return: Array of struct page pointers describing the buffer,
  *	   or NULL on failure.
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, int prot, dma_addr_t *handle,
+struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
+		struct dma_attrs *attrs, int prot, dma_addr_t *handle,
 		void (*flush_page)(struct device *, const void *, phys_addr_t))
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
@@ -288,11 +295,22 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size,
 	struct page **pages;
 	struct sg_table sgt;
 	dma_addr_t dma_addr;
-	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
 
 	*handle = DMA_ERROR_CODE;
 
-	pages = __iommu_dma_alloc_pages(count, gfp);
+	min_size = alloc_sizes & -alloc_sizes;
+	if (min_size < PAGE_SIZE) {
+		min_size = PAGE_SIZE;
+		alloc_sizes |= PAGE_SIZE;
+	} else {
+		size = ALIGN(size, min_size);
+	}
+	if (dma_get_attr(DMA_ATTR_ALLOC_SINGLE_PAGES, attrs))
+		alloc_sizes = min_size;
+
+	count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	pages = __iommu_dma_alloc_pages(count, alloc_sizes >> PAGE_SHIFT, gfp);
 	if (!pages)
 		return NULL;
 
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index fc48103..8443bbb 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -38,8 +38,8 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent);
  * These implement the bulk of the relevant DMA mapping callbacks, but require
  * the arch code to take care of attributes and cache maintenance
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, int prot, dma_addr_t *handle,
+struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
+		struct dma_attrs *attrs, int prot, dma_addr_t *handle,
 		void (*flush_page)(struct device *, const void *, phys_addr_t));
 void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
 		dma_addr_t *handle);
-- 
2.7.3.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread
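
To make the mask arithmetic above concrete, here is a small standalone
walkthrough - an editorial userspace sketch, not kernel code - for a
hypothetical domain supporting only 64K and 512M pages on a 4K-page
kernel. (Passing DMA_ATTR_ALLOC_SINGLE_PAGES would instead collapse
alloc_sizes down to min_size before any of this.)

#include <stdio.h>

#define SZ_64K		0x00010000UL
#define SZ_512M		0x20000000UL
#define PAGE_SHIFT	12

int main(void)
{
	unsigned long alloc_sizes = SZ_64K | SZ_512M;
	unsigned long min_size = alloc_sizes & -alloc_sizes;	/* lowest bit: 64K */
	unsigned long size = 100 * 1024;			/* 100K request */

	/* min_size > PAGE_SIZE, so the request is rounded up to it */
	size = (size + min_size - 1) & ~(min_size - 1);		/* ALIGN -> 128K */

	/* orders the allocator may try: bit 4 (64K) and bit 17 (512M) */
	unsigned long order_mask = alloc_sizes >> PAGE_SHIFT;

	/*
	 * count = 32 4K pages, so the loop above clamps order_mask to
	 * orders <= 5, tries order 4 (64K) twice, and never sees 512M.
	 */
	printf("min_size=%#lx size=%#lx order_mask=%#lx\n",
	       min_size, size, order_mask);
	return 0;
}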

* Re: [PATCH v2] iommu/dma: Finish optimising higher-order allocations
  2016-04-13 16:29         ` Robin Murphy
@ 2016-04-21  5:47             ` Yong Wu
  -1 siblings, 0 replies; 36+ messages in thread
From: Yong Wu @ 2016-04-21  5:47 UTC (permalink / raw)
  To: Robin Murphy
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w,
	joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	dianders-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Wed, 2016-04-13 at 17:29 +0100, Robin Murphy wrote:
> Now that we know exactly which page sizes our caller wants to use in the
> given domain, we can restrict higher-order allocation attempts to just
> those sizes, if any, and avoid wasting any time or effort on other sizes
> which offer no benefit. In the same vein, this also lets us accommodate
> a minimum order greater than 0 for special cases.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>

Hi Robin,

    Thanks very much for this patch. It works well on our MT8173.

    Tested-by: Yong Wu <yong.wu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org> 

> ---
> 
> Just throwing this out as a quick solo update as I'm still expecting
> discussion on the rest of the series.
> 
[...]
>  	while (count) {
>  		struct page *page = NULL;
> -		int j;
> +		unsigned int order_size;
>  
>  		/*
>  		 * Higher-order allocations are a convenience rather
>  		 * than a necessity, hence using __GFP_NORETRY until
> -		 * falling back to single-page allocations.
> +		 * falling back to minimum-order allocations.
>  		 */
> -		for (order = min_t(unsigned int, order, __fls(count));
> -		     order > 0; order--) {
> -			page = alloc_pages(gfp | __GFP_NORETRY, order);
> +		for (order_mask &= (2U << __fls(count)) - 1;
> +		     order_mask; order_mask &= ~order_size) {
> +			unsigned int order = __fls(order_mask);
> +
> +			order_size = 1U << order;
> +			page = alloc_pages((order_mask - order_size) ?
> +					   gfp | __GFP_NORETRY : gfp, order);
>  			if (!page)
>  				continue;
> -			if (PageCompound(page)) {
> -				if (!split_huge_page(page))
> -					break;
> -				__free_pages(page, order);
> -			} else {
> +			if (!order)
> +				break;

I also added this "if" in my old code. I don't know much about
PageCompound and split_page, but judging by Will's suggestion[1], this
"if" seems unnecessary.

[1]:http://lists.linuxfoundation.org/pipermail/iommu/2016-April/016422.html

> +			if (!PageCompound(page)) {
>  				split_page(page, order);
>  				break;
> +			} else if (!split_huge_page(page)) {
> +				break;
>  			}
> +			__free_pages(page, order);
>  		}
> -		if (!page)
> -			page = alloc_page(gfp);
>  		if (!page) {
>  			__iommu_dma_free_pages(pages, i);
>  			return NULL;
>  		}
> -		j = 1 << order;
> -		count -= j;
> -		while (j--)
> +		count -= order_size;
> +		while (order_size--)
>  			pages[i++] = page++;
>  	}
[...]

^ permalink raw reply	[flat|nested] 36+ messages in thread
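
On the "!order" check: it reads as a fast path rather than a correctness
requirement. A hedged sketch of the equivalent order-0 flow through the
generic path - these allocations never pass __GFP_COMP, so a fresh
order-0 page is never compound, and split_page(page, 0) is a no-op:

/* Editorial sketch: the order-0 case handled without the fast path. */
static struct page *alloc_order0(gfp_t gfp)
{
	struct page *page = alloc_pages(gfp, 0);

	if (page) {
		/* never PageCompound() without __GFP_COMP... */
		split_page(page, 0);	/* ...and this is a no-op at order 0 */
	}
	return page;
}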

* Re: [PATCH 0/5] Introduce per-domain page sizes
  2016-04-07 17:42 ` Robin Murphy
@ 2016-04-21 16:38     ` Will Deacon
  -1 siblings, 0 replies; 36+ messages in thread
From: Will Deacon @ 2016-04-21 16:38 UTC (permalink / raw)
  To: Robin Murphy
  Cc: laurent.pinchart+renesas-ryLnwIuWjnjg/C1BVhZhaw,
	dianders-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	treding-DDmLM1+adcrQT0dZR+AlfA, brian.starkey-5wv7dgnIgG8

On Thu, Apr 07, 2016 at 06:42:03PM +0100, Robin Murphy wrote:
> Hi all,

Hi Robin,

> Since this area seems to be in vogue at the moment, here's what I was
> working on when the related patches[1][2] popped up, which happens to
> be more or less the intersection of both. As I recycled some of Will's
> old series as a starting point, I've retained the cleanup patches from
> that with their original acks - hope that's OK.
> 
> Fortunately, this already looks rather like parts of Joerg's plan[3],
> so I hope it's a suitable first step. Below is a quick hacked-up example
> of the kind of caller-controlled special use-case alluded to, using the
> SMMU/HDLCD combo on Juno - for a 'real' implementation of this we'd want
> the group-based domain allocation call so the driver could throw the
> device at that and get its own non-default DMA ops domain to play with.

I like this series a lot and it moves us a significant step closer to
being able to request specific IOMMU geometry at domain initialisation
time. For the series:

Acked-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/5] Introduce per-domain page sizes
  2016-04-07 17:42 ` Robin Murphy
@ 2016-05-09 11:21     ` Joerg Roedel
  -1 siblings, 0 replies; 36+ messages in thread
From: Joerg Roedel @ 2016-05-09 11:21 UTC (permalink / raw)
  To: Robin Murphy
  Cc: laurent.pinchart+renesas-ryLnwIuWjnjg/C1BVhZhaw,
	will.deacon-5wv7dgnIgG8, dianders-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	treding-DDmLM1+adcrQT0dZR+AlfA, brian.starkey-5wv7dgnIgG8

On Thu, Apr 07, 2016 at 06:42:03PM +0100, Robin Murphy wrote:
> Hi all,
> 
> Since this area seems to be in vogue at the moment, here's what I was
> working on when the related patches[1][2] popped up, which happens to
> be more or less the intersection of both. As I recycled some of Will's
> old series as a starting point, I've retained the cleanup patches from
> that with their original acks - hope that's OK.
> 
> Fortunately, this already looks rather like parts of Joerg's plan[3],
> so I hope it's a suitable first step. Below is a quick hacked-up example
> of the kind of caller-controlled special use-case alluded to, using the
> SMMU/HDLCD combo on Juno - for a 'real' implementation of this we'd want
> the group-based domain allocation call so the driver could throw the
> device at that and get its own non-default DMA ops domain to play with.
> 
> Robin.
> 
> [1]:http://thread.gmane.org/gmane.linux.kernel.iommu/12774
> [2]:http://thread.gmane.org/gmane.linux.kernel.iommu/12901
> [3]:http://article.gmane.org/gmane.linux.kernel.iommu/12937
> 
> Robin Murphy (4):
>   iommu: of: enforce const-ness of struct iommu_ops
>   iommu: Allow selecting page sizes per domain
>   iommu/dma: Finish optimising higher-order allocations
>   iommu/arm-smmu: Use per-domain page sizes.
> 
> Will Deacon (1):
>   iommu: remove unused priv field from struct iommu_ops

Okay, I am still not happy that this lifts the requirements of the
iommu-api for the arm-smmu driver. But to get there we need more core
changes, and this code is a step in the right direction, so I applied it.

Thanks,

	Joerg

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/5] Introduce per-domain page sizes
  2016-05-09 11:21     ` Joerg Roedel
@ 2016-05-09 11:45         ` Robin Murphy
  -1 siblings, 0 replies; 36+ messages in thread
From: Robin Murphy @ 2016-05-09 11:45 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: laurent.pinchart+renesas-ryLnwIuWjnjg/C1BVhZhaw,
	will.deacon-5wv7dgnIgG8, dianders-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	treding-DDmLM1+adcrQT0dZR+AlfA, brian.starkey-5wv7dgnIgG8,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Hi Joerg,

On 09/05/16 12:21, Joerg Roedel wrote:
> On Thu, Apr 07, 2016 at 06:42:03PM +0100, Robin Murphy wrote:
>> Hi all,
>>
>> Since this area seems to be in vogue at the moment, here's what I was
>> working on when the related patches[1][2] popped up, which happens to
>> be more or less the intersection of both. As I recycled some of Will's
>> old series as a starting point, I've retained the cleanup patches from
>> that with their original acks - hope that's OK.
>>
>> Fortunately, this already looks rather like parts of Joerg's plan[3],
>> so I hope it's a suitable first step. Below is a quick hacked-up example
>> of the kind of caller-controlled special use-case alluded to, using the
>> SMMU/HDLCD combo on Juno - for a 'real' implementation of this we'd want
>> the group-based domain allocation call so the driver could throw the
>> device at that and get its own non-default DMA ops domain to play with.
>>
>> Robin.
>>
>> [1]:http://thread.gmane.org/gmane.linux.kernel.iommu/12774
>> [2]:http://thread.gmane.org/gmane.linux.kernel.iommu/12901
>> [3]:http://article.gmane.org/gmane.linux.kernel.iommu/12937
>>
>> Robin Murphy (4):
>>    iommu: of: enforce const-ness of struct iommu_ops
>>    iommu: Allow selecting page sizes per domain
>>    iommu/dma: Finish optimising higher-order allocations
>>    iommu/arm-smmu: Use per-domain page sizes.
>>
>> Will Deacon (1):
>>    iommu: remove unused priv field from struct iommu_ops
>
> Okay, I am still no happy that this lifts the requirements of the
> iommu-api for the arm-smmu driver. But to get there we need more core
> changes and this code is a step in the right direction, so I applied it.

Thanks a lot! I was expecting to pick this up again after the merge 
window and post an updated version then; as you may already have found, 
patch 5 conflicts somewhat with the SMMUv2 context format changes in 
Will's updates branch. The correct resolution requires a bit of 
rewriting, so below is what that patch looks like when rebased on top of 
Will's branch. If you'd prefer it in actual merge resolution format, 
shout and I'll give that a go.

Thanks,
Robin.

--->8---
commit a940dae4e124523bc1cf282c8f36c79f960a0805
Author: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
Date:   Mon Mar 14 14:25:07 2016 +0000

     iommu/arm-smmu: Use per-domain page sizes.

     Now that we can accurately reflect the context format we choose for each
     domain, do that instead of imposing the global lowest-common-denominator
     restriction and potentially ending up with nothing. We currently have a
     strict 1:1 correspondence between domains and context banks, so we don't
     need to entertain the possibility of multiple formats _within_ a domain.

     Signed-off-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
     [rm: split from original patch, added SMMUv3]
     Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4ff73ff64e49..ebab33e77d67 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -590,6 +590,7 @@ struct arm_smmu_device {

  	unsigned long			ias; /* IPA */
  	unsigned long			oas; /* PA */
+	unsigned long			pgsize_bitmap;

  #define ARM_SMMU_MAX_ASIDS		(1 << 16)
  	unsigned int			asid_bits;
@@ -1516,8 +1517,6 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
  	return 0;
  }

-static struct iommu_ops arm_smmu_ops;
-
  static int arm_smmu_domain_finalise(struct iommu_domain *domain)
  {
  	int ret;
@@ -1555,7 +1554,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
  	}

  	pgtbl_cfg = (struct io_pgtable_cfg) {
-		.pgsize_bitmap	= arm_smmu_ops.pgsize_bitmap,
+		.pgsize_bitmap	= smmu->pgsize_bitmap,
  		.ias		= ias,
  		.oas		= oas,
  		.tlb		= &arm_smmu_gather_ops,
@@ -1566,7 +1565,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
  	if (!pgtbl_ops)
  		return -ENOMEM;

-	arm_smmu_ops.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
  	smmu_domain->pgtbl_ops = pgtbl_ops;

  	ret = finalise_stage_fn(smmu_domain, &pgtbl_cfg);
@@ -2410,7 +2409,6 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
  {
  	u32 reg;
  	bool coherent;
-	unsigned long pgsize_bitmap = 0;

  	/* IDR0 */
  	reg = readl_relaxed(smmu->base + ARM_SMMU_IDR0);
@@ -2541,13 +2539,16 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)

  	/* Page sizes */
  	if (reg & IDR5_GRAN64K)
-		pgsize_bitmap |= SZ_64K | SZ_512M;
+		smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
  	if (reg & IDR5_GRAN16K)
-		pgsize_bitmap |= SZ_16K | SZ_32M;
+		smmu->pgsize_bitmap |= SZ_16K | SZ_32M;
  	if (reg & IDR5_GRAN4K)
-		pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;
+		smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;

-	arm_smmu_ops.pgsize_bitmap &= pgsize_bitmap;
+	if (arm_smmu_ops.pgsize_bitmap == -1UL)
+		arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap;
+	else
+		arm_smmu_ops.pgsize_bitmap |= smmu->pgsize_bitmap;

  	/* Output address size */
  	switch (reg & IDR5_OAS_MASK << IDR5_OAS_SHIFT) {
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 0bda956a025c..bf30490d6b18 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -343,6 +343,7 @@ struct arm_smmu_device {
  	unsigned long			va_size;
  	unsigned long			ipa_size;
  	unsigned long			pa_size;
+	unsigned long			pgsize_bitmap;

  	u32				num_global_irqs;
  	u32				num_context_irqs;
@@ -388,8 +389,6 @@ struct arm_smmu_domain {
  	struct iommu_domain		domain;
  };

-static struct iommu_ops arm_smmu_ops;
-
  static DEFINE_SPINLOCK(arm_smmu_devices_lock);
  static LIST_HEAD(arm_smmu_devices);

@@ -949,7 +948,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
  	}

  	pgtbl_cfg = (struct io_pgtable_cfg) {
-		.pgsize_bitmap	= arm_smmu_ops.pgsize_bitmap,
+		.pgsize_bitmap	= smmu->pgsize_bitmap,
  		.ias		= ias,
  		.oas		= oas,
  		.tlb		= &arm_smmu_gather_ops,
@@ -963,8 +962,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
  		goto out_clear_smmu;
  	}

-	/* Update our support page sizes to reflect the page table format */
-	arm_smmu_ops.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	/* Update the domain's page sizes to reflect the page table format */
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;

  	/* Initialise the context bank with our page table cfg */
  	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
@@ -1793,19 +1792,23 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
  	}

  	/* Now we've corralled the various formats, what'll it do? */
-	size = 0;
  	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S)
-		size |= SZ_4K | SZ_64K | SZ_1M | SZ_16M;
+		smmu->pgsize_bitmap |= SZ_4K | SZ_64K | SZ_1M | SZ_16M;
  	if (smmu->features &
  	    (ARM_SMMU_FEAT_FMT_AARCH32_L | ARM_SMMU_FEAT_FMT_AARCH64_4K))
-		size |= SZ_4K | SZ_2M | SZ_1G;
+		smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;
  	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH64_16K)
-		size |= SZ_16K | SZ_32M;
+		smmu->pgsize_bitmap |= SZ_16K | SZ_32M;
  	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH64_64K)
-		size |= SZ_64K | SZ_512M;
+		smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
+
+	if (arm_smmu_ops.pgsize_bitmap == -1UL)
+		arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap;
+	else
+		arm_smmu_ops.pgsize_bitmap |= smmu->pgsize_bitmap;
+	dev_notice(smmu->dev, "\tSupported page sizes: 0x%08lx\n",
+		   smmu->pgsize_bitmap);

-	arm_smmu_ops.pgsize_bitmap &= size;
-	dev_notice(smmu->dev, "\tSupported page sizes: 0x%08lx\n", size);

  	if (smmu->features & ARM_SMMU_FEAT_TRANS_S1)
  		dev_notice(smmu->dev, "\tStage-1: %lu-bit VA -> %lu-bit IPA\n",

^ permalink raw reply related	[flat|nested] 36+ messages in thread
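
One detail both hunks above lean on: the probe-time merge only works if
the static ops structure starts from a sentinel value. An illustrative
excerpt of that pattern (hedged - the real initializer sets the full
callback list, which this sketch elides):

/*
 * -1UL means "no instance probed yet": the first SMMU replaces the
 * value outright, later ones OR their supported sizes in.
 */
static struct iommu_ops arm_smmu_ops = {
	/* ...map/unmap and the other callbacks... */
	.pgsize_bitmap	= -1UL,	/* refined at device probe time */
};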

* Re: [PATCH 0/5] Introduce per-domain page sizes
  2016-05-09 11:45         ` Robin Murphy
@ 2016-05-09 14:51             ` Joerg Roedel
  -1 siblings, 0 replies; 36+ messages in thread
From: Joerg Roedel @ 2016-05-09 14:51 UTC (permalink / raw)
  To: Robin Murphy
  Cc: laurent.pinchart+renesas-ryLnwIuWjnjg/C1BVhZhaw,
	will.deacon-5wv7dgnIgG8, dianders-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	treding-DDmLM1+adcrQT0dZR+AlfA, brian.starkey-5wv7dgnIgG8,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Mon, May 09, 2016 at 12:45:39PM +0100, Robin Murphy wrote:
> Thanks a lot! I was expecting to pick this up again after the merge
> window and post an updated version then; as you may already have
> found, patch 5 conflicts somewhat with the SMMUv2 context format
> changes in Will's updates branch. The correct resolution requires a
> bit of rewriting, so below is what that patch looks like when
> rebased on top of Will's branch. If you'd prefer it in actual merge
> resolution format, shout and I'll give that a go.

Okay, hmm, but the patch does not apply here, even after fixing the line
breaks. I'll push my tree soon; can you rebase this patch on top of the
core branch then? It should already contain patches 1-4.


	Joerg

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/5] Introduce per-domain page sizes
  2016-05-09 14:51             ` Joerg Roedel
@ 2016-05-09 15:18                 ` Robin Murphy
  -1 siblings, 0 replies; 36+ messages in thread
From: Robin Murphy @ 2016-05-09 15:18 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: laurent.pinchart+renesas-ryLnwIuWjnjg/C1BVhZhaw,
	will.deacon-5wv7dgnIgG8, dianders-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	treding-DDmLM1+adcrQT0dZR+AlfA, brian.starkey-5wv7dgnIgG8,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On 09/05/16 15:51, Joerg Roedel wrote:
> On Mon, May 09, 2016 at 12:45:39PM +0100, Robin Murphy wrote:
>> Thanks a lot! I was expecting to pick this up again after the merge
>> window and post an updated version then; as you may already have
>> found, patch 5 conflicts somewhat with the SMMUv2 context format
>> changes in Will's updates branch. The correct resolution requires a
>> bit of rewriting, so below is what that patch looks like when
>> rebased on top of Will's branch. If you'd prefer it in actual merge
>> resolution format, shout and I'll give that a go.
>
> Okay, hmm, but the patch does not apply here, even after fixing the line
> breaks. I'll push my tree soon; can you rebase this patch on top of the
> core branch then? It should already contain patches 1-4.

Bah, I saw last week's "Merge remote-tracking branch 
'will/for-joerg/arm-smmu/updates' into HEAD" commit on that branch 
without spotting it was no longer up to date, sorry. Yes, it's probably 
safest if I rebase on top of your merge of everything else - I'll keep 
an eye out.

Thanks,
Robin.

>
>
> 	Joerg
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/5] Introduce per-domain page sizes
  2016-05-09 15:18                 ` Robin Murphy
@ 2016-05-09 15:50                   ` Joerg Roedel
  -1 siblings, 0 replies; 36+ messages in thread
From: Joerg Roedel @ 2016-05-09 15:50 UTC (permalink / raw)
  To: Robin Murphy
  Cc: laurent.pinchart+renesas, will.deacon, dianders, iommu, treding,
	brian.starkey, linux-arm-kernel

On Mon, May 09, 2016 at 04:18:39PM +0100, Robin Murphy wrote:
> Bah, I saw last week's "Merge remote-tracking branch
> 'will/for-joerg/arm-smmu/updates' into HEAD" commit on that branch
> without spotting it was no longer up to date, sorry. Yes, it's
> probably safest if I rebase on top of your merge of everything else
> - I'll keep an eye out.

My tree is pushed now; you can rebase this one patch on the core branch
and resend.


Thanks,

	Joerg

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v2] iommu/arm-smmu: Use per-domain page sizes.
  2016-04-07 17:42 ` Robin Murphy
@ 2016-05-09 16:20     ` Robin Murphy
  -1 siblings, 0 replies; 36+ messages in thread
From: Robin Murphy @ 2016-05-09 16:20 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Now that we can accurately reflect the context format we choose for each
domain, do that instead of imposing the global lowest-common-denominator
restriction and potentially ending up with nothing. We currently have a
strict 1:1 correspondence between domains and context banks, so we don't
need to entertain the possibility of multiple formats _within_ a domain.

Signed-off-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
[rm: split from original patch, added SMMUv3]
Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---

Changes from v1: Rebased onto iommu/core to accommodate SMMUv2 changes.

 drivers/iommu/arm-smmu-v3.c | 19 ++++++++++---------
 drivers/iommu/arm-smmu.c    | 27 +++++++++++++++------------
 2 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4ff73ff64e49..ebab33e77d67 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -590,6 +590,7 @@ struct arm_smmu_device {
 
 	unsigned long			ias; /* IPA */
 	unsigned long			oas; /* PA */
+	unsigned long			pgsize_bitmap;
 
 #define ARM_SMMU_MAX_ASIDS		(1 << 16)
 	unsigned int			asid_bits;
@@ -1516,8 +1517,6 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
 	return 0;
 }
 
-static struct iommu_ops arm_smmu_ops;
-
 static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 {
 	int ret;
@@ -1555,7 +1554,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 	}
 
 	pgtbl_cfg = (struct io_pgtable_cfg) {
-		.pgsize_bitmap	= arm_smmu_ops.pgsize_bitmap,
+		.pgsize_bitmap	= smmu->pgsize_bitmap,
 		.ias		= ias,
 		.oas		= oas,
 		.tlb		= &arm_smmu_gather_ops,
@@ -1566,7 +1565,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 	if (!pgtbl_ops)
 		return -ENOMEM;
 
-	arm_smmu_ops.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
 	smmu_domain->pgtbl_ops = pgtbl_ops;
 
 	ret = finalise_stage_fn(smmu_domain, &pgtbl_cfg);
@@ -2410,7 +2409,6 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
 {
 	u32 reg;
 	bool coherent;
-	unsigned long pgsize_bitmap = 0;
 
 	/* IDR0 */
 	reg = readl_relaxed(smmu->base + ARM_SMMU_IDR0);
@@ -2541,13 +2539,16 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
 
 	/* Page sizes */
 	if (reg & IDR5_GRAN64K)
-		pgsize_bitmap |= SZ_64K | SZ_512M;
+		smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
 	if (reg & IDR5_GRAN16K)
-		pgsize_bitmap |= SZ_16K | SZ_32M;
+		smmu->pgsize_bitmap |= SZ_16K | SZ_32M;
 	if (reg & IDR5_GRAN4K)
-		pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;
+		smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;
 
-	arm_smmu_ops.pgsize_bitmap &= pgsize_bitmap;
+	if (arm_smmu_ops.pgsize_bitmap == -1UL)
+		arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap;
+	else
+		arm_smmu_ops.pgsize_bitmap |= smmu->pgsize_bitmap;
 
 	/* Output address size */
 	switch (reg & IDR5_OAS_MASK << IDR5_OAS_SHIFT) {
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 7cd4ad98904a..0360919a5737 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -351,6 +351,7 @@ struct arm_smmu_device {
 	unsigned long			va_size;
 	unsigned long			ipa_size;
 	unsigned long			pa_size;
+	unsigned long			pgsize_bitmap;
 
 	u32				num_global_irqs;
 	u32				num_context_irqs;
@@ -396,8 +397,6 @@ struct arm_smmu_domain {
 	struct iommu_domain		domain;
 };
 
-static struct iommu_ops arm_smmu_ops;
-
 static DEFINE_SPINLOCK(arm_smmu_devices_lock);
 static LIST_HEAD(arm_smmu_devices);
 
@@ -957,7 +956,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	}
 
 	pgtbl_cfg = (struct io_pgtable_cfg) {
-		.pgsize_bitmap	= arm_smmu_ops.pgsize_bitmap,
+		.pgsize_bitmap	= smmu->pgsize_bitmap,
 		.ias		= ias,
 		.oas		= oas,
 		.tlb		= &arm_smmu_gather_ops,
@@ -971,8 +970,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		goto out_clear_smmu;
 	}
 
-	/* Update our support page sizes to reflect the page table format */
-	arm_smmu_ops.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	/* Update the domain's page sizes to reflect the page table format */
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
 
 	/* Initialise the context bank with our page table cfg */
 	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
@@ -1814,19 +1813,23 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	}
 
 	/* Now we've corralled the various formats, what'll it do? */
-	size = 0;
 	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S)
-		size |= SZ_4K | SZ_64K | SZ_1M | SZ_16M;
+		smmu->pgsize_bitmap |= SZ_4K | SZ_64K | SZ_1M | SZ_16M;
 	if (smmu->features &
 	    (ARM_SMMU_FEAT_FMT_AARCH32_L | ARM_SMMU_FEAT_FMT_AARCH64_4K))
-		size |= SZ_4K | SZ_2M | SZ_1G;
+		smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;
 	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH64_16K)
-		size |= SZ_16K | SZ_32M;
+		smmu->pgsize_bitmap |= SZ_16K | SZ_32M;
 	if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH64_64K)
-		size |= SZ_64K | SZ_512M;
+		smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
+
+	if (arm_smmu_ops.pgsize_bitmap == -1UL)
+		arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap;
+	else
+		arm_smmu_ops.pgsize_bitmap |= smmu->pgsize_bitmap;
+	dev_notice(smmu->dev, "\tSupported page sizes: 0x%08lx\n",
+		   smmu->pgsize_bitmap);
 
-	arm_smmu_ops.pgsize_bitmap &= size;
-	dev_notice(smmu->dev, "\tSupported page sizes: 0x%08lx\n", size);
 
 	if (smmu->features & ARM_SMMU_FEAT_TRANS_S1)
 		dev_notice(smmu->dev, "\tStage-1: %lu-bit VA -> %lu-bit IPA\n",
-- 
2.8.1.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread
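
On the consumer side, the point of the patch is that domain->pgsize_bitmap
now reflects only the page-table format actually chosen for that domain,
so a caller holding its own domain can size mappings per domain rather
than per the global ops. A standalone sketch under that assumption
(largest_block() is a hypothetical helper; in-kernel code would read
domain->pgsize_bitmap once the domain is initialised, and could use
__fls() instead of the portable loop):

/*
 * Standalone sketch: picking the biggest mapping granule a given
 * domain's page-size bitmap allows.
 */
#include <stdio.h>

static unsigned long largest_block(unsigned long pgsize_bitmap)
{
	unsigned long bit, size = 0;

	for (bit = 1UL; bit; bit <<= 1)	/* keep the highest set bit */
		if (pgsize_bitmap & bit)
			size = bit;
	return size;
}

int main(void)
{
	unsigned long v7s  = 0x01111000UL;	/* 4K | 64K | 1M | 16M */
	unsigned long lpae = 0x40201000UL;	/* 4K | 2M | 1G        */

	printf("v7s:  up to 0x%lx per mapping\n", largest_block(v7s));	/* 16M */
	printf("lpae: up to 0x%lx per mapping\n", largest_block(lpae));	/* 1G  */
	return 0;
}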

* Re: [PATCH v2] iommu/arm-smmu: Use per-domain page sizes.
  2016-05-09 16:20     ` Robin Murphy
@ 2016-05-10  9:45         ` Joerg Roedel
  -1 siblings, 0 replies; 36+ messages in thread
From: Joerg Roedel @ 2016-05-10  9:45 UTC (permalink / raw)
  To: Robin Murphy
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Will Deacon,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Mon, May 09, 2016 at 05:20:09PM +0100, Robin Murphy wrote:
> Changes from v1: Rebased onto iommu/core to accommodate SMMUv2 changes.
> 
>  drivers/iommu/arm-smmu-v3.c | 19 ++++++++++---------
>  drivers/iommu/arm-smmu.c    | 27 +++++++++++++++------------
>  2 files changed, 25 insertions(+), 21 deletions(-)

Applied, thanks.

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2016-05-10  9:45 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-07 17:42 [PATCH 0/5] Introduce per-domain page sizes Robin Murphy
2016-04-07 17:42 ` Robin Murphy
     [not found] ` <cover.1460048991.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-04-07 17:42   ` [PATCH 1/5] iommu: remove unused priv field from struct iommu_ops Robin Murphy
2016-04-07 17:42     ` Robin Murphy
2016-04-07 17:42   ` [PATCH 2/5] iommu: of: enforce const-ness of " Robin Murphy
2016-04-07 17:42     ` Robin Murphy
2016-04-07 17:42   ` [PATCH 3/5] iommu: Allow selecting page sizes per domain Robin Murphy
2016-04-07 17:42     ` Robin Murphy
2016-04-07 17:42   ` [PATCH 4/5] iommu/dma: Finish optimising higher-order allocations Robin Murphy
2016-04-07 17:42     ` Robin Murphy
     [not found]     ` <89763f6b1ac684c3d8712e38760bec55b7885e3b.1460048991.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-04-08  5:32       ` Yong Wu
2016-04-08  5:32         ` Yong Wu
2016-04-08 16:33         ` Robin Murphy
2016-04-08 16:33           ` Robin Murphy
2016-04-13 16:29       ` [PATCH v2] " Robin Murphy
2016-04-13 16:29         ` Robin Murphy
     [not found]         ` <3e4572cb0a175061c1c4b436e3806ba9d7b9f199.1460563676.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-04-21  5:47           ` Yong Wu
2016-04-21  5:47             ` Yong Wu
2016-04-07 17:42   ` [PATCH 5/5] iommu/arm-smmu: Use per-domain page sizes Robin Murphy
2016-04-07 17:42     ` Robin Murphy
2016-04-21 16:38   ` [PATCH 0/5] Introduce " Will Deacon
2016-04-21 16:38     ` Will Deacon
2016-05-09 11:21   ` Joerg Roedel
2016-05-09 11:21     ` Joerg Roedel
     [not found]     ` <20160509112138.GB13275-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2016-05-09 11:45       ` Robin Murphy
2016-05-09 11:45         ` Robin Murphy
     [not found]         ` <57307863.1070706-5wv7dgnIgG8@public.gmane.org>
2016-05-09 14:51           ` Joerg Roedel
2016-05-09 14:51             ` Joerg Roedel
     [not found]             ` <20160509145157.GD13971-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2016-05-09 15:18               ` Robin Murphy
2016-05-09 15:18                 ` Robin Murphy
2016-05-09 15:50                 ` Joerg Roedel
2016-05-09 15:50                   ` Joerg Roedel
     [not found] ` <ea520b8c72b5a72a1731bd35f6e3e50872fe6764.1460048991.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-05-09 16:20   ` [PATCH v2] iommu/arm-smmu: Use " Robin Murphy
2016-05-09 16:20     ` Robin Murphy
     [not found]     ` <112fc0e5f9bbe08007778b8438b35025d8e876a4.1462810410.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-05-10  9:45       ` Joerg Roedel
2016-05-10  9:45         ` Joerg Roedel
