All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-03-28  6:32 ` Yong Wu
  0 siblings, 0 replies; 27+ messages in thread
From: Yong Wu @ 2016-03-28  6:32 UTC (permalink / raw)
  To: Joerg Roedel, Catalin Marinas, Will Deacon
  Cc: Matthias Brugger, Robin Murphy, Douglas Anderson, Daniel Kurtz,
	Tomasz Figa, Arnd Bergmann, Lucas Stach, Marek Szyprowski,
	linux-mediatek, srv_heupstream, linux-kernel, linux-arm-kernel,
	iommu, Yong Wu

Currently __iommu_dma_alloc_pages assumes that all the IOMMU support
the granule of PAGE_SIZE. It call alloc_page to try allocating memory
in the last time. Fortunately the mininum pagesize in all the
current IOMMU is SZ_4K, so this works well.

But there may be a case in which the mininum granule of IOMMU may be
larger than PAGE_SIZE, then it will abort as the IOMMU cann't map the
discontinuous memory within a granule. For example, the pgsize_bitmap
of the IOMMU only has SZ_16K while the PAGE_SIZE is SZ_4K, then we
have to prepare SZ_16K continuous memory at least for each a granule
iommu mapping.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>

---
v2:
 -rebase on v4.6-rc1.
 -add a new patch([1/2] add pgsize_bitmap) here.

 drivers/iommu/dma-iommu.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 72d6182..75ce71e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -190,11 +190,13 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
 	kvfree(pages);
 }
 
-static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
+static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
+					     unsigned long pgsize_bitmap)
 {
 	struct page **pages;
 	unsigned int i = 0, array_size = count * sizeof(*pages);
-	unsigned int order = MAX_ORDER;
+	int min_order = get_order(1 << __ffs(pgsize_bitmap));
+	int order = MAX_ORDER;
 
 	if (array_size <= PAGE_SIZE)
 		pages = kzalloc(array_size, GFP_KERNEL);
@@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 		/*
 		 * Higher-order allocations are a convenience rather
 		 * than a necessity, hence using __GFP_NORETRY until
-		 * falling back to single-page allocations.
+		 * falling back to min size allocations.
 		 */
-		for (order = min_t(unsigned int, order, __fls(count));
-		     order > 0; order--) {
-			page = alloc_pages(gfp | __GFP_NORETRY, order);
+		for (order = min_t(int, order, __fls(count));
+		     order >= min_order; order--) {
+			page = alloc_pages((order == min_order) ? gfp :
+					   gfp | __GFP_NORETRY, order);
 			if (!page)
 				continue;
+			if (!order)
+				break;
 			if (PageCompound(page)) {
 				if (!split_huge_page(page))
 					break;
@@ -229,8 +234,6 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 				break;
 			}
 		}
-		if (!page)
-			page = alloc_page(gfp);
 		if (!page) {
 			__iommu_dma_free_pages(pages, i);
 			return NULL;
@@ -292,7 +295,8 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size,
 
 	*handle = DMA_ERROR_CODE;
 
-	pages = __iommu_dma_alloc_pages(count, gfp);
+	pages = __iommu_dma_alloc_pages(count, gfp,
+					domain->ops->pgsize_bitmap);
 	if (!pages)
 		return NULL;
 
-- 
1.8.1.1.dirty

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-03-28  6:32 ` Yong Wu
  0 siblings, 0 replies; 27+ messages in thread
From: Yong Wu @ 2016-03-28  6:32 UTC (permalink / raw)
  To: Joerg Roedel, Catalin Marinas, Will Deacon
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w, Arnd Bergmann,
	Douglas Anderson, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Tomasz Figa, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Daniel Kurtz, Matthias Brugger,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Lucas Stach

Currently __iommu_dma_alloc_pages assumes that all the IOMMU support
the granule of PAGE_SIZE. It call alloc_page to try allocating memory
in the last time. Fortunately the mininum pagesize in all the
current IOMMU is SZ_4K, so this works well.

But there may be a case in which the mininum granule of IOMMU may be
larger than PAGE_SIZE, then it will abort as the IOMMU cann't map the
discontinuous memory within a granule. For example, the pgsize_bitmap
of the IOMMU only has SZ_16K while the PAGE_SIZE is SZ_4K, then we
have to prepare SZ_16K continuous memory at least for each a granule
iommu mapping.

Signed-off-by: Yong Wu <yong.wu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>

---
v2:
 -rebase on v4.6-rc1.
 -add a new patch([1/2] add pgsize_bitmap) here.

 drivers/iommu/dma-iommu.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 72d6182..75ce71e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -190,11 +190,13 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
 	kvfree(pages);
 }
 
-static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
+static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
+					     unsigned long pgsize_bitmap)
 {
 	struct page **pages;
 	unsigned int i = 0, array_size = count * sizeof(*pages);
-	unsigned int order = MAX_ORDER;
+	int min_order = get_order(1 << __ffs(pgsize_bitmap));
+	int order = MAX_ORDER;
 
 	if (array_size <= PAGE_SIZE)
 		pages = kzalloc(array_size, GFP_KERNEL);
@@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 		/*
 		 * Higher-order allocations are a convenience rather
 		 * than a necessity, hence using __GFP_NORETRY until
-		 * falling back to single-page allocations.
+		 * falling back to min size allocations.
 		 */
-		for (order = min_t(unsigned int, order, __fls(count));
-		     order > 0; order--) {
-			page = alloc_pages(gfp | __GFP_NORETRY, order);
+		for (order = min_t(int, order, __fls(count));
+		     order >= min_order; order--) {
+			page = alloc_pages((order == min_order) ? gfp :
+					   gfp | __GFP_NORETRY, order);
 			if (!page)
 				continue;
+			if (!order)
+				break;
 			if (PageCompound(page)) {
 				if (!split_huge_page(page))
 					break;
@@ -229,8 +234,6 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 				break;
 			}
 		}
-		if (!page)
-			page = alloc_page(gfp);
 		if (!page) {
 			__iommu_dma_free_pages(pages, i);
 			return NULL;
@@ -292,7 +295,8 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size,
 
 	*handle = DMA_ERROR_CODE;
 
-	pages = __iommu_dma_alloc_pages(count, gfp);
+	pages = __iommu_dma_alloc_pages(count, gfp,
+					domain->ops->pgsize_bitmap);
 	if (!pages)
 		return NULL;
 
-- 
1.8.1.1.dirty

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-03-28  6:32 ` Yong Wu
  0 siblings, 0 replies; 27+ messages in thread
From: Yong Wu @ 2016-03-28  6:32 UTC (permalink / raw)
  To: linux-arm-kernel

Currently __iommu_dma_alloc_pages assumes that all the IOMMU support
the granule of PAGE_SIZE. It call alloc_page to try allocating memory
in the last time. Fortunately the mininum pagesize in all the
current IOMMU is SZ_4K, so this works well.

But there may be a case in which the mininum granule of IOMMU may be
larger than PAGE_SIZE, then it will abort as the IOMMU cann't map the
discontinuous memory within a granule. For example, the pgsize_bitmap
of the IOMMU only has SZ_16K while the PAGE_SIZE is SZ_4K, then we
have to prepare SZ_16K continuous memory at least for each a granule
iommu mapping.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>

---
v2:
 -rebase on v4.6-rc1.
 -add a new patch([1/2] add pgsize_bitmap) here.

 drivers/iommu/dma-iommu.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 72d6182..75ce71e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -190,11 +190,13 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
 	kvfree(pages);
 }
 
-static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
+static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
+					     unsigned long pgsize_bitmap)
 {
 	struct page **pages;
 	unsigned int i = 0, array_size = count * sizeof(*pages);
-	unsigned int order = MAX_ORDER;
+	int min_order = get_order(1 << __ffs(pgsize_bitmap));
+	int order = MAX_ORDER;
 
 	if (array_size <= PAGE_SIZE)
 		pages = kzalloc(array_size, GFP_KERNEL);
@@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 		/*
 		 * Higher-order allocations are a convenience rather
 		 * than a necessity, hence using __GFP_NORETRY until
-		 * falling back to single-page allocations.
+		 * falling back to min size allocations.
 		 */
-		for (order = min_t(unsigned int, order, __fls(count));
-		     order > 0; order--) {
-			page = alloc_pages(gfp | __GFP_NORETRY, order);
+		for (order = min_t(int, order, __fls(count));
+		     order >= min_order; order--) {
+			page = alloc_pages((order == min_order) ? gfp :
+					   gfp | __GFP_NORETRY, order);
 			if (!page)
 				continue;
+			if (!order)
+				break;
 			if (PageCompound(page)) {
 				if (!split_huge_page(page))
 					break;
@@ -229,8 +234,6 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 				break;
 			}
 		}
-		if (!page)
-			page = alloc_page(gfp);
 		if (!page) {
 			__iommu_dma_free_pages(pages, i);
 			return NULL;
@@ -292,7 +295,8 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size,
 
 	*handle = DMA_ERROR_CODE;
 
-	pages = __iommu_dma_alloc_pages(count, gfp);
+	pages = __iommu_dma_alloc_pages(count, gfp,
+					domain->ops->pgsize_bitmap);
 	if (!pages)
 		return NULL;
 
-- 
1.8.1.1.dirty

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 2/2] arm64/dma-mapping: Add DMA_ATTR_ALLOC_SINGLE_PAGES support
@ 2016-03-28  6:32   ` Yong Wu
  0 siblings, 0 replies; 27+ messages in thread
From: Yong Wu @ 2016-03-28  6:32 UTC (permalink / raw)
  To: Joerg Roedel, Catalin Marinas, Will Deacon
  Cc: Matthias Brugger, Robin Murphy, Douglas Anderson, Daniel Kurtz,
	Tomasz Figa, Arnd Bergmann, Lucas Stach, Marek Szyprowski,
	linux-mediatek, srv_heupstream, linux-kernel, linux-arm-kernel,
	iommu, Yong Wu

Sometimes it is not worth for the iommu allocating big chunks.
Here we enable DMA_ATTR_ALLOC_SINGLE_PAGES which could help avoid
to allocate big chunks while iommu allocating buffer.

More information about this attribute, please check Doug's
commit df05c6f6e0bb ("ARM: 8506/1: common: DMA-mapping:
add DMA_ATTR_ALLOC_SINGLE_PAGES attribute")

Cc: Robin Murphy <robin.murphy@arm.com>
Suggested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>

---
 Our video driver may use this soon.

 arch/arm64/mm/dma-mapping.c |  4 ++--
 drivers/iommu/dma-iommu.c   | 12 +++++++++---
 include/linux/dma-iommu.h   |  4 ++--
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index a6e757c..5b104fe 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -562,8 +562,8 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 		struct page **pages;
 		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
 
-		pages = iommu_dma_alloc(dev, iosize, gfp, ioprot, handle,
-					flush_page);
+		pages = iommu_dma_alloc(dev, iosize, gfp, ioprot, attrs,
+					handle, flush_page);
 		if (!pages)
 			return NULL;
 
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 75ce71e..c77ef66 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -191,6 +191,7 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
 }
 
 static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
+					     struct dma_attrs *attrs,
 					     unsigned long pgsize_bitmap)
 {
 	struct page **pages;
@@ -205,6 +206,10 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
 	if (!pages)
 		return NULL;
 
+	/* Go straight to min_order if caller need SINGLE_PAGES */
+	if (dma_get_attr(DMA_ATTR_ALLOC_SINGLE_PAGES, attrs))
+		order = min_order;
+
 	/* IOMMU can map any pages, so himem can also be used here */
 	gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
 
@@ -271,6 +276,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * @size: Size of buffer in bytes
  * @gfp: Allocation flags
  * @prot: IOMMU mapping flags
+ * @attrs: DMA attributes flags
  * @handle: Out argument for allocated DMA handle
  * @flush_page: Arch callback which must ensure PAGE_SIZE bytes from the
  *		given VA/PA are visible to the given non-coherent device.
@@ -281,8 +287,8 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * Return: Array of struct page pointers describing the buffer,
  *	   or NULL on failure.
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, int prot, dma_addr_t *handle,
+struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
+		int prot, struct dma_attrs *attrs, dma_addr_t *handle,
 		void (*flush_page)(struct device *, const void *, phys_addr_t))
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
@@ -295,7 +301,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size,
 
 	*handle = DMA_ERROR_CODE;
 
-	pages = __iommu_dma_alloc_pages(count, gfp,
+	pages = __iommu_dma_alloc_pages(count, gfp, attrs,
 					domain->ops->pgsize_bitmap);
 	if (!pages)
 		return NULL;
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index fc48103..08d9603 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -38,8 +38,8 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent);
  * These implement the bulk of the relevant DMA mapping callbacks, but require
  * the arch code to take care of attributes and cache maintenance
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, int prot, dma_addr_t *handle,
+struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
+		int prot, struct dma_attrs *attrs, dma_addr_t *handle,
 		void (*flush_page)(struct device *, const void *, phys_addr_t));
 void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
 		dma_addr_t *handle);
-- 
1.8.1.1.dirty

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 2/2] arm64/dma-mapping: Add DMA_ATTR_ALLOC_SINGLE_PAGES support
@ 2016-03-28  6:32   ` Yong Wu
  0 siblings, 0 replies; 27+ messages in thread
From: Yong Wu @ 2016-03-28  6:32 UTC (permalink / raw)
  To: Joerg Roedel, Catalin Marinas, Will Deacon
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w, Arnd Bergmann,
	Douglas Anderson, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Tomasz Figa, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Daniel Kurtz, Matthias Brugger,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Lucas Stach

Sometimes it is not worth for the iommu allocating big chunks.
Here we enable DMA_ATTR_ALLOC_SINGLE_PAGES which could help avoid
to allocate big chunks while iommu allocating buffer.

More information about this attribute, please check Doug's
commit df05c6f6e0bb ("ARM: 8506/1: common: DMA-mapping:
add DMA_ATTR_ALLOC_SINGLE_PAGES attribute")

Cc: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
Suggested-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Reviewed-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Signed-off-by: Yong Wu <yong.wu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>

---
 Our video driver may use this soon.

 arch/arm64/mm/dma-mapping.c |  4 ++--
 drivers/iommu/dma-iommu.c   | 12 +++++++++---
 include/linux/dma-iommu.h   |  4 ++--
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index a6e757c..5b104fe 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -562,8 +562,8 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 		struct page **pages;
 		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
 
-		pages = iommu_dma_alloc(dev, iosize, gfp, ioprot, handle,
-					flush_page);
+		pages = iommu_dma_alloc(dev, iosize, gfp, ioprot, attrs,
+					handle, flush_page);
 		if (!pages)
 			return NULL;
 
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 75ce71e..c77ef66 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -191,6 +191,7 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
 }
 
 static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
+					     struct dma_attrs *attrs,
 					     unsigned long pgsize_bitmap)
 {
 	struct page **pages;
@@ -205,6 +206,10 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
 	if (!pages)
 		return NULL;
 
+	/* Go straight to min_order if caller need SINGLE_PAGES */
+	if (dma_get_attr(DMA_ATTR_ALLOC_SINGLE_PAGES, attrs))
+		order = min_order;
+
 	/* IOMMU can map any pages, so himem can also be used here */
 	gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
 
@@ -271,6 +276,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * @size: Size of buffer in bytes
  * @gfp: Allocation flags
  * @prot: IOMMU mapping flags
+ * @attrs: DMA attributes flags
  * @handle: Out argument for allocated DMA handle
  * @flush_page: Arch callback which must ensure PAGE_SIZE bytes from the
  *		given VA/PA are visible to the given non-coherent device.
@@ -281,8 +287,8 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * Return: Array of struct page pointers describing the buffer,
  *	   or NULL on failure.
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, int prot, dma_addr_t *handle,
+struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
+		int prot, struct dma_attrs *attrs, dma_addr_t *handle,
 		void (*flush_page)(struct device *, const void *, phys_addr_t))
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
@@ -295,7 +301,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size,
 
 	*handle = DMA_ERROR_CODE;
 
-	pages = __iommu_dma_alloc_pages(count, gfp,
+	pages = __iommu_dma_alloc_pages(count, gfp, attrs,
 					domain->ops->pgsize_bitmap);
 	if (!pages)
 		return NULL;
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index fc48103..08d9603 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -38,8 +38,8 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent);
  * These implement the bulk of the relevant DMA mapping callbacks, but require
  * the arch code to take care of attributes and cache maintenance
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, int prot, dma_addr_t *handle,
+struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
+		int prot, struct dma_attrs *attrs, dma_addr_t *handle,
 		void (*flush_page)(struct device *, const void *, phys_addr_t));
 void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
 		dma_addr_t *handle);
-- 
1.8.1.1.dirty

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 2/2] arm64/dma-mapping: Add DMA_ATTR_ALLOC_SINGLE_PAGES support
@ 2016-03-28  6:32   ` Yong Wu
  0 siblings, 0 replies; 27+ messages in thread
From: Yong Wu @ 2016-03-28  6:32 UTC (permalink / raw)
  To: linux-arm-kernel

Sometimes it is not worth for the iommu allocating big chunks.
Here we enable DMA_ATTR_ALLOC_SINGLE_PAGES which could help avoid
to allocate big chunks while iommu allocating buffer.

More information about this attribute, please check Doug's
commit df05c6f6e0bb ("ARM: 8506/1: common: DMA-mapping:
add DMA_ATTR_ALLOC_SINGLE_PAGES attribute")

Cc: Robin Murphy <robin.murphy@arm.com>
Suggested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>

---
 Our video driver may use this soon.

 arch/arm64/mm/dma-mapping.c |  4 ++--
 drivers/iommu/dma-iommu.c   | 12 +++++++++---
 include/linux/dma-iommu.h   |  4 ++--
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index a6e757c..5b104fe 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -562,8 +562,8 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 		struct page **pages;
 		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
 
-		pages = iommu_dma_alloc(dev, iosize, gfp, ioprot, handle,
-					flush_page);
+		pages = iommu_dma_alloc(dev, iosize, gfp, ioprot, attrs,
+					handle, flush_page);
 		if (!pages)
 			return NULL;
 
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 75ce71e..c77ef66 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -191,6 +191,7 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
 }
 
 static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
+					     struct dma_attrs *attrs,
 					     unsigned long pgsize_bitmap)
 {
 	struct page **pages;
@@ -205,6 +206,10 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
 	if (!pages)
 		return NULL;
 
+	/* Go straight to min_order if caller need SINGLE_PAGES */
+	if (dma_get_attr(DMA_ATTR_ALLOC_SINGLE_PAGES, attrs))
+		order = min_order;
+
 	/* IOMMU can map any pages, so himem can also be used here */
 	gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
 
@@ -271,6 +276,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * @size: Size of buffer in bytes
  * @gfp: Allocation flags
  * @prot: IOMMU mapping flags
+ * @attrs: DMA attributes flags
  * @handle: Out argument for allocated DMA handle
  * @flush_page: Arch callback which must ensure PAGE_SIZE bytes from the
  *		given VA/PA are visible to the given non-coherent device.
@@ -281,8 +287,8 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * Return: Array of struct page pointers describing the buffer,
  *	   or NULL on failure.
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, int prot, dma_addr_t *handle,
+struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
+		int prot, struct dma_attrs *attrs, dma_addr_t *handle,
 		void (*flush_page)(struct device *, const void *, phys_addr_t))
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
@@ -295,7 +301,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size,
 
 	*handle = DMA_ERROR_CODE;
 
-	pages = __iommu_dma_alloc_pages(count, gfp,
+	pages = __iommu_dma_alloc_pages(count, gfp, attrs,
 					domain->ops->pgsize_bitmap);
 	if (!pages)
 		return NULL;
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index fc48103..08d9603 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -38,8 +38,8 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent);
  * These implement the bulk of the relevant DMA mapping callbacks, but require
  * the arch code to take care of attributes and cache maintenance
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, int prot, dma_addr_t *handle,
+struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
+		int prot, struct dma_attrs *attrs, dma_addr_t *handle,
 		void (*flush_page)(struct device *, const void *, phys_addr_t));
 void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
 		dma_addr_t *handle);
-- 
1.8.1.1.dirty

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-03-29 17:02   ` Will Deacon
  0 siblings, 0 replies; 27+ messages in thread
From: Will Deacon @ 2016-03-29 17:02 UTC (permalink / raw)
  To: Yong Wu
  Cc: Joerg Roedel, Catalin Marinas, Matthias Brugger, Robin Murphy,
	Douglas Anderson, Daniel Kurtz, Tomasz Figa, Arnd Bergmann,
	Lucas Stach, Marek Szyprowski, linux-mediatek, srv_heupstream,
	linux-kernel, linux-arm-kernel, iommu

On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
> Currently __iommu_dma_alloc_pages assumes that all the IOMMU support
> the granule of PAGE_SIZE. It call alloc_page to try allocating memory
> in the last time. Fortunately the mininum pagesize in all the
> current IOMMU is SZ_4K, so this works well.
> 
> But there may be a case in which the mininum granule of IOMMU may be
> larger than PAGE_SIZE, then it will abort as the IOMMU cann't map the
> discontinuous memory within a granule. For example, the pgsize_bitmap
> of the IOMMU only has SZ_16K while the PAGE_SIZE is SZ_4K, then we
> have to prepare SZ_16K continuous memory at least for each a granule
> iommu mapping.

Did you find a driver/platform that actually needs this? I certainly
think it's possible, but we don't necessarily need to fix it if it's
not broken yet! Just adding a comment would be enough.

Either way, a couple of review comments inline.

> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> 
> ---
> v2:
>  -rebase on v4.6-rc1.
>  -add a new patch([1/2] add pgsize_bitmap) here.
> 
>  drivers/iommu/dma-iommu.c | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 72d6182..75ce71e 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -190,11 +190,13 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
>  	kvfree(pages);
>  }
>  
> -static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
> +static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
> +					     unsigned long pgsize_bitmap)
>  {
>  	struct page **pages;
>  	unsigned int i = 0, array_size = count * sizeof(*pages);
> -	unsigned int order = MAX_ORDER;
> +	int min_order = get_order(1 << __ffs(pgsize_bitmap));

1UL

> +	int order = MAX_ORDER;
>  
>  	if (array_size <= PAGE_SIZE)
>  		pages = kzalloc(array_size, GFP_KERNEL);
> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>  		/*
>  		 * Higher-order allocations are a convenience rather
>  		 * than a necessity, hence using __GFP_NORETRY until
> -		 * falling back to single-page allocations.
> +		 * falling back to min size allocations.
>  		 */
> -		for (order = min_t(unsigned int, order, __fls(count));
> -		     order > 0; order--) {
> -			page = alloc_pages(gfp | __GFP_NORETRY, order);
> +		for (order = min_t(int, order, __fls(count));
> +		     order >= min_order; order--) {
> +			page = alloc_pages((order == min_order) ? gfp :
> +					   gfp | __GFP_NORETRY, order);
>  			if (!page)
>  				continue;
> +			if (!order)
> +				break;

Isn't this handled by the loop condition?

Will

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-03-29 17:02   ` Will Deacon
  0 siblings, 0 replies; 27+ messages in thread
From: Will Deacon @ 2016-03-29 17:02 UTC (permalink / raw)
  To: Yong Wu
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w, Arnd Bergmann,
	Catalin Marinas, Douglas Anderson,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Tomasz Figa,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Daniel Kurtz,
	Matthias Brugger,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Lucas Stach

On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
> Currently __iommu_dma_alloc_pages assumes that all the IOMMU support
> the granule of PAGE_SIZE. It call alloc_page to try allocating memory
> in the last time. Fortunately the mininum pagesize in all the
> current IOMMU is SZ_4K, so this works well.
> 
> But there may be a case in which the mininum granule of IOMMU may be
> larger than PAGE_SIZE, then it will abort as the IOMMU cann't map the
> discontinuous memory within a granule. For example, the pgsize_bitmap
> of the IOMMU only has SZ_16K while the PAGE_SIZE is SZ_4K, then we
> have to prepare SZ_16K continuous memory at least for each a granule
> iommu mapping.

Did you find a driver/platform that actually needs this? I certainly
think it's possible, but we don't necessarily need to fix it if it's
not broken yet! Just adding a comment would be enough.

Either way, a couple of review comments inline.

> 
> Signed-off-by: Yong Wu <yong.wu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>
> 
> ---
> v2:
>  -rebase on v4.6-rc1.
>  -add a new patch([1/2] add pgsize_bitmap) here.
> 
>  drivers/iommu/dma-iommu.c | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 72d6182..75ce71e 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -190,11 +190,13 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
>  	kvfree(pages);
>  }
>  
> -static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
> +static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
> +					     unsigned long pgsize_bitmap)
>  {
>  	struct page **pages;
>  	unsigned int i = 0, array_size = count * sizeof(*pages);
> -	unsigned int order = MAX_ORDER;
> +	int min_order = get_order(1 << __ffs(pgsize_bitmap));

1UL

> +	int order = MAX_ORDER;
>  
>  	if (array_size <= PAGE_SIZE)
>  		pages = kzalloc(array_size, GFP_KERNEL);
> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>  		/*
>  		 * Higher-order allocations are a convenience rather
>  		 * than a necessity, hence using __GFP_NORETRY until
> -		 * falling back to single-page allocations.
> +		 * falling back to min size allocations.
>  		 */
> -		for (order = min_t(unsigned int, order, __fls(count));
> -		     order > 0; order--) {
> -			page = alloc_pages(gfp | __GFP_NORETRY, order);
> +		for (order = min_t(int, order, __fls(count));
> +		     order >= min_order; order--) {
> +			page = alloc_pages((order == min_order) ? gfp :
> +					   gfp | __GFP_NORETRY, order);
>  			if (!page)
>  				continue;
> +			if (!order)
> +				break;

Isn't this handled by the loop condition?

Will

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-03-29 17:02   ` Will Deacon
  0 siblings, 0 replies; 27+ messages in thread
From: Will Deacon @ 2016-03-29 17:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
> Currently __iommu_dma_alloc_pages assumes that all the IOMMU support
> the granule of PAGE_SIZE. It call alloc_page to try allocating memory
> in the last time. Fortunately the mininum pagesize in all the
> current IOMMU is SZ_4K, so this works well.
> 
> But there may be a case in which the mininum granule of IOMMU may be
> larger than PAGE_SIZE, then it will abort as the IOMMU cann't map the
> discontinuous memory within a granule. For example, the pgsize_bitmap
> of the IOMMU only has SZ_16K while the PAGE_SIZE is SZ_4K, then we
> have to prepare SZ_16K continuous memory at least for each a granule
> iommu mapping.

Did you find a driver/platform that actually needs this? I certainly
think it's possible, but we don't necessarily need to fix it if it's
not broken yet! Just adding a comment would be enough.

Either way, a couple of review comments inline.

> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> 
> ---
> v2:
>  -rebase on v4.6-rc1.
>  -add a new patch([1/2] add pgsize_bitmap) here.
> 
>  drivers/iommu/dma-iommu.c | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 72d6182..75ce71e 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -190,11 +190,13 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
>  	kvfree(pages);
>  }
>  
> -static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
> +static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
> +					     unsigned long pgsize_bitmap)
>  {
>  	struct page **pages;
>  	unsigned int i = 0, array_size = count * sizeof(*pages);
> -	unsigned int order = MAX_ORDER;
> +	int min_order = get_order(1 << __ffs(pgsize_bitmap));

1UL

> +	int order = MAX_ORDER;
>  
>  	if (array_size <= PAGE_SIZE)
>  		pages = kzalloc(array_size, GFP_KERNEL);
> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>  		/*
>  		 * Higher-order allocations are a convenience rather
>  		 * than a necessity, hence using __GFP_NORETRY until
> -		 * falling back to single-page allocations.
> +		 * falling back to min size allocations.
>  		 */
> -		for (order = min_t(unsigned int, order, __fls(count));
> -		     order > 0; order--) {
> -			page = alloc_pages(gfp | __GFP_NORETRY, order);
> +		for (order = min_t(int, order, __fls(count));
> +		     order >= min_order; order--) {
> +			page = alloc_pages((order == min_order) ? gfp :
> +					   gfp | __GFP_NORETRY, order);
>  			if (!page)
>  				continue;
> +			if (!order)
> +				break;

Isn't this handled by the loop condition?

Will

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-05 17:03     ` Doug Anderson
  0 siblings, 0 replies; 27+ messages in thread
From: Doug Anderson @ 2016-04-05 17:03 UTC (permalink / raw)
  To: Will Deacon
  Cc: Yong Wu, Joerg Roedel, Catalin Marinas, Matthias Brugger,
	Robin Murphy, Daniel Kurtz, Tomasz Figa, Arnd Bergmann,
	Lucas Stach, Marek Szyprowski,
	moderated list:ARM/Mediatek SoC support, srv_heupstream,
	linux-kernel, linux-arm-kernel, iommu

Will,

On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
>> Currently __iommu_dma_alloc_pages assumes that all the IOMMU support
>> the granule of PAGE_SIZE. It call alloc_page to try allocating memory
>> in the last time. Fortunately the mininum pagesize in all the
>> current IOMMU is SZ_4K, so this works well.
>>
>> But there may be a case in which the mininum granule of IOMMU may be
>> larger than PAGE_SIZE, then it will abort as the IOMMU cann't map the
>> discontinuous memory within a granule. For example, the pgsize_bitmap
>> of the IOMMU only has SZ_16K while the PAGE_SIZE is SZ_4K, then we
>> have to prepare SZ_16K continuous memory at least for each a granule
>> iommu mapping.
>
> Did you find a driver/platform that actually needs this? I certainly
> think it's possible, but we don't necessarily need to fix it if it's
> not broken yet! Just adding a comment would be enough.

I think Yong Wu was adding this patch in response to your request from
v1 of <https://patchwork.kernel.org/patch/8487511/>.  Fixing this
seemed like a separate issue so I asked Yong Wu to make two patches
rather than sticking this fix in the patch as his original.  Hopefully
this makes sense.

> Either way, a couple of review comments inline.
>
>>
>> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
>>
>> ---
>> v2:
>>  -rebase on v4.6-rc1.
>>  -add a new patch([1/2] add pgsize_bitmap) here.
>>
>>  drivers/iommu/dma-iommu.c | 22 +++++++++++++---------
>>  1 file changed, 13 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 72d6182..75ce71e 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -190,11 +190,13 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
>>       kvfree(pages);
>>  }
>>
>> -static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>> +static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
>> +                                          unsigned long pgsize_bitmap)
>>  {
>>       struct page **pages;
>>       unsigned int i = 0, array_size = count * sizeof(*pages);
>> -     unsigned int order = MAX_ORDER;
>> +     int min_order = get_order(1 << __ffs(pgsize_bitmap));
>
> 1UL

Yong: presumably you will spin with this fix?


>> +     int order = MAX_ORDER;
>>
>>       if (array_size <= PAGE_SIZE)
>>               pages = kzalloc(array_size, GFP_KERNEL);
>> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>>               /*
>>                * Higher-order allocations are a convenience rather
>>                * than a necessity, hence using __GFP_NORETRY until
>> -              * falling back to single-page allocations.
>> +              * falling back to min size allocations.
>>                */
>> -             for (order = min_t(unsigned int, order, __fls(count));
>> -                  order > 0; order--) {
>> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
>> +             for (order = min_t(int, order, __fls(count));
>> +                  order >= min_order; order--) {
>> +                     page = alloc_pages((order == min_order) ? gfp :
>> +                                        gfp | __GFP_NORETRY, order);
>>                       if (!page)
>>                               continue;
>> +                     if (!order)
>> +                             break;
>
> Isn't this handled by the loop condition?

He changed the loop condition to be ">= min_order" instead of "> 0",
so now we can get here with an order == 0.  This makes sense because
when min_order is not 0 you still want to run the code to split the
pages and it is sane not to duplicate that below.

Maybe I'm misunderstanding, though.  Perhaps you can explain how you
think this code should look?

-Doug

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-05 17:03     ` Doug Anderson
  0 siblings, 0 replies; 27+ messages in thread
From: Doug Anderson @ 2016-04-05 17:03 UTC (permalink / raw)
  To: Will Deacon
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w, Arnd Bergmann,
	Catalin Marinas, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Tomasz Figa, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Daniel Kurtz, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Matthias Brugger, moderated list:ARM/Mediatek SoC support,
	Lucas Stach

Will,

On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
>> Currently __iommu_dma_alloc_pages assumes that all the IOMMU support
>> the granule of PAGE_SIZE. It call alloc_page to try allocating memory
>> in the last time. Fortunately the mininum pagesize in all the
>> current IOMMU is SZ_4K, so this works well.
>>
>> But there may be a case in which the mininum granule of IOMMU may be
>> larger than PAGE_SIZE, then it will abort as the IOMMU cann't map the
>> discontinuous memory within a granule. For example, the pgsize_bitmap
>> of the IOMMU only has SZ_16K while the PAGE_SIZE is SZ_4K, then we
>> have to prepare SZ_16K continuous memory at least for each a granule
>> iommu mapping.
>
> Did you find a driver/platform that actually needs this? I certainly
> think it's possible, but we don't necessarily need to fix it if it's
> not broken yet! Just adding a comment would be enough.

I think Yong Wu was adding this patch in response to your request from
v1 of <https://patchwork.kernel.org/patch/8487511/>.  Fixing this
seemed like a separate issue so I asked Yong Wu to make two patches
rather than sticking this fix in the patch as his original.  Hopefully
this makes sense.

> Either way, a couple of review comments inline.
>
>>
>> Signed-off-by: Yong Wu <yong.wu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>
>>
>> ---
>> v2:
>>  -rebase on v4.6-rc1.
>>  -add a new patch([1/2] add pgsize_bitmap) here.
>>
>>  drivers/iommu/dma-iommu.c | 22 +++++++++++++---------
>>  1 file changed, 13 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 72d6182..75ce71e 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -190,11 +190,13 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
>>       kvfree(pages);
>>  }
>>
>> -static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>> +static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
>> +                                          unsigned long pgsize_bitmap)
>>  {
>>       struct page **pages;
>>       unsigned int i = 0, array_size = count * sizeof(*pages);
>> -     unsigned int order = MAX_ORDER;
>> +     int min_order = get_order(1 << __ffs(pgsize_bitmap));
>
> 1UL

Yong: presumably you will spin with this fix?


>> +     int order = MAX_ORDER;
>>
>>       if (array_size <= PAGE_SIZE)
>>               pages = kzalloc(array_size, GFP_KERNEL);
>> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>>               /*
>>                * Higher-order allocations are a convenience rather
>>                * than a necessity, hence using __GFP_NORETRY until
>> -              * falling back to single-page allocations.
>> +              * falling back to min size allocations.
>>                */
>> -             for (order = min_t(unsigned int, order, __fls(count));
>> -                  order > 0; order--) {
>> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
>> +             for (order = min_t(int, order, __fls(count));
>> +                  order >= min_order; order--) {
>> +                     page = alloc_pages((order == min_order) ? gfp :
>> +                                        gfp | __GFP_NORETRY, order);
>>                       if (!page)
>>                               continue;
>> +                     if (!order)
>> +                             break;
>
> Isn't this handled by the loop condition?

He changed the loop condition to be ">= min_order" instead of "> 0",
so now we can get here with an order == 0.  This makes sense because
when min_order is not 0 you still want to run the code to split the
pages and it is sane not to duplicate that below.

Maybe I'm misunderstanding, though.  Perhaps you can explain how you
think this code should look?

-Doug

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-05 17:03     ` Doug Anderson
  0 siblings, 0 replies; 27+ messages in thread
From: Doug Anderson @ 2016-04-05 17:03 UTC (permalink / raw)
  To: linux-arm-kernel

Will,

On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
>> Currently __iommu_dma_alloc_pages assumes that all the IOMMU support
>> the granule of PAGE_SIZE. It call alloc_page to try allocating memory
>> in the last time. Fortunately the mininum pagesize in all the
>> current IOMMU is SZ_4K, so this works well.
>>
>> But there may be a case in which the mininum granule of IOMMU may be
>> larger than PAGE_SIZE, then it will abort as the IOMMU cann't map the
>> discontinuous memory within a granule. For example, the pgsize_bitmap
>> of the IOMMU only has SZ_16K while the PAGE_SIZE is SZ_4K, then we
>> have to prepare SZ_16K continuous memory at least for each a granule
>> iommu mapping.
>
> Did you find a driver/platform that actually needs this? I certainly
> think it's possible, but we don't necessarily need to fix it if it's
> not broken yet! Just adding a comment would be enough.

I think Yong Wu was adding this patch in response to your request from
v1 of <https://patchwork.kernel.org/patch/8487511/>.  Fixing this
seemed like a separate issue so I asked Yong Wu to make two patches
rather than sticking this fix in the patch as his original.  Hopefully
this makes sense.

> Either way, a couple of review comments inline.
>
>>
>> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
>>
>> ---
>> v2:
>>  -rebase on v4.6-rc1.
>>  -add a new patch([1/2] add pgsize_bitmap) here.
>>
>>  drivers/iommu/dma-iommu.c | 22 +++++++++++++---------
>>  1 file changed, 13 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 72d6182..75ce71e 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -190,11 +190,13 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
>>       kvfree(pages);
>>  }
>>
>> -static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>> +static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
>> +                                          unsigned long pgsize_bitmap)
>>  {
>>       struct page **pages;
>>       unsigned int i = 0, array_size = count * sizeof(*pages);
>> -     unsigned int order = MAX_ORDER;
>> +     int min_order = get_order(1 << __ffs(pgsize_bitmap));
>
> 1UL

Yong: presumably you will spin with this fix?


>> +     int order = MAX_ORDER;
>>
>>       if (array_size <= PAGE_SIZE)
>>               pages = kzalloc(array_size, GFP_KERNEL);
>> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>>               /*
>>                * Higher-order allocations are a convenience rather
>>                * than a necessity, hence using __GFP_NORETRY until
>> -              * falling back to single-page allocations.
>> +              * falling back to min size allocations.
>>                */
>> -             for (order = min_t(unsigned int, order, __fls(count));
>> -                  order > 0; order--) {
>> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
>> +             for (order = min_t(int, order, __fls(count));
>> +                  order >= min_order; order--) {
>> +                     page = alloc_pages((order == min_order) ? gfp :
>> +                                        gfp | __GFP_NORETRY, order);
>>                       if (!page)
>>                               continue;
>> +                     if (!order)
>> +                             break;
>
> Isn't this handled by the loop condition?

He changed the loop condition to be ">= min_order" instead of "> 0",
so now we can get here with an order == 0.  This makes sense because
when min_order is not 0 you still want to run the code to split the
pages and it is sane not to duplicate that below.

Maybe I'm misunderstanding, though.  Perhaps you can explain how you
think this code should look?

-Doug

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
  2016-04-05 17:03     ` Doug Anderson
  (?)
@ 2016-04-08 13:07       ` Will Deacon
  -1 siblings, 0 replies; 27+ messages in thread
From: Will Deacon @ 2016-04-08 13:07 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Yong Wu, Joerg Roedel, Catalin Marinas, Matthias Brugger,
	Robin Murphy, Daniel Kurtz, Tomasz Figa, Arnd Bergmann,
	Lucas Stach, Marek Szyprowski,
	moderated list:ARM/Mediatek SoC support, srv_heupstream,
	linux-kernel, linux-arm-kernel, iommu

On Tue, Apr 05, 2016 at 10:03:32AM -0700, Doug Anderson wrote:
> On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
> >> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
> >>               /*
> >>                * Higher-order allocations are a convenience rather
> >>                * than a necessity, hence using __GFP_NORETRY until
> >> -              * falling back to single-page allocations.
> >> +              * falling back to min size allocations.
> >>                */
> >> -             for (order = min_t(unsigned int, order, __fls(count));
> >> -                  order > 0; order--) {
> >> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
> >> +             for (order = min_t(int, order, __fls(count));
> >> +                  order >= min_order; order--) {
> >> +                     page = alloc_pages((order == min_order) ? gfp :
> >> +                                        gfp | __GFP_NORETRY, order);
> >>                       if (!page)
> >>                               continue;
> >> +                     if (!order)
> >> +                             break;
> >
> > Isn't this handled by the loop condition?
> 
> He changed the loop condition to be ">= min_order" instead of "> 0",
> so now we can get here with an order == 0.  This makes sense because
> when min_order is not 0 you still want to run the code to split the
> pages and it is sane not to duplicate that below.
> 
> Maybe I'm misunderstanding, though.  Perhaps you can explain how you
> think this code should look?

My reading of the code was that we require order >= min_order to enter
the loop. Given that order doesn't change between the loop header and the
if (!order) check, then that must mean we can enter the loop body with
order == 0 and order >= min_order, which means that min_order is allowed
to be negative. That feels weird.

Am I barking up the wrong tree?

Will

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-08 13:07       ` Will Deacon
  0 siblings, 0 replies; 27+ messages in thread
From: Will Deacon @ 2016-04-08 13:07 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Yong Wu, Joerg Roedel, Catalin Marinas, Matthias Brugger,
	Robin Murphy, Daniel Kurtz, Tomasz Figa, Arnd Bergmann,
	Lucas Stach, Marek Szyprowski,
	moderated list:ARM/Mediatek SoC support, srv_heupstream,
	linux-kernel, linux-arm-kernel, iommu

On Tue, Apr 05, 2016 at 10:03:32AM -0700, Doug Anderson wrote:
> On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
> >> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
> >>               /*
> >>                * Higher-order allocations are a convenience rather
> >>                * than a necessity, hence using __GFP_NORETRY until
> >> -              * falling back to single-page allocations.
> >> +              * falling back to min size allocations.
> >>                */
> >> -             for (order = min_t(unsigned int, order, __fls(count));
> >> -                  order > 0; order--) {
> >> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
> >> +             for (order = min_t(int, order, __fls(count));
> >> +                  order >= min_order; order--) {
> >> +                     page = alloc_pages((order == min_order) ? gfp :
> >> +                                        gfp | __GFP_NORETRY, order);
> >>                       if (!page)
> >>                               continue;
> >> +                     if (!order)
> >> +                             break;
> >
> > Isn't this handled by the loop condition?
> 
> He changed the loop condition to be ">= min_order" instead of "> 0",
> so now we can get here with an order == 0.  This makes sense because
> when min_order is not 0 you still want to run the code to split the
> pages and it is sane not to duplicate that below.
> 
> Maybe I'm misunderstanding, though.  Perhaps you can explain how you
> think this code should look?

My reading of the code was that we require order >= min_order to enter
the loop. Given that order doesn't change between the loop header and the
if (!order) check, then that must mean we can enter the loop body with
order == 0 and order >= min_order, which means that min_order is allowed
to be negative. That feels weird.

Am I barking up the wrong tree?

Will

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-08 13:07       ` Will Deacon
  0 siblings, 0 replies; 27+ messages in thread
From: Will Deacon @ 2016-04-08 13:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Apr 05, 2016 at 10:03:32AM -0700, Doug Anderson wrote:
> On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
> >> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
> >>               /*
> >>                * Higher-order allocations are a convenience rather
> >>                * than a necessity, hence using __GFP_NORETRY until
> >> -              * falling back to single-page allocations.
> >> +              * falling back to min size allocations.
> >>                */
> >> -             for (order = min_t(unsigned int, order, __fls(count));
> >> -                  order > 0; order--) {
> >> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
> >> +             for (order = min_t(int, order, __fls(count));
> >> +                  order >= min_order; order--) {
> >> +                     page = alloc_pages((order == min_order) ? gfp :
> >> +                                        gfp | __GFP_NORETRY, order);
> >>                       if (!page)
> >>                               continue;
> >> +                     if (!order)
> >> +                             break;
> >
> > Isn't this handled by the loop condition?
> 
> He changed the loop condition to be ">= min_order" instead of "> 0",
> so now we can get here with an order == 0.  This makes sense because
> when min_order is not 0 you still want to run the code to split the
> pages and it is sane not to duplicate that below.
> 
> Maybe I'm misunderstanding, though.  Perhaps you can explain how you
> think this code should look?

My reading of the code was that we require order >= min_order to enter
the loop. Given that order doesn't change between the loop header and the
if (!order) check, then that must mean we can enter the loop body with
order == 0 and order >= min_order, which means that min_order is allowed
to be negative. That feels weird.

Am I barking up the wrong tree?

Will

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-08 16:50         ` Doug Anderson
  0 siblings, 0 replies; 27+ messages in thread
From: Doug Anderson @ 2016-04-08 16:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: Yong Wu, Joerg Roedel, Catalin Marinas, Matthias Brugger,
	Robin Murphy, Daniel Kurtz, Tomasz Figa, Arnd Bergmann,
	Lucas Stach, Marek Szyprowski,
	moderated list:ARM/Mediatek SoC support, srv_heupstream,
	linux-kernel, linux-arm-kernel, iommu

Will,

On Fri, Apr 8, 2016 at 6:07 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Tue, Apr 05, 2016 at 10:03:32AM -0700, Doug Anderson wrote:
>> On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon@arm.com> wrote:
>> > On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
>> >> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>> >>               /*
>> >>                * Higher-order allocations are a convenience rather
>> >>                * than a necessity, hence using __GFP_NORETRY until
>> >> -              * falling back to single-page allocations.
>> >> +              * falling back to min size allocations.
>> >>                */
>> >> -             for (order = min_t(unsigned int, order, __fls(count));
>> >> -                  order > 0; order--) {
>> >> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
>> >> +             for (order = min_t(int, order, __fls(count));
>> >> +                  order >= min_order; order--) {
>> >> +                     page = alloc_pages((order == min_order) ? gfp :
>> >> +                                        gfp | __GFP_NORETRY, order);
>> >>                       if (!page)
>> >>                               continue;
>> >> +                     if (!order)
>> >> +                             break;
>> >
>> > Isn't this handled by the loop condition?
>>
>> He changed the loop condition to be ">= min_order" instead of "> 0",
>> so now we can get here with an order == 0.  This makes sense because
>> when min_order is not 0 you still want to run the code to split the
>> pages and it is sane not to duplicate that below.
>>
>> Maybe I'm misunderstanding, though.  Perhaps you can explain how you
>> think this code should look?
>
> My reading of the code was that we require order >= min_order to enter
> the loop. Given that order doesn't change between the loop header and the
> if (!order) check, then that must mean we can enter the loop body with
> order == 0 and order >= min_order, which means that min_order is allowed
> to be negative. That feels weird.
>
> Am I barking up the wrong tree?

I don't think min_order can be negative.  Certainly we could enter the
loop with order == 0 and min_order == 0, though.


Some examples:

order = 0, min_order = 0
-> Want alloc_pages _without_ __GFP_NORETRY.  OK
-> If alloc_pages fails, return NULL.  OK
-> If alloc pages succeeds, don't need splitting since single page.  OK

order = 1, min_order = 1
-> Want alloc_pages _without_ __GFP_NORETRY.  OK
-> If alloc_pages fails, return NULL.  OK
-> If alloc pages succeeds, DO need splitting.  OK

order = 1, min_order = 0
-> Want alloc_pages with __GFP_NORETRY.  OK
-> If alloc_pages fails, try order = 0.  OK
-> If alloc pages succeeds, DO need splitting.  OK

order = 2, min_order = 1
-> Want alloc_pages with __GFP_NORETRY.  OK
-> If alloc_pages fails, try order = 1.  OK
-> If alloc pages succeeds, DO need splitting.  OK


I think those are all right.  Did I mess up?  You could certainly
structure the loop in a different way but you need to make sure you
handle all of those cases.  If you have an alternate structure that
handles all those, let's consider it.

-Doug

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-08 16:50         ` Doug Anderson
  0 siblings, 0 replies; 27+ messages in thread
From: Doug Anderson @ 2016-04-08 16:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w, Arnd Bergmann,
	Catalin Marinas, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Tomasz Figa, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Daniel Kurtz, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Matthias Brugger, moderated list:ARM/Mediatek SoC support,
	Lucas Stach

Will,

On Fri, Apr 8, 2016 at 6:07 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> On Tue, Apr 05, 2016 at 10:03:32AM -0700, Doug Anderson wrote:
>> On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
>> > On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
>> >> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>> >>               /*
>> >>                * Higher-order allocations are a convenience rather
>> >>                * than a necessity, hence using __GFP_NORETRY until
>> >> -              * falling back to single-page allocations.
>> >> +              * falling back to min size allocations.
>> >>                */
>> >> -             for (order = min_t(unsigned int, order, __fls(count));
>> >> -                  order > 0; order--) {
>> >> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
>> >> +             for (order = min_t(int, order, __fls(count));
>> >> +                  order >= min_order; order--) {
>> >> +                     page = alloc_pages((order == min_order) ? gfp :
>> >> +                                        gfp | __GFP_NORETRY, order);
>> >>                       if (!page)
>> >>                               continue;
>> >> +                     if (!order)
>> >> +                             break;
>> >
>> > Isn't this handled by the loop condition?
>>
>> He changed the loop condition to be ">= min_order" instead of "> 0",
>> so now we can get here with an order == 0.  This makes sense because
>> when min_order is not 0 you still want to run the code to split the
>> pages and it is sane not to duplicate that below.
>>
>> Maybe I'm misunderstanding, though.  Perhaps you can explain how you
>> think this code should look?
>
> My reading of the code was that we require order >= min_order to enter
> the loop. Given that order doesn't change between the loop header and the
> if (!order) check, then that must mean we can enter the loop body with
> order == 0 and order >= min_order, which means that min_order is allowed
> to be negative. That feels weird.
>
> Am I barking up the wrong tree?

I don't think min_order can be negative.  Certainly we could enter the
loop with order == 0 and min_order == 0, though.


Some examples:

order = 0, min_order = 0
-> Want alloc_pages _without_ __GFP_NORETRY.  OK
-> If alloc_pages fails, return NULL.  OK
-> If alloc pages succeeds, don't need splitting since single page.  OK

order = 1, min_order = 1
-> Want alloc_pages _without_ __GFP_NORETRY.  OK
-> If alloc_pages fails, return NULL.  OK
-> If alloc pages succeeds, DO need splitting.  OK

order = 1, min_order = 0
-> Want alloc_pages with __GFP_NORETRY.  OK
-> If alloc_pages fails, try order = 0.  OK
-> If alloc pages succeeds, DO need splitting.  OK

order = 2, min_order = 1
-> Want alloc_pages with __GFP_NORETRY.  OK
-> If alloc_pages fails, try order = 1.  OK
-> If alloc pages succeeds, DO need splitting.  OK


I think those are all right.  Did I mess up?  You could certainly
structure the loop in a different way but you need to make sure you
handle all of those cases.  If you have an alternate structure that
handles all those, let's consider it.

-Doug

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-08 16:50         ` Doug Anderson
  0 siblings, 0 replies; 27+ messages in thread
From: Doug Anderson @ 2016-04-08 16:50 UTC (permalink / raw)
  To: linux-arm-kernel

Will,

On Fri, Apr 8, 2016 at 6:07 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Tue, Apr 05, 2016 at 10:03:32AM -0700, Doug Anderson wrote:
>> On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon@arm.com> wrote:
>> > On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
>> >> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
>> >>               /*
>> >>                * Higher-order allocations are a convenience rather
>> >>                * than a necessity, hence using __GFP_NORETRY until
>> >> -              * falling back to single-page allocations.
>> >> +              * falling back to min size allocations.
>> >>                */
>> >> -             for (order = min_t(unsigned int, order, __fls(count));
>> >> -                  order > 0; order--) {
>> >> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
>> >> +             for (order = min_t(int, order, __fls(count));
>> >> +                  order >= min_order; order--) {
>> >> +                     page = alloc_pages((order == min_order) ? gfp :
>> >> +                                        gfp | __GFP_NORETRY, order);
>> >>                       if (!page)
>> >>                               continue;
>> >> +                     if (!order)
>> >> +                             break;
>> >
>> > Isn't this handled by the loop condition?
>>
>> He changed the loop condition to be ">= min_order" instead of "> 0",
>> so now we can get here with an order == 0.  This makes sense because
>> when min_order is not 0 you still want to run the code to split the
>> pages and it is sane not to duplicate that below.
>>
>> Maybe I'm misunderstanding, though.  Perhaps you can explain how you
>> think this code should look?
>
> My reading of the code was that we require order >= min_order to enter
> the loop. Given that order doesn't change between the loop header and the
> if (!order) check, then that must mean we can enter the loop body with
> order == 0 and order >= min_order, which means that min_order is allowed
> to be negative. That feels weird.
>
> Am I barking up the wrong tree?

I don't think min_order can be negative.  Certainly we could enter the
loop with order == 0 and min_order == 0, though.


Some examples:

order = 0, min_order = 0
-> Want alloc_pages _without_ __GFP_NORETRY.  OK
-> If alloc_pages fails, return NULL.  OK
-> If alloc pages succeeds, don't need splitting since single page.  OK

order = 1, min_order = 1
-> Want alloc_pages _without_ __GFP_NORETRY.  OK
-> If alloc_pages fails, return NULL.  OK
-> If alloc pages succeeds, DO need splitting.  OK

order = 1, min_order = 0
-> Want alloc_pages with __GFP_NORETRY.  OK
-> If alloc_pages fails, try order = 0.  OK
-> If alloc pages succeeds, DO need splitting.  OK

order = 2, min_order = 1
-> Want alloc_pages with __GFP_NORETRY.  OK
-> If alloc_pages fails, try order = 1.  OK
-> If alloc pages succeeds, DO need splitting.  OK


I think those are all right.  Did I mess up?  You could certainly
structure the loop in a different way but you need to make sure you
handle all of those cases.  If you have an alternate structure that
handles all those, let's consider it.

-Doug

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-08 17:30           ` Will Deacon
  0 siblings, 0 replies; 27+ messages in thread
From: Will Deacon @ 2016-04-08 17:30 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Yong Wu, Joerg Roedel, Catalin Marinas, Matthias Brugger,
	Robin Murphy, Daniel Kurtz, Tomasz Figa, Arnd Bergmann,
	Lucas Stach, Marek Szyprowski,
	moderated list:ARM/Mediatek SoC support, srv_heupstream,
	linux-kernel, linux-arm-kernel, iommu

On Fri, Apr 08, 2016 at 09:50:43AM -0700, Doug Anderson wrote:
> On Fri, Apr 8, 2016 at 6:07 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Tue, Apr 05, 2016 at 10:03:32AM -0700, Doug Anderson wrote:
> >> On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon@arm.com> wrote:
> >> > On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
> >> >> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
> >> >>               /*
> >> >>                * Higher-order allocations are a convenience rather
> >> >>                * than a necessity, hence using __GFP_NORETRY until
> >> >> -              * falling back to single-page allocations.
> >> >> +              * falling back to min size allocations.
> >> >>                */
> >> >> -             for (order = min_t(unsigned int, order, __fls(count));
> >> >> -                  order > 0; order--) {
> >> >> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
> >> >> +             for (order = min_t(int, order, __fls(count));
> >> >> +                  order >= min_order; order--) {
> >> >> +                     page = alloc_pages((order == min_order) ? gfp :
> >> >> +                                        gfp | __GFP_NORETRY, order);
> >> >>                       if (!page)
> >> >>                               continue;
> >> >> +                     if (!order)
> >> >> +                             break;
> >> >
> >> > Isn't this handled by the loop condition?
> >>
> >> He changed the loop condition to be ">= min_order" instead of "> 0",
> >> so now we can get here with an order == 0.  This makes sense because
> >> when min_order is not 0 you still want to run the code to split the
> >> pages and it is sane not to duplicate that below.
> >>
> >> Maybe I'm misunderstanding, though.  Perhaps you can explain how you
> >> think this code should look?
> >
> > My reading of the code was that we require order >= min_order to enter
> > the loop. Given that order doesn't change between the loop header and the
> > if (!order) check, then that must mean we can enter the loop body with
> > order == 0 and order >= min_order, which means that min_order is allowed
> > to be negative. That feels weird.
> >
> > Am I barking up the wrong tree?
> 
> I don't think min_order can be negative.  Certainly we could enter the
> loop with order == 0 and min_order == 0, though.

... and in that case, PageCompound will be false, and we'll call split_page
which won't do anything, so we break out.

> 
> Some examples:
> 
> order = 0, min_order = 0
> -> Want alloc_pages _without_ __GFP_NORETRY.  OK
> -> If alloc_pages fails, return NULL.  OK
> -> If alloc pages succeeds, don't need splitting since single page.  OK

[...]

> I think those are all right.  Did I mess up?  You could certainly
> structure the loop in a different way but you need to make sure you
> handle all of those cases.  If you have an alternate structure that
> handles all those, let's consider it.

Right, I don't think the code is broken, I just think the !order check is
confusing and not needed.

Will

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-08 17:30           ` Will Deacon
  0 siblings, 0 replies; 27+ messages in thread
From: Will Deacon @ 2016-04-08 17:30 UTC (permalink / raw)
  To: Doug Anderson
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w, Arnd Bergmann,
	Catalin Marinas, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Tomasz Figa, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Daniel Kurtz, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Matthias Brugger, moderated list:ARM/Mediatek SoC support,
	Lucas Stach

On Fri, Apr 08, 2016 at 09:50:43AM -0700, Doug Anderson wrote:
> On Fri, Apr 8, 2016 at 6:07 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> > On Tue, Apr 05, 2016 at 10:03:32AM -0700, Doug Anderson wrote:
> >> On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> >> > On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
> >> >> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
> >> >>               /*
> >> >>                * Higher-order allocations are a convenience rather
> >> >>                * than a necessity, hence using __GFP_NORETRY until
> >> >> -              * falling back to single-page allocations.
> >> >> +              * falling back to min size allocations.
> >> >>                */
> >> >> -             for (order = min_t(unsigned int, order, __fls(count));
> >> >> -                  order > 0; order--) {
> >> >> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
> >> >> +             for (order = min_t(int, order, __fls(count));
> >> >> +                  order >= min_order; order--) {
> >> >> +                     page = alloc_pages((order == min_order) ? gfp :
> >> >> +                                        gfp | __GFP_NORETRY, order);
> >> >>                       if (!page)
> >> >>                               continue;
> >> >> +                     if (!order)
> >> >> +                             break;
> >> >
> >> > Isn't this handled by the loop condition?
> >>
> >> He changed the loop condition to be ">= min_order" instead of "> 0",
> >> so now we can get here with an order == 0.  This makes sense because
> >> when min_order is not 0 you still want to run the code to split the
> >> pages and it is sane not to duplicate that below.
> >>
> >> Maybe I'm misunderstanding, though.  Perhaps you can explain how you
> >> think this code should look?
> >
> > My reading of the code was that we require order >= min_order to enter
> > the loop. Given that order doesn't change between the loop header and the
> > if (!order) check, then that must mean we can enter the loop body with
> > order == 0 and order >= min_order, which means that min_order is allowed
> > to be negative. That feels weird.
> >
> > Am I barking up the wrong tree?
> 
> I don't think min_order can be negative.  Certainly we could enter the
> loop with order == 0 and min_order == 0, though.

... and in that case, PageCompound will be false, and we'll call split_page
which won't do anything, so we break out.

> 
> Some examples:
> 
> order = 0, min_order = 0
> -> Want alloc_pages _without_ __GFP_NORETRY.  OK
> -> If alloc_pages fails, return NULL.  OK
> -> If alloc pages succeeds, don't need splitting since single page.  OK

[...]

> I think those are all right.  Did I mess up?  You could certainly
> structure the loop in a different way but you need to make sure you
> handle all of those cases.  If you have an alternate structure that
> handles all those, let's consider it.

Right, I don't think the code is broken, I just think the !order check is
confusing and not needed.

Will

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-08 17:30           ` Will Deacon
  0 siblings, 0 replies; 27+ messages in thread
From: Will Deacon @ 2016-04-08 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Apr 08, 2016 at 09:50:43AM -0700, Doug Anderson wrote:
> On Fri, Apr 8, 2016 at 6:07 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Tue, Apr 05, 2016 at 10:03:32AM -0700, Doug Anderson wrote:
> >> On Tue, Mar 29, 2016 at 10:02 AM, Will Deacon <will.deacon@arm.com> wrote:
> >> > On Mon, Mar 28, 2016 at 02:32:11PM +0800, Yong Wu wrote:
> >> >> @@ -213,13 +215,16 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
> >> >>               /*
> >> >>                * Higher-order allocations are a convenience rather
> >> >>                * than a necessity, hence using __GFP_NORETRY until
> >> >> -              * falling back to single-page allocations.
> >> >> +              * falling back to min size allocations.
> >> >>                */
> >> >> -             for (order = min_t(unsigned int, order, __fls(count));
> >> >> -                  order > 0; order--) {
> >> >> -                     page = alloc_pages(gfp | __GFP_NORETRY, order);
> >> >> +             for (order = min_t(int, order, __fls(count));
> >> >> +                  order >= min_order; order--) {
> >> >> +                     page = alloc_pages((order == min_order) ? gfp :
> >> >> +                                        gfp | __GFP_NORETRY, order);
> >> >>                       if (!page)
> >> >>                               continue;
> >> >> +                     if (!order)
> >> >> +                             break;
> >> >
> >> > Isn't this handled by the loop condition?
> >>
> >> He changed the loop condition to be ">= min_order" instead of "> 0",
> >> so now we can get here with an order == 0.  This makes sense because
> >> when min_order is not 0 you still want to run the code to split the
> >> pages and it is sane not to duplicate that below.
> >>
> >> Maybe I'm misunderstanding, though.  Perhaps you can explain how you
> >> think this code should look?
> >
> > My reading of the code was that we require order >= min_order to enter
> > the loop. Given that order doesn't change between the loop header and the
> > if (!order) check, then that must mean we can enter the loop body with
> > order == 0 and order >= min_order, which means that min_order is allowed
> > to be negative. That feels weird.
> >
> > Am I barking up the wrong tree?
> 
> I don't think min_order can be negative.  Certainly we could enter the
> loop with order == 0 and min_order == 0, though.

... and in that case, PageCompound will be false, and we'll call split_page
which won't do anything, so we break out.

> 
> Some examples:
> 
> order = 0, min_order = 0
> -> Want alloc_pages _without_ __GFP_NORETRY.  OK
> -> If alloc_pages fails, return NULL.  OK
> -> If alloc pages succeeds, don't need splitting since single page.  OK

[...]

> I think those are all right.  Did I mess up?  You could certainly
> structure the loop in a different way but you need to make sure you
> handle all of those cases.  If you have an alternate structure that
> handles all those, let's consider it.

Right, I don't think the code is broken, I just think the !order check is
confusing and not needed.

Will

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-08 17:34             ` Doug Anderson
  0 siblings, 0 replies; 27+ messages in thread
From: Doug Anderson @ 2016-04-08 17:34 UTC (permalink / raw)
  To: Will Deacon
  Cc: Yong Wu, Joerg Roedel, Catalin Marinas, Matthias Brugger,
	Robin Murphy, Daniel Kurtz, Tomasz Figa, Arnd Bergmann,
	Lucas Stach, Marek Szyprowski,
	moderated list:ARM/Mediatek SoC support, srv_heupstream,
	linux-kernel, linux-arm-kernel, iommu

Hi,

On Fri, Apr 8, 2016 at 10:30 AM, Will Deacon <will.deacon@arm.com> wrote:
>> > Am I barking up the wrong tree?
>>
>> I don't think min_order can be negative.  Certainly we could enter the
>> loop with order == 0 and min_order == 0, though.
>
> ... and in that case, PageCompound will be false, and we'll call split_page
> which won't do anything, so we break out.
>
>>
>> Some examples:
>>
>> order = 0, min_order = 0
>> -> Want alloc_pages _without_ __GFP_NORETRY.  OK
>> -> If alloc_pages fails, return NULL.  OK
>> -> If alloc pages succeeds, don't need splitting since single page.  OK
>
> [...]
>
>> I think those are all right.  Did I mess up?  You could certainly
>> structure the loop in a different way but you need to make sure you
>> handle all of those cases.  If you have an alternate structure that
>> handles all those, let's consider it.
>
> Right, I don't think the code is broken, I just think the !order check is
> confusing and not needed.

Ah ha!  Got it.  I didn't dig into split_page() to see that it was a
no-op when "order == 0".  I just know that the old code didn't call
split_page() with order == 0 so I assumed that was wise to keep.  If
we don't need to keep that then agreed that the "if" test can simply
be removed.  :)

-Doug

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-08 17:34             ` Doug Anderson
  0 siblings, 0 replies; 27+ messages in thread
From: Doug Anderson @ 2016-04-08 17:34 UTC (permalink / raw)
  To: Will Deacon
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w, Arnd Bergmann,
	Catalin Marinas, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Tomasz Figa, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Daniel Kurtz, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Matthias Brugger, moderated list:ARM/Mediatek SoC support,
	Lucas Stach

Hi,

On Fri, Apr 8, 2016 at 10:30 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
>> > Am I barking up the wrong tree?
>>
>> I don't think min_order can be negative.  Certainly we could enter the
>> loop with order == 0 and min_order == 0, though.
>
> ... and in that case, PageCompound will be false, and we'll call split_page
> which won't do anything, so we break out.
>
>>
>> Some examples:
>>
>> order = 0, min_order = 0
>> -> Want alloc_pages _without_ __GFP_NORETRY.  OK
>> -> If alloc_pages fails, return NULL.  OK
>> -> If alloc pages succeeds, don't need splitting since single page.  OK
>
> [...]
>
>> I think those are all right.  Did I mess up?  You could certainly
>> structure the loop in a different way but you need to make sure you
>> handle all of those cases.  If you have an alternate structure that
>> handles all those, let's consider it.
>
> Right, I don't think the code is broken, I just think the !order check is
> confusing and not needed.

Ah ha!  Got it.  I didn't dig into split_page() to see that it was a
no-op when "order == 0".  I just know that the old code didn't call
split_page() with order == 0 so I assumed that was wise to keep.  If
we don't need to keep that then agreed that the "if" test can simply
be removed.  :)

-Doug

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-08 17:34             ` Doug Anderson
  0 siblings, 0 replies; 27+ messages in thread
From: Doug Anderson @ 2016-04-08 17:34 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Fri, Apr 8, 2016 at 10:30 AM, Will Deacon <will.deacon@arm.com> wrote:
>> > Am I barking up the wrong tree?
>>
>> I don't think min_order can be negative.  Certainly we could enter the
>> loop with order == 0 and min_order == 0, though.
>
> ... and in that case, PageCompound will be false, and we'll call split_page
> which won't do anything, so we break out.
>
>>
>> Some examples:
>>
>> order = 0, min_order = 0
>> -> Want alloc_pages _without_ __GFP_NORETRY.  OK
>> -> If alloc_pages fails, return NULL.  OK
>> -> If alloc pages succeeds, don't need splitting since single page.  OK
>
> [...]
>
>> I think those are all right.  Did I mess up?  You could certainly
>> structure the loop in a different way but you need to make sure you
>> handle all of those cases.  If you have an alternate structure that
>> handles all those, let's consider it.
>
> Right, I don't think the code is broken, I just think the !order check is
> confusing and not needed.

Ah ha!  Got it.  I didn't dig into split_page() to see that it was a
no-op when "order == 0".  I just know that the old code didn't call
split_page() with order == 0 so I assumed that was wise to keep.  If
we don't need to keep that then agreed that the "if" test can simply
be removed.  :)

-Doug

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-11  7:40               ` Yong Wu
  0 siblings, 0 replies; 27+ messages in thread
From: Yong Wu @ 2016-04-11  7:40 UTC (permalink / raw)
  To: Will Deacon, Doug Anderson
  Cc: Joerg Roedel, Catalin Marinas, Matthias Brugger, Robin Murphy,
	Daniel Kurtz, Tomasz Figa, Arnd Bergmann, Lucas Stach,
	Marek Szyprowski, moderated list:ARM/Mediatek SoC support,
	srv_heupstream, linux-kernel, linux-arm-kernel, iommu

On Fri, 2016-04-08 at 10:34 -0700, Doug Anderson wrote:
> Hi,
> 
> On Fri, Apr 8, 2016 at 10:30 AM, Will Deacon <will.deacon@arm.com> wrote:
> >> > Am I barking up the wrong tree?
> >>
> >> I don't think min_order can be negative.  Certainly we could enter the
> >> loop with order == 0 and min_order == 0, though.
> >
> > ... and in that case, PageCompound will be false, and we'll call split_page
> > which won't do anything, so we break out.
> >
> >>
> >> Some examples:
> >>
> >> order = 0, min_order = 0
> >> -> Want alloc_pages _without_ __GFP_NORETRY.  OK
> >> -> If alloc_pages fails, return NULL.  OK
> >> -> If alloc pages succeeds, don't need splitting since single page.  OK
> >
> > [...]
> >
> >> I think those are all right.  Did I mess up?  You could certainly
> >> structure the loop in a different way but you need to make sure you
> >> handle all of those cases.  If you have an alternate structure that
> >> handles all those, let's consider it.
> >
> > Right, I don't think the code is broken, I just think the !order check is
> > confusing and not needed.
> 
> Ah ha!  Got it.  I didn't dig into split_page() to see that it was a
> no-op when "order == 0".  I just know that the old code didn't call
> split_page() with order == 0 so I assumed that was wise to keep.  If
> we don't need to keep that then agreed that the "if" test can simply
> be removed.  :)
> 
> -Doug

Hi Will, Doug,

Thanks very much for review this patch, and Thanks Robin's work,
Currently this one is obsoleted since it's included in [1].

[1]:http://lists.linuxfoundation.org/pipermail/iommu/2016-April/016402.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-11  7:40               ` Yong Wu
  0 siblings, 0 replies; 27+ messages in thread
From: Yong Wu @ 2016-04-11  7:40 UTC (permalink / raw)
  To: Will Deacon, Doug Anderson
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w, Arnd Bergmann,
	Catalin Marinas, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Tomasz Figa, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Daniel Kurtz, Matthias Brugger,
	moderated list:ARM/Mediatek SoC support,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Lucas Stach

On Fri, 2016-04-08 at 10:34 -0700, Doug Anderson wrote:
> Hi,
> 
> On Fri, Apr 8, 2016 at 10:30 AM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> >> > Am I barking up the wrong tree?
> >>
> >> I don't think min_order can be negative.  Certainly we could enter the
> >> loop with order == 0 and min_order == 0, though.
> >
> > ... and in that case, PageCompound will be false, and we'll call split_page
> > which won't do anything, so we break out.
> >
> >>
> >> Some examples:
> >>
> >> order = 0, min_order = 0
> >> -> Want alloc_pages _without_ __GFP_NORETRY.  OK
> >> -> If alloc_pages fails, return NULL.  OK
> >> -> If alloc pages succeeds, don't need splitting since single page.  OK
> >
> > [...]
> >
> >> I think those are all right.  Did I mess up?  You could certainly
> >> structure the loop in a different way but you need to make sure you
> >> handle all of those cases.  If you have an alternate structure that
> >> handles all those, let's consider it.
> >
> > Right, I don't think the code is broken, I just think the !order check is
> > confusing and not needed.
> 
> Ah ha!  Got it.  I didn't dig into split_page() to see that it was a
> no-op when "order == 0".  I just know that the old code didn't call
> split_page() with order == 0 so I assumed that was wise to keep.  If
> we don't need to keep that then agreed that the "if" test can simply
> be removed.  :)
> 
> -Doug

Hi Will, Doug,

Thanks very much for review this patch, and Thanks Robin's work,
Currently this one is obsoleted since it's included in [1].

[1]:http://lists.linuxfoundation.org/pipermail/iommu/2016-April/016402.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages
@ 2016-04-11  7:40               ` Yong Wu
  0 siblings, 0 replies; 27+ messages in thread
From: Yong Wu @ 2016-04-11  7:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2016-04-08 at 10:34 -0700, Doug Anderson wrote:
> Hi,
> 
> On Fri, Apr 8, 2016 at 10:30 AM, Will Deacon <will.deacon@arm.com> wrote:
> >> > Am I barking up the wrong tree?
> >>
> >> I don't think min_order can be negative.  Certainly we could enter the
> >> loop with order == 0 and min_order == 0, though.
> >
> > ... and in that case, PageCompound will be false, and we'll call split_page
> > which won't do anything, so we break out.
> >
> >>
> >> Some examples:
> >>
> >> order = 0, min_order = 0
> >> -> Want alloc_pages _without_ __GFP_NORETRY.  OK
> >> -> If alloc_pages fails, return NULL.  OK
> >> -> If alloc pages succeeds, don't need splitting since single page.  OK
> >
> > [...]
> >
> >> I think those are all right.  Did I mess up?  You could certainly
> >> structure the loop in a different way but you need to make sure you
> >> handle all of those cases.  If you have an alternate structure that
> >> handles all those, let's consider it.
> >
> > Right, I don't think the code is broken, I just think the !order check is
> > confusing and not needed.
> 
> Ah ha!  Got it.  I didn't dig into split_page() to see that it was a
> no-op when "order == 0".  I just know that the old code didn't call
> split_page() with order == 0 so I assumed that was wise to keep.  If
> we don't need to keep that then agreed that the "if" test can simply
> be removed.  :)
> 
> -Doug

Hi Will, Doug,

Thanks very much for review this patch, and Thanks Robin's work,
Currently this one is obsoleted since it's included in [1].

[1]:http://lists.linuxfoundation.org/pipermail/iommu/2016-April/016402.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2016-04-11  7:40 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-28  6:32 [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages Yong Wu
2016-03-28  6:32 ` Yong Wu
2016-03-28  6:32 ` Yong Wu
2016-03-28  6:32 ` [PATCH v2 2/2] arm64/dma-mapping: Add DMA_ATTR_ALLOC_SINGLE_PAGES support Yong Wu
2016-03-28  6:32   ` Yong Wu
2016-03-28  6:32   ` Yong Wu
2016-03-29 17:02 ` [PATCH v2 1/2] dma/iommu: Add pgsize_bitmap confirmation in __iommu_dma_alloc_pages Will Deacon
2016-03-29 17:02   ` Will Deacon
2016-03-29 17:02   ` Will Deacon
2016-04-05 17:03   ` Doug Anderson
2016-04-05 17:03     ` Doug Anderson
2016-04-05 17:03     ` Doug Anderson
2016-04-08 13:07     ` Will Deacon
2016-04-08 13:07       ` Will Deacon
2016-04-08 13:07       ` Will Deacon
2016-04-08 16:50       ` Doug Anderson
2016-04-08 16:50         ` Doug Anderson
2016-04-08 16:50         ` Doug Anderson
2016-04-08 17:30         ` Will Deacon
2016-04-08 17:30           ` Will Deacon
2016-04-08 17:30           ` Will Deacon
2016-04-08 17:34           ` Doug Anderson
2016-04-08 17:34             ` Doug Anderson
2016-04-08 17:34             ` Doug Anderson
2016-04-11  7:40             ` Yong Wu
2016-04-11  7:40               ` Yong Wu
2016-04-11  7:40               ` Yong Wu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.