All of lore.kernel.org
* [PATCH 1/3] iommu/dma: Add some missing #includes
@ 2015-12-18 17:01 ` Robin Murphy
  0 siblings, 0 replies; 26+ messages in thread
From: Robin Murphy @ 2015-12-18 17:01 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

dma-iommu.c was naughtily relying on an implicit transitive #include of
linux/vmalloc.h, which is apparently not present on some architectures.
Add that, plus a couple more headers for other functions which are used
similarly.

Reported-by: kbuild test robot <lkp-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---

Hi Joerg,

here are a couple more minor fixes for some obscure subtleties in the
common DMA code. I see I've just missed the rc5 fixes pull, but I hope
you can pick these up for rc6 (unless of course you're also just about
to disappear for 2 weeks like I am).

Thanks,
Robin.


 drivers/iommu/dma-iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 427fdc1..982e716 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -21,10 +21,13 @@
 
 #include <linux/device.h>
 #include <linux/dma-iommu.h>
+#include <linux/gfp.h>
 #include <linux/huge_mm.h>
 #include <linux/iommu.h>
 #include <linux/iova.h>
 #include <linux/mm.h>
+#include <linux/scatterlist.h>
+#include <linux/vmalloc.h>
 
 int iommu_dma_init(void)
 {
-- 
1.9.1


* [PATCH 2/3] iommu/dma: Use correct offset in map_sg
  2015-12-18 17:01 ` Robin Murphy
@ 2015-12-18 17:01     ` Robin Murphy
  -1 siblings, 0 replies; 26+ messages in thread
From: Robin Murphy @ 2015-12-18 17:01 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

When mapping a non-page-aligned scatterlist entry, we copy the original
offset to the output DMA address before aligning it to hand off to
iommu_map_sg(), then later adding the IOVA page address portion to get
the final mapped address. However, when the IOVA page size is smaller
than the CPU page size, it is the offset within the IOVA page we want,
not that within the CPU page, which can easily be larger than an IOVA
page and thus result in an incorrect final address.

Fix the bug by taking only the IOVA-aligned part of the offset as the
basis of the DMA address, not the whole thing.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/dma-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 982e716..03811e3 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -458,7 +458,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		size_t s_length = s->length;
 		size_t pad_len = (mask - iova_len + 1) & mask;
 
-		sg_dma_address(s) = s->offset;
+		sg_dma_address(s) = s_offset;
 		sg_dma_len(s) = s_length;
 		s->offset -= s_offset;
 		s_length = iova_align(iovad, s_length + s_offset);
-- 
1.9.1


* [PATCH 3/3] iommu/dma: Avoid unlikely high-order allocations
  2015-12-18 17:01 ` Robin Murphy
@ 2015-12-18 17:01     ` Robin Murphy
  -1 siblings, 0 replies; 26+ messages in thread
From: Robin Murphy @ 2015-12-18 17:01 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	dianders-F7+t8E8rja9g9hUCZPvPmw,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Doug reports that the equivalent page allocator on 32-bit ARM exhibits
particularly pathological behaviour under memory pressure when
fragmentation is high, where allocating a 4MB buffer takes tens of
seconds and the number of calls to alloc_pages() is over 9000![1]

We can drastically improve that situation without losing the other
benefits of high-order allocations when they would succeed, by assuming
memory pressure is relatively constant over the course of an allocation,
and not retrying allocations at orders we know to have failed before.
This way, the best-case behaviour remains unchanged, and in the worst
case we should see at most a dozen or so (MAX_ORDER - 1) failed attempts
before falling back to single pages for the remainder of the buffer.

[1]:http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/394660.html

Reported-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/dma-iommu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 03811e3..e37516a 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -194,6 +194,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 {
 	struct page **pages;
 	unsigned int i = 0, array_size = count * sizeof(*pages);
+	unsigned int order = MAX_ORDER;
 
 	if (array_size <= PAGE_SIZE)
 		pages = kzalloc(array_size, GFP_KERNEL);
@@ -207,14 +208,15 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
 
 	while (count) {
 		struct page *page = NULL;
-		int j, order = __fls(count);
+		int j;
 
 		/*
 		 * Higher-order allocations are a convenience rather
 		 * than a necessity, hence using __GFP_NORETRY until
 		 * falling back to single-page allocations.
 		 */
-		for (order = min(order, MAX_ORDER); order > 0; order--) {
+		for (order = min_t(unsigned int, order, __fls(count));
+		     order > 0; order--) {
 			page = alloc_pages(gfp | __GFP_NORETRY, order);
 			if (!page)
 				continue;
-- 
1.9.1


* Re: [PATCH 3/3] iommu/dma: Avoid unlikely high-order allocations
  2015-12-18 17:01     ` Robin Murphy
@ 2015-12-18 22:35         ` Doug Anderson
  -1 siblings, 0 replies; 26+ messages in thread
From: Doug Anderson @ 2015-12-18 22:35 UTC (permalink / raw)
  To: Robin Murphy
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Robin,

On Fri, Dec 18, 2015 at 9:01 AM, Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> wrote:
> Doug reports that the equivalent page allocator on 32-bit ARM exhibits
> particularly pathological behaviour under memory pressure when
> fragmentation is high, where allocating a 4MB buffer takes tens of
> seconds and the number of calls to alloc_pages() is over 9000![1]
>
> We can drastically improve that situation without losing the other
> benefits of high-order allocations when they would succeed, by assuming
> memory pressure is relatively constant over the course of an allocation,
> and not retrying allocations at orders we know to have failed before.
> This way, the best-case behaviour remains unchanged, and in the worst
> case we should see at most a dozen or so (MAX_ORDER - 1) failed attempts
> before falling back to single pages for the remainder of the buffer.
>
> [1]:http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/394660.html
>
> Reported-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/dma-iommu.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)

Presumably it makes sense to update this based on v2 of my patch to
the original code?  AKA: <https://patchwork.kernel.org/patch/7888851/>
?


* Re: [PATCH 2/3] iommu/dma: Use correct offset in map_sg
  2015-12-18 17:01     ` Robin Murphy
@ 2015-12-28 16:05         ` Joerg Roedel
  -1 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2015-12-28 16:05 UTC (permalink / raw)
  To: Robin Murphy
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Fri, Dec 18, 2015 at 05:01:47PM +0000, Robin Murphy wrote:
> When mapping a non-page-aligned scatterlist entry, we copy the original
> offset to the output DMA address before aligning it to hand off to
> iommu_map_sg(), then later adding the IOVA page address portion to get
> the final mapped address. However, when the IOVA page size is smaller
> than the CPU page size, it is the offset within the IOVA page we want,
> not that within the CPU page, which can easily be larger than an IOVA
> page and thus result in an incorrect final address.
> 
> Fix the bug by taking only the IOVA-aligned part of the offset as the
> basis of the DMA address, not the whole thing.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/dma-iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 982e716..03811e3 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -458,7 +458,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>  		size_t s_length = s->length;
>  		size_t pad_len = (mask - iova_len + 1) & mask;
>  
> -		sg_dma_address(s) = s->offset;
> +		sg_dma_address(s) = s_offset;

Does not apply on v4.4-rc7.


* Re: [PATCH 1/3] iommu/dma: Add some missing #includes
  2015-12-18 17:01 ` Robin Murphy
@ 2015-12-28 16:08     ` Joerg Roedel
  -1 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2015-12-28 16:08 UTC (permalink / raw)
  To: Robin Murphy
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Fri, Dec 18, 2015 at 05:01:46PM +0000, Robin Murphy wrote:
> Hi Joerg,
> 
> here are a couple more minor fixes for some obscure subtleties in the
> common DMA code. I see I've just missed the rc5 fixes pull, but I hope
> you can pick these up for rc6 (unless of course you're also just about
> to disappear for 2 weeks like I am).

Applied 1 and 3 to iommu/fixes, thanks.


* Re: [PATCH 2/3] iommu/dma: Use correct offset in map_sg
  2015-12-28 16:05         ` Joerg Roedel
@ 2016-01-04 16:08             ` Robin Murphy
  -1 siblings, 0 replies; 26+ messages in thread
From: Robin Murphy @ 2016-01-04 16:08 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On 28/12/15 16:05, Joerg Roedel wrote:
> On Fri, Dec 18, 2015 at 05:01:47PM +0000, Robin Murphy wrote:
>> When mapping a non-page-aligned scatterlist entry, we copy the original
>> offset to the output DMA address before aligning it to hand off to
>> iommu_map_sg(), then later adding the IOVA page address portion to get
>> the final mapped address. However, when the IOVA page size is smaller
>> than the CPU page size, it is the offset within the IOVA page we want,
>> not that within the CPU page, which can easily be larger than an IOVA
>> page and thus result in an incorrect final address.
>>
>> Fix the bug by taking only the IOVA-aligned part of the offset as the
>> basis of the DMA address, not the whole thing.
>>
>> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
>> ---
>>   drivers/iommu/dma-iommu.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 982e716..03811e3 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -458,7 +458,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>>   		size_t s_length = s->length;
>>   		size_t pad_len = (mask - iova_len + 1) & mask;
>>
>> -		sg_dma_address(s) = s->offset;
>> +		sg_dma_address(s) = s_offset;
>
> Does not apply on v4.4-rc7.

Ah, I did these on top of the map_sg tweak[1] locally, and failed to 
spot that that leaks into a context line here. There's no actual 
dependency so I'll rebase and repost this bugfix momentarily - as that 
other thread seems to have stalled, I'll probably have another crack at 
a proper scatterlist segment merging implementation sometime after the 
merge window.

Thanks,
Robin.

[1]:http://thread.gmane.org/gmane.linux.kernel.iommu/11512


* [PATCH RESEND] iommu/dma: Use correct offset in map_sg
  2015-12-18 17:01     ` Robin Murphy
@ 2016-01-04 16:19         ` Robin Murphy
  -1 siblings, 0 replies; 26+ messages in thread
From: Robin Murphy @ 2016-01-04 16:19 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

When mapping a non-page-aligned scatterlist entry, we copy the original
offset to the output DMA address before aligning it to hand off to
iommu_map_sg(), then later adding the IOVA page address portion to get
the final mapped address. However, when the IOVA page size is smaller
than the CPU page size, it is the offset within the IOVA page we want,
not that within the CPU page, which can easily be larger than an IOVA
page and thus result in an incorrect final address.

Fix the bug by taking only the IOVA-aligned part of the offset as the
basis of the DMA address, not the whole thing.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---

Rebased onto iommu/fixes.

 drivers/iommu/dma-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 2e7417f..72d6182 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -458,7 +458,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		size_t s_offset = iova_offset(iovad, s->offset);
 		size_t s_length = s->length;
 
-		sg_dma_address(s) = s->offset;
+		sg_dma_address(s) = s_offset;
 		sg_dma_len(s) = s_length;
 		s->offset -= s_offset;
 		s_length = iova_align(iovad, s_length + s_offset);
-- 
1.9.1


* Re: [PATCH RESEND] iommu/dma: Use correct offset in map_sg
  2016-01-04 16:19         ` Robin Murphy
@ 2016-01-07 12:38             ` Joerg Roedel
  -1 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2016-01-07 12:38 UTC (permalink / raw)
  To: Robin Murphy
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Mon, Jan 04, 2016 at 04:19:42PM +0000, Robin Murphy wrote:
> When mapping a non-page-aligned scatterlist entry, we copy the original
> offset to the output DMA address before aligning it to hand off to
> iommu_map_sg(), then later adding the IOVA page address portion to get
> the final mapped address. However, when the IOVA page size is smaller
> than the CPU page size, it is the offset within the IOVA page we want,
> not that within the CPU page, which can easily be larger than an IOVA
> page and thus result in an incorrect final address.
> 
> Fix the bug by taking only the IOVA-aligned part of the offset as the
> basis of the DMA address, not the whole thing.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>

Applied, thanks.


* Re: [PATCH 3/3] iommu/dma: Avoid unlikely high-order allocations
  2015-12-18 22:35         ` Doug Anderson
@ 2016-01-19  2:55             ` Yong Wu
  -1 siblings, 0 replies; 26+ messages in thread
From: Yong Wu @ 2016-01-19  2:55 UTC (permalink / raw)
  To: Robin Murphy, Doug Anderson
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Daniel Kurtz,
	srv_heupstream-NuS5LvNUpcJWk0Htik3J/w,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Fri, 2015-12-18 at 14:35 -0800, Doug Anderson wrote:
> Robin,
> 
> On Fri, Dec 18, 2015 at 9:01 AM, Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> wrote:
> > Doug reports that the equivalent page allocator on 32-bit ARM exhibits
> > particularly pathalogical behaviour under memory pressure when
> > fragmentation is high, where allocating a 4MB buffer takes tens of
> > seconds and the number of calls to alloc_pages() is over 9000![1]
> >
> > We can drastically improve that situation without losing the other
> > benefits of high-order allocations when they would succeed, by assuming
> > memory pressure is relatively constant over the course of an allocation,
> > and not retrying allocations at orders we know to have failed before.
> > This way, the best-case behaviour remains unchanged, and in the worst
> > case we should see at most a dozen or so (MAX_ORDER - 1) failed attempts
> > before falling back to single pages for the remainder of the buffer.
> >
> > [1]:http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/394660.html
> >
> > Reported-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> > Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> > ---
> >  drivers/iommu/dma-iommu.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> Presumably it makes sense to update this based on v2 of my patch to
> the original code?  AKA: <https://patchwork.kernel.org/patch/7888851/>
> ?

Hi Robin,

    Thanks very much for this patch.

    And Douglas has sent his v2, which improves the flow of alloc_pages.
Do you have any plan to update this patch based on his v2, or any
concerns about it?




* Re: [PATCH 2/3] iommu/dma: Use correct offset in map_sg
  2015-12-18 17:01     ` Robin Murphy
@ 2016-03-09  7:50         ` Magnus Damm
  -1 siblings, 0 replies; 26+ messages in thread
From: Magnus Damm @ 2016-03-09  7:50 UTC (permalink / raw)
  To: Robin Murphy
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Sat, Dec 19, 2015 at 2:01 AM, Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> wrote:
> When mapping a non-page-aligned scatterlist entry, we copy the original
> offset to the output DMA address before aligning it to hand off to
> iommu_map_sg(), then later adding the IOVA page address portion to get
> the final mapped address. However, when the IOVA page size is smaller
> than the CPU page size, it is the offset within the IOVA page we want,
> not that within the CPU page, which can easily be larger than an IOVA
> page and thus result in an incorrect final address.
>
> Fix the bug by taking only the IOVA-aligned part of the offset as the
> basis of the DMA address, not the whole thing.
>
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/dma-iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 982e716..03811e3 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -458,7 +458,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>                 size_t s_length = s->length;
>                 size_t pad_len = (mask - iova_len + 1) & mask;
>
> -               sg_dma_address(s) = s->offset;
> +               sg_dma_address(s) = s_offset;
>                 sg_dma_len(s) = s_length;
>                 s->offset -= s_offset;
>                 s_length = iova_align(iovad, s_length + s_offset);
> --
> 1.9.1

Hi Robin,

Thanks a lot for your fix! While I don't have any doubt that your
patch fixes a real issue, I wonder if another update is needed.
Depending on what is expected, perhaps just the comment above the code
wants an update, or maybe the "un-swizzling" needs more work. With this
patch applied, the code looks only semi-complete to me at this point.

Currently the comment just above the hunk says:

    /*
     * Work out how much IOVA space we need, and align the segments to
     * IOVA granules for the IOMMU driver to handle. With some clever
     * trickery we can modify the list in-place, but reversibly, by
     * hiding the original data in the as-yet-unused DMA fields.
     */

With your fix the "original data" is no longer stored in the unused
DMA fields. Instead the s_offset value is stored as modified in
sg_dma_address(), which in turn will make the iommu_dma_map_sg()
function return with a modified s->offset both on success and
failure.

Perhaps this is intentional design, or maybe __invalidate_sg() and
__finalise_sg() both need to support rollback? Any ideas?

Thanks,

/ magnus

My untested hack to support rollback on top of next-20160308 does
something like this...

--- 0001/drivers/iommu/dma-iommu.c
+++ work/drivers/iommu/dma-iommu.c    2016-03-09 16:33:21.250513000 +0900
@@ -392,7 +392,7 @@ void iommu_dma_unmap_page(struct device
  * Handling IOVA concatenation can come later, if needed
  */
 static int __finalise_sg(struct device *dev, struct scatterlist *sg, int nents,
-        dma_addr_t dma_addr)
+             dma_addr_t dma_addr, struct iova_domain *iovad)
 {
     struct scatterlist *s;
     int i;
@@ -405,7 +405,7 @@ static int __finalise_sg(struct device *

         s->offset = s_offset;
         s->length = s_length;
-        sg_dma_address(s) = dma_addr + s_offset;
+        sg_dma_address(s) = dma_addr + iova_offset(iovad, s_offset);
         dma_addr += s_dma_len;
     }
     return i;
@@ -455,11 +455,13 @@ int iommu_dma_map_sg(struct device *dev,
      * hiding the original data in the as-yet-unused DMA fields.
      */
     for_each_sg(sg, s, nents, i) {
-        size_t s_offset = iova_offset(iovad, s->offset);
+        size_t s_offset = s->offset;
         size_t s_length = s->length;

         sg_dma_address(s) = s_offset;
         sg_dma_len(s) = s_length;
+
+        s_offset = iova_offset(iovad, s_offset);
         s->offset -= s_offset;
         s_length = iova_align(iovad, s_length + s_offset);
         s->length = s_length;
@@ -494,7 +496,7 @@ int iommu_dma_map_sg(struct device *dev,
     if (iommu_map_sg(domain, dma_addr, sg, nents, prot) < iova_len)
         goto out_free_iova;

-    return __finalise_sg(dev, sg, nents, dma_addr);
+    return __finalise_sg(dev, sg, nents, dma_addr, iovad);

 out_free_iova:
     __free_iova(iovad, iova);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/3] iommu/dma: Use correct offset in map_sg
  2016-03-09  7:50         ` Magnus Damm
@ 2016-03-09 15:00             ` Robin Murphy
  -1 siblings, 0 replies; 26+ messages in thread
From: Robin Murphy @ 2016-03-09 15:00 UTC (permalink / raw)
  To: Magnus Damm
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Hi Magnus,

Thanks for bringing this up...

On 09/03/16 07:50, Magnus Damm wrote:
> On Sat, Dec 19, 2015 at 2:01 AM, Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> wrote:
>> When mapping a non-page-aligned scatterlist entry, we copy the original
>> offset to the output DMA address before aligning it to hand off to
>> iommu_map_sg(), then later adding the IOVA page address portion to get
>> the final mapped address. However, when the IOVA page size is smaller
>> than the CPU page size, it is the offset within the IOVA page we want,
>> not that within the CPU page, which can easily be larger than an IOVA
>> page and thus result in an incorrect final address.
>>
>> Fix the bug by taking only the IOVA-aligned part of the offset as the
>> basis of the DMA address, not the whole thing.
>>
>> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
>> ---
>>   drivers/iommu/dma-iommu.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 982e716..03811e3 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -458,7 +458,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>>                  size_t s_length = s->length;
>>                  size_t pad_len = (mask - iova_len + 1) & mask;
>>
>> -               sg_dma_address(s) = s->offset;
>> +               sg_dma_address(s) = s_offset;
>>                  sg_dma_len(s) = s_length;
>>                  s->offset -= s_offset;
>>                  s_length = iova_align(iovad, s_length + s_offset);
>> --
>> 1.9.1
>
> Hi Robin,
>
> Thanks a lot for your fix! While I don't have any doubt that your
> patch fixes a real issue I wonder if another update is needed.
> Depending on what is expected perhaps just the comment above the code
> wants an update or maybe the "un-swizzling" needs more work. With this
> patch applied the code looks semi-complete to me at this point.
>
> Currently the comment just above the hunk says:
>
>      /*
>       * Work out how much IOVA space we need, and align the segments to
>       * IOVA granules for the IOMMU driver to handle. With some clever
>       * trickery we can modify the list in-place, but reversibly, by
>       * hiding the original data in the as-yet-unused DMA fields.
>       */
>
> With your fix the "original data" is no longer stored in the unused
> DMA fields.

OK, so we're now moving some of the data rather than taking a literal
copy, but the point remains that we're not throwing any information away
- we can move the remainder back again if necessary. As far as I'm
concerned the comment is still valid, but if it's open to
misinterpretation I can try rephrasing it.

> Instead the s_offset value is stored as modified in
> sg_dma_address() which in turn will make the iommu_dma_map_sg()
> function return with modified sg->s_offset both on success and
> failure.
>
> Perhaps this is intentional design, or maybe __invalidate_sg() and
> __finalize_sg() both need to support roll back? Any ideas?

What's missing is that some idiot forgot about the hard-to-exercise 
failure path and didn't update __invalidate_sg() to match. I'll get 
right on that...

Robin.

> Thanks,
>
> / magnus
>
> My untested hack to support roll back on top of next-20160308 does
> something like this...
>
> --- 0001/drivers/iommu/dma-iommu.c
> +++ work/drivers/iommu/dma-iommu.c    2016-03-09 16:33:21.250513000 +0900
> @@ -392,7 +392,7 @@ void iommu_dma_unmap_page(struct device
>    * Handling IOVA concatenation can come later, if needed
>    */
>   static int __finalise_sg(struct device *dev, struct scatterlist *sg, int nents,
> -        dma_addr_t dma_addr)
> +             dma_addr_t dma_addr, struct iova_domain *iovad)
>   {
>       struct scatterlist *s;
>       int i;
> @@ -405,7 +405,7 @@ static int __finalise_sg(struct device *
>
>           s->offset = s_offset;
>           s->length = s_length;
> -        sg_dma_address(s) = dma_addr + s_offset;
> +        sg_dma_address(s) = dma_addr + iova_offset(iovad, s_offset);
>           dma_addr += s_dma_len;
>       }
>       return i;
> @@ -455,11 +455,13 @@ int iommu_dma_map_sg(struct device *dev,
>        * hiding the original data in the as-yet-unused DMA fields.
>        */
>       for_each_sg(sg, s, nents, i) {
> -        size_t s_offset = iova_offset(iovad, s->offset);
> +        size_t s_offset = s->offset;
>           size_t s_length = s->length;
>
>           sg_dma_address(s) = s_offset;
>           sg_dma_len(s) = s_length;
> +
> +        s_offset = iova_offset(iovad, s_offset);
>           s->offset -= s_offset;
>           s_length = iova_align(iovad, s_length + s_offset);
>           s->length = s_length;
> @@ -494,7 +496,7 @@ int iommu_dma_map_sg(struct device *dev,
>       if (iommu_map_sg(domain, dma_addr, sg, nents, prot) < iova_len)
>           goto out_free_iova;
>
> -    return __finalise_sg(dev, sg, nents, dma_addr);
> +    return __finalise_sg(dev, sg, nents, dma_addr, iovad);
>
>   out_free_iova:
>       __free_iova(iovad, iova);
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/3] iommu/dma: Use correct offset in map_sg
  2016-03-09 15:00             ` Robin Murphy
@ 2016-03-10  7:47                 ` Magnus Damm
  -1 siblings, 0 replies; 26+ messages in thread
From: Magnus Damm @ 2016-03-10  7:47 UTC (permalink / raw)
  To: Robin Murphy
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Hi Robin,

On Thu, Mar 10, 2016 at 12:00 AM, Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> wrote:
> Hi Magnus,
>
> Thanks for bringing this up...

No worries!

> On 09/03/16 07:50, Magnus Damm wrote:
>>
>> On Sat, Dec 19, 2015 at 2:01 AM, Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
>> wrote:
>>>
>>> When mapping a non-page-aligned scatterlist entry, we copy the original
>>> offset to the output DMA address before aligning it to hand off to
>>> iommu_map_sg(), then later adding the IOVA page address portion to get
>>> the final mapped address. However, when the IOVA page size is smaller
>>> than the CPU page size, it is the offset within the IOVA page we want,
>>> not that within the CPU page, which can easily be larger than an IOVA
>>> page and thus result in an incorrect final address.
>>>
>>> Fix the bug by taking only the IOVA-aligned part of the offset as the
>>> basis of the DMA address, not the whole thing.
>>>
>>> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
>>> ---
>>>   drivers/iommu/dma-iommu.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>>> index 982e716..03811e3 100644
>>> --- a/drivers/iommu/dma-iommu.c
>>> +++ b/drivers/iommu/dma-iommu.c
>>> @@ -458,7 +458,7 @@ int iommu_dma_map_sg(struct device *dev, struct
>>> scatterlist *sg,
>>>                  size_t s_length = s->length;
>>>                  size_t pad_len = (mask - iova_len + 1) & mask;
>>>
>>> -               sg_dma_address(s) = s->offset;
>>> +               sg_dma_address(s) = s_offset;
>>>                  sg_dma_len(s) = s_length;
>>>                  s->offset -= s_offset;
>>>                  s_length = iova_align(iovad, s_length + s_offset);
>>> --
>>> 1.9.1
>>
>>
>> Hi Robin,
>>
>> Thanks a lot for your fix! While I don't have any doubt that your
>> patch fixes a real issue I wonder if another update is needed.
>> Depending on what is expected perhaps just the comment above the code
>> wants an update or maybe the "un-swizzling" needs more work. With this
>> patch applied the code looks semi-complete to me at this point.
>>
>> Currently the comment just above the hunk says:
>>
>>      /*
>>       * Work out how much IOVA space we need, and align the segments to
>>       * IOVA granules for the IOMMU driver to handle. With some clever
>>       * trickery we can modify the list in-place, but reversibly, by
>>       * hiding the original data in the as-yet-unused DMA fields.
>>       */
>>
>> With your fix the "original data" is no longer stored in the unused
>> DMA fields.
>
>
> OK, so we're now  moving some of the data rather than taking a literal copy,
> but the point remains that we're not throwing any information away - we can
> move the remainder back again if necessary. As far as I'm concerned the
> comment is still valid, but if it's open to misinterpretation I can try
> rephrasing it.

Thanks, I agree with you about the comment! As long as the fields can
be restored everything is fine.

>> Instead the s_offset value is stored as modified in
>> sg_dma_address() which in turn will make the iommu_dma_map_sg()
>> function return with modified sg->s_offset both on success and
>> failure.
>>
>> Perhaps this is intentional design, or maybe __invalidate_sg() and
>> __finalize_sg() both need to support roll back? Any ideas?
>
>
> What's missing is that some idiot forgot about the hard-to-exercise failure
> path and didn't update __invalidate_sg() to match. I'll get right on that...

Oh well. Fixing the error case sounds good. I don't have any special
test case to trigger anything, so testing is a bit difficult for me.
Apart from that I'm happy to help - let me know if you can think of
something.

Cheers,

/ magnus

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2016-03-10  7:47 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-18 17:01 [PATCH 1/3] iommu/dma: Add some missing #includes Robin Murphy
2015-12-18 17:01 ` Robin Murphy
     [not found] ` <9a84191ed813c23db7901f8c73f514d081495bdf.1450457730.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2015-12-18 17:01   ` [PATCH 2/3] iommu/dma: Use correct offset in map_sg Robin Murphy
2015-12-18 17:01     ` Robin Murphy
     [not found]     ` <4812a34857b081e45c36d7e887840f3675da74dc.1450457730.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2015-12-28 16:05       ` Joerg Roedel
2015-12-28 16:05         ` Joerg Roedel
     [not found]         ` <20151228160551.GA17673-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2016-01-04 16:08           ` Robin Murphy
2016-01-04 16:08             ` Robin Murphy
2016-01-04 16:19       ` [PATCH RESEND] " Robin Murphy
2016-01-04 16:19         ` Robin Murphy
     [not found]         ` <8317d001da4f48831fa23d8d7729a4659ac72b49.1451924092.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-01-07 12:38           ` Joerg Roedel
2016-01-07 12:38             ` Joerg Roedel
2016-03-09  7:50       ` [PATCH 2/3] " Magnus Damm
2016-03-09  7:50         ` Magnus Damm
     [not found]         ` <CANqRtoSHa1fzQge4ntK9Jt_XFiL0AKWtUti93-cwS-aOkJQcjg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-09 15:00           ` Robin Murphy
2016-03-09 15:00             ` Robin Murphy
     [not found]             ` <56E03A83.9050909-5wv7dgnIgG8@public.gmane.org>
2016-03-10  7:47               ` Magnus Damm
2016-03-10  7:47                 ` Magnus Damm
2015-12-18 17:01   ` [PATCH 3/3] iommu/dma: Avoid unlikely high-order allocations Robin Murphy
2015-12-18 17:01     ` Robin Murphy
     [not found]     ` <8014a146e90282b8cbc5868cee0d01776670d9a6.1450457730.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2015-12-18 22:35       ` Doug Anderson
2015-12-18 22:35         ` Doug Anderson
     [not found]         ` <CAD=FV=WqJHjb3zoZn=8OPO2iKH1k1sMvXkVBkyM6pAyvo6PgDQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-01-19  2:55           ` Yong Wu
2016-01-19  2:55             ` Yong Wu
2015-12-28 16:08   ` [PATCH 1/3] iommu/dma: Add some missing #includes Joerg Roedel
2015-12-28 16:08     ` Joerg Roedel
