All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] dma-debug: disable DMA_API_DEBUG for now
@ 2009-06-05  8:33 FUJITA Tomonori
  2009-06-05 10:41   ` Joerg Roedel
  0 siblings, 1 reply; 30+ messages in thread
From: FUJITA Tomonori @ 2009-06-05  8:33 UTC (permalink / raw)
  To: mingo, lethal
  Cc: joerg.roedel, just.for.lkml, hancockrwd, jens.axboe, bharrosh,
	linux-kernel, linux-scsi

dma-debugs wrongly assumes that no multiple DMA transfers are
performed against the same dma address on one device at the same
time. However it's true only with hardware IOMMUs. For example, an
application can easily send the same buffer twice with different
lengths to one device by using DIO and AIO. If these requests are not
unmapped in the same order in which they were mapped,
hash_bucket_find() finds a wrong entry and gives a false warning.

We should fix this before 2.6.30 release. Seems that there is no
easy way to fix it. I think that it's better to just disable
dma-debug for now.

Torsten Kaiser found this bug with the RAID1 configuration:

http://marc.info/?t=124336541900008&r=1&w=2

Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 arch/sh/Kconfig  |    1 -
 arch/x86/Kconfig |    1 -
 2 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index e7390dd..4fb19d7 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -14,7 +14,6 @@ config SUPERH
 	select HAVE_GENERIC_DMA_COHERENT
 	select HAVE_IOREMAP_PROT if MMU
 	select HAVE_ARCH_TRACEHOOK
-	select HAVE_DMA_API_DEBUG
 	help
 	  The SuperH is a RISC processor targeted for use in embedded systems
 	  and consumer electronics; it was also used in the Sega Dreamcast
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a6efe0a..b3cf490 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -42,7 +42,6 @@ config X86
 	select HAVE_GENERIC_DMA_COHERENT if X86_32
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select USER_STACKTRACE_SUPPORT
-	select HAVE_DMA_API_DEBUG
 	select HAVE_KERNEL_GZIP
 	select HAVE_KERNEL_BZIP2
 	select HAVE_KERNEL_LZMA
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-05  8:33 [PATCH] dma-debug: disable DMA_API_DEBUG for now FUJITA Tomonori
@ 2009-06-05 10:41   ` Joerg Roedel
  0 siblings, 0 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-05 10:41 UTC (permalink / raw)
  To: FUJITA Tomonori, Linus Torvalds
  Cc: mingo, lethal, just.for.lkml, hancockrwd, jens.axboe, bharrosh,
	linux-kernel, linux-scsi


On Fri, Jun 05, 2009 at 05:33:14PM +0900, FUJITA Tomonori wrote:
> dma-debugs wrongly assumes that no multiple DMA transfers are
> performed against the same dma address on one device at the same
> time. However it's true only with hardware IOMMUs. For example, an
> application can easily send the same buffer twice with different
> lengths to one device by using DIO and AIO. If these requests are not
> unmapped in the same order in which they were mapped,
> hash_bucket_find() finds a wrong entry and gives a false warning.
> 
> We should fix this before 2.6.30 release. Seems that there is no
> easy way to fix it. I think that it's better to just disable
> dma-debug for now.
> 
> Torsten Kaiser found this bug with the RAID1 configuration:
> 
> http://marc.info/?t=124336541900008&r=1&w=2
> 

Argh, I never thought that this can happen. But its not explicitly
forbidden so we have to handle this situation. Thanks for tracking down
the bug to this issue.
However, I think there is a somehow simple fix for the issue. Patch is
attached. Its the least intrusive way I can think of to fix this
problem.
But its up to Linus/Ingo to decide if it can be accepted at this very
late point in the cycle. Since dma-debug is new with 2.6.30 it will at
least not introduce any regression. The patch is boot and basically
load-tested with some disk and network load and showed no issues on my
test machine. The diffstat looks big but more than the half of the patch
adds comments to explain what it does. So the diffstat looks bigger than
the actual change.
Linus, if you think my patch is not acceptable at this point then please
apply the patch from FUJITA Tomonori.

Regards,

	Joerg

>From c0baa4d43a674c9cc321af3b23a6d5c9b46d65bd Mon Sep 17 00:00:00 2001
From: Joerg Roedel <joerg.roedel@amd.com>
Date: Fri, 5 Jun 2009 12:01:35 +0200
Subject: [PATCH] dma-debug: change hash_bucket_find from first-fit to best-fit

Some device drivers map the same physical address multiple times to a
dma address. Without an IOMMU this results in the same dma address being
put into the dma-debug hash multiple times. With a first-fit match in
hash_bucket_find() this function may return the wrong dma_debug_entry.
This can result in false positive warnings. This patch fixes it by
changing the first-fit behavior of hash_bucket_find() into a best-fit
algorithm.

Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 lib/dma-debug.c |   43 +++++++++++++++++++++++++++++++++++++++----
 1 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index 69da09a..2b16536 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -185,15 +185,50 @@ static void put_hash_bucket(struct hash_bucket *bucket,
 static struct dma_debug_entry *hash_bucket_find(struct hash_bucket *bucket,
 						struct dma_debug_entry *ref)
 {
-	struct dma_debug_entry *entry;
+	struct dma_debug_entry *entry, *ret = NULL;
+	int matches = 0, match_lvl, last_lvl = 0;
 
 	list_for_each_entry(entry, &bucket->list, list) {
-		if ((entry->dev_addr == ref->dev_addr) &&
-		    (entry->dev == ref->dev))
+		if ((entry->dev_addr != ref->dev_addr) ||
+		    (entry->dev != ref->dev))
+			continue;
+
+		/*
+		 * Some drivers map the same physical address multiple
+		 * times. Without a hardware IOMMU this results in the
+		 * same device addresses being put into the dma-debug
+		 * hash multiple times too. This can result in false
+		 * positives being reported. Therfore we implement a
+		 * best-fit algorithm here which returns the entry from
+		 * the hash which fits best to the reference value
+		 * instead of the first-fit.
+		 */
+		matches += 1;
+		match_lvl = 0;
+		entry->size      == ref->size      ? ++match_lvl : match_lvl;
+		entry->type      == ref->type      ? ++match_lvl : match_lvl;
+		entry->direction == ref->direction ? ++match_lvl : match_lvl;
+
+		if (match_lvl == 3) {
+			/* perfect-fit - return the result */
 			return entry;
+		} else if (match_lvl > last_lvl) {
+			/*
+			 * We found an entry that fits better then the
+			 * previous one
+			 */
+			last_lvl = match_lvl;
+			ret      = entry;
+		}
 	}
 
-	return NULL;
+	/*
+	 * If we have multiple matches but no perfect-fit, just return
+	 * NULL.
+	 */
+	ret = (matches == 1) ? ret : NULL;
+
+	return ret;
 }
 
 /*
-- 
1.6.3.1




-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
@ 2009-06-05 10:41   ` Joerg Roedel
  0 siblings, 0 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-05 10:41 UTC (permalink / raw)
  To: FUJITA Tomonori, Linus Torvalds
  Cc: mingo, lethal, just.for.lkml, hancockrwd, jens.axboe, bharrosh,
	linux-kernel, linux-scsi


On Fri, Jun 05, 2009 at 05:33:14PM +0900, FUJITA Tomonori wrote:
> dma-debugs wrongly assumes that no multiple DMA transfers are
> performed against the same dma address on one device at the same
> time. However it's true only with hardware IOMMUs. For example, an
> application can easily send the same buffer twice with different
> lengths to one device by using DIO and AIO. If these requests are not
> unmapped in the same order in which they were mapped,
> hash_bucket_find() finds a wrong entry and gives a false warning.
> 
> We should fix this before 2.6.30 release. Seems that there is no
> easy way to fix it. I think that it's better to just disable
> dma-debug for now.
> 
> Torsten Kaiser found this bug with the RAID1 configuration:
> 
> http://marc.info/?t=124336541900008&r=1&w=2
> 

Argh, I never thought that this can happen. But its not explicitly
forbidden so we have to handle this situation. Thanks for tracking down
the bug to this issue.
However, I think there is a somehow simple fix for the issue. Patch is
attached. Its the least intrusive way I can think of to fix this
problem.
But its up to Linus/Ingo to decide if it can be accepted at this very
late point in the cycle. Since dma-debug is new with 2.6.30 it will at
least not introduce any regression. The patch is boot and basically
load-tested with some disk and network load and showed no issues on my
test machine. The diffstat looks big but more than the half of the patch
adds comments to explain what it does. So the diffstat looks bigger than
the actual change.
Linus, if you think my patch is not acceptable at this point then please
apply the patch from FUJITA Tomonori.

Regards,

	Joerg

From c0baa4d43a674c9cc321af3b23a6d5c9b46d65bd Mon Sep 17 00:00:00 2001
From: Joerg Roedel <joerg.roedel@amd.com>
Date: Fri, 5 Jun 2009 12:01:35 +0200
Subject: [PATCH] dma-debug: change hash_bucket_find from first-fit to best-fit

Some device drivers map the same physical address multiple times to a
dma address. Without an IOMMU this results in the same dma address being
put into the dma-debug hash multiple times. With a first-fit match in
hash_bucket_find() this function may return the wrong dma_debug_entry.
This can result in false positive warnings. This patch fixes it by
changing the first-fit behavior of hash_bucket_find() into a best-fit
algorithm.

Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 lib/dma-debug.c |   43 +++++++++++++++++++++++++++++++++++++++----
 1 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index 69da09a..2b16536 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -185,15 +185,50 @@ static void put_hash_bucket(struct hash_bucket *bucket,
 static struct dma_debug_entry *hash_bucket_find(struct hash_bucket *bucket,
 						struct dma_debug_entry *ref)
 {
-	struct dma_debug_entry *entry;
+	struct dma_debug_entry *entry, *ret = NULL;
+	int matches = 0, match_lvl, last_lvl = 0;
 
 	list_for_each_entry(entry, &bucket->list, list) {
-		if ((entry->dev_addr == ref->dev_addr) &&
-		    (entry->dev == ref->dev))
+		if ((entry->dev_addr != ref->dev_addr) ||
+		    (entry->dev != ref->dev))
+			continue;
+
+		/*
+		 * Some drivers map the same physical address multiple
+		 * times. Without a hardware IOMMU this results in the
+		 * same device addresses being put into the dma-debug
+		 * hash multiple times too. This can result in false
+		 * positives being reported. Therfore we implement a
+		 * best-fit algorithm here which returns the entry from
+		 * the hash which fits best to the reference value
+		 * instead of the first-fit.
+		 */
+		matches += 1;
+		match_lvl = 0;
+		entry->size      == ref->size      ? ++match_lvl : match_lvl;
+		entry->type      == ref->type      ? ++match_lvl : match_lvl;
+		entry->direction == ref->direction ? ++match_lvl : match_lvl;
+
+		if (match_lvl == 3) {
+			/* perfect-fit - return the result */
 			return entry;
+		} else if (match_lvl > last_lvl) {
+			/*
+			 * We found an entry that fits better then the
+			 * previous one
+			 */
+			last_lvl = match_lvl;
+			ret      = entry;
+		}
 	}
 
-	return NULL;
+	/*
+	 * If we have multiple matches but no perfect-fit, just return
+	 * NULL.
+	 */
+	ret = (matches == 1) ? ret : NULL;
+
+	return ret;
 }
 
 /*
-- 
1.6.3.1




-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-05 10:41   ` Joerg Roedel
@ 2009-06-05 11:38     ` FUJITA Tomonori
  -1 siblings, 0 replies; 30+ messages in thread
From: FUJITA Tomonori @ 2009-06-05 11:38 UTC (permalink / raw)
  To: joerg.roedel
  Cc: fujita.tomonori, torvalds, mingo, lethal, just.for.lkml,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi

On Fri, 5 Jun 2009 12:41:33 +0200
Joerg Roedel <joerg.roedel@amd.com> wrote:

> 
> On Fri, Jun 05, 2009 at 05:33:14PM +0900, FUJITA Tomonori wrote:
> > dma-debugs wrongly assumes that no multiple DMA transfers are
> > performed against the same dma address on one device at the same
> > time. However it's true only with hardware IOMMUs. For example, an
> > application can easily send the same buffer twice with different
> > lengths to one device by using DIO and AIO. If these requests are not
> > unmapped in the same order in which they were mapped,
> > hash_bucket_find() finds a wrong entry and gives a false warning.
> > 
> > We should fix this before 2.6.30 release. Seems that there is no
> > easy way to fix it. I think that it's better to just disable
> > dma-debug for now.
> > 
> > Torsten Kaiser found this bug with the RAID1 configuration:
> > 
> > http://marc.info/?t=124336541900008&r=1&w=2
> > 
> 
> Argh, I never thought that this can happen. But its not explicitly
> forbidden so we have to handle this situation. Thanks for tracking down
> the bug to this issue.
> However, I think there is a somehow simple fix for the issue. Patch is
> attached. Its the least intrusive way I can think of to fix this
> problem.
> But its up to Linus/Ingo to decide if it can be accepted at this very
> late point in the cycle. Since dma-debug is new with 2.6.30 it will at
> least not introduce any regression. The patch is boot and basically
> load-tested with some disk and network load and showed no issues on my
> test machine. The diffstat looks big but more than the half of the patch
> adds comments to explain what it does. So the diffstat looks bigger than
> the actual change.
> Linus, if you think my patch is not acceptable at this point then please
> apply the patch from FUJITA Tomonori.
> 
> Regards,
> 
> 	Joerg
> 
> From c0baa4d43a674c9cc321af3b23a6d5c9b46d65bd Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <joerg.roedel@amd.com>
> Date: Fri, 5 Jun 2009 12:01:35 +0200
> Subject: [PATCH] dma-debug: change hash_bucket_find from first-fit to best-fit
> 
> Some device drivers map the same physical address multiple times to a
> dma address. Without an IOMMU this results in the same dma address being
> put into the dma-debug hash multiple times. With a first-fit match in
> hash_bucket_find() this function may return the wrong dma_debug_entry.
> This can result in false positive warnings. This patch fixes it by
> changing the first-fit behavior of hash_bucket_find() into a best-fit
> algorithm.

I thought about something like this fix however we can do better; with
this fix, we could overlook a bad hardware IOMMU bug.

I think that the better fix can handle both cases per device:

- multiple identical dma addresses should not happen (with devices
  behind hardware IOMMU)
- multiple identical dma addresses could happen


It's better to use a strict-fit algorithm (as we do now) with the
former. It's pretty difficult to do something better than a best-fit
algorithm with the latter though.



> Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
> Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
> ---
>  lib/dma-debug.c |   43 +++++++++++++++++++++++++++++++++++++++----
>  1 files changed, 39 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
> index 69da09a..2b16536 100644
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -185,15 +185,50 @@ static void put_hash_bucket(struct hash_bucket *bucket,
>  static struct dma_debug_entry *hash_bucket_find(struct hash_bucket *bucket,
>  						struct dma_debug_entry *ref)
>  {
> -	struct dma_debug_entry *entry;
> +	struct dma_debug_entry *entry, *ret = NULL;
> +	int matches = 0, match_lvl, last_lvl = 0;
>  
>  	list_for_each_entry(entry, &bucket->list, list) {
> -		if ((entry->dev_addr == ref->dev_addr) &&
> -		    (entry->dev == ref->dev))
> +		if ((entry->dev_addr != ref->dev_addr) ||
> +		    (entry->dev != ref->dev))
> +			continue;
> +
> +		/*
> +		 * Some drivers map the same physical address multiple
> +		 * times. Without a hardware IOMMU this results in the
> +		 * same device addresses being put into the dma-debug
> +		 * hash multiple times too. This can result in false
> +		 * positives being reported. Therfore we implement a
> +		 * best-fit algorithm here which returns the entry from
> +		 * the hash which fits best to the reference value
> +		 * instead of the first-fit.
> +		 */
> +		matches += 1;
> +		match_lvl = 0;
> +		entry->size      == ref->size      ? ++match_lvl : match_lvl;
> +		entry->type      == ref->type      ? ++match_lvl : match_lvl;
> +		entry->direction == ref->direction ? ++match_lvl : match_lvl;
> +
> +		if (match_lvl == 3) {
> +			/* perfect-fit - return the result */
>  			return entry;
> +		} else if (match_lvl > last_lvl) {
> +			/*
> +			 * We found an entry that fits better then the
> +			 * previous one
> +			 */
> +			last_lvl = match_lvl;
> +			ret      = entry;
> +		}
>  	}
>  
> -	return NULL;
> +	/*
> +	 * If we have multiple matches but no perfect-fit, just return
> +	 * NULL.
> +	 */
> +	ret = (matches == 1) ? ret : NULL;
> +
> +	return ret;
>  }
>  
>  /*
> -- 
> 1.6.3.1
> 
> 
> 
> 
> -- 
>            | Advanced Micro Devices GmbH
>  Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
>  System    | 
>  Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
>  Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
>            | Registergericht München, HRB Nr. 43632
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
@ 2009-06-05 11:38     ` FUJITA Tomonori
  0 siblings, 0 replies; 30+ messages in thread
From: FUJITA Tomonori @ 2009-06-05 11:38 UTC (permalink / raw)
  To: joerg.roedel
  Cc: fujita.tomonori, torvalds, mingo, lethal, just.for.lkml,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi

On Fri, 5 Jun 2009 12:41:33 +0200
Joerg Roedel <joerg.roedel@amd.com> wrote:

> 
> On Fri, Jun 05, 2009 at 05:33:14PM +0900, FUJITA Tomonori wrote:
> > dma-debugs wrongly assumes that no multiple DMA transfers are
> > performed against the same dma address on one device at the same
> > time. However it's true only with hardware IOMMUs. For example, an
> > application can easily send the same buffer twice with different
> > lengths to one device by using DIO and AIO. If these requests are not
> > unmapped in the same order in which they were mapped,
> > hash_bucket_find() finds a wrong entry and gives a false warning.
> > 
> > We should fix this before 2.6.30 release. Seems that there is no
> > easy way to fix it. I think that it's better to just disable
> > dma-debug for now.
> > 
> > Torsten Kaiser found this bug with the RAID1 configuration:
> > 
> > http://marc.info/?t=124336541900008&r=1&w=2
> > 
> 
> Argh, I never thought that this can happen. But its not explicitly
> forbidden so we have to handle this situation. Thanks for tracking down
> the bug to this issue.
> However, I think there is a somehow simple fix for the issue. Patch is
> attached. Its the least intrusive way I can think of to fix this
> problem.
> But its up to Linus/Ingo to decide if it can be accepted at this very
> late point in the cycle. Since dma-debug is new with 2.6.30 it will at
> least not introduce any regression. The patch is boot and basically
> load-tested with some disk and network load and showed no issues on my
> test machine. The diffstat looks big but more than the half of the patch
> adds comments to explain what it does. So the diffstat looks bigger than
> the actual change.
> Linus, if you think my patch is not acceptable at this point then please
> apply the patch from FUJITA Tomonori.
> 
> Regards,
> 
> 	Joerg
> 
> From c0baa4d43a674c9cc321af3b23a6d5c9b46d65bd Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <joerg.roedel@amd.com>
> Date: Fri, 5 Jun 2009 12:01:35 +0200
> Subject: [PATCH] dma-debug: change hash_bucket_find from first-fit to best-fit
> 
> Some device drivers map the same physical address multiple times to a
> dma address. Without an IOMMU this results in the same dma address being
> put into the dma-debug hash multiple times. With a first-fit match in
> hash_bucket_find() this function may return the wrong dma_debug_entry.
> This can result in false positive warnings. This patch fixes it by
> changing the first-fit behavior of hash_bucket_find() into a best-fit
> algorithm.

I thought about something like this fix however we can do better; with
this fix, we could overlook a bad hardware IOMMU bug.

I think that the better fix can handle both cases per device:

- multiple identical dma addresses should not happen (with devices
  behind hardware IOMMU)
- multiple identical dma addresses could happen


It's better to use a strict-fit algorithm (as we do now) with the
former. It's pretty difficult to do something better than a best-fit
algorithm with the latter though.



> Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
> Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
> ---
>  lib/dma-debug.c |   43 +++++++++++++++++++++++++++++++++++++++----
>  1 files changed, 39 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
> index 69da09a..2b16536 100644
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -185,15 +185,50 @@ static void put_hash_bucket(struct hash_bucket *bucket,
>  static struct dma_debug_entry *hash_bucket_find(struct hash_bucket *bucket,
>  						struct dma_debug_entry *ref)
>  {
> -	struct dma_debug_entry *entry;
> +	struct dma_debug_entry *entry, *ret = NULL;
> +	int matches = 0, match_lvl, last_lvl = 0;
>  
>  	list_for_each_entry(entry, &bucket->list, list) {
> -		if ((entry->dev_addr == ref->dev_addr) &&
> -		    (entry->dev == ref->dev))
> +		if ((entry->dev_addr != ref->dev_addr) ||
> +		    (entry->dev != ref->dev))
> +			continue;
> +
> +		/*
> +		 * Some drivers map the same physical address multiple
> +		 * times. Without a hardware IOMMU this results in the
> +		 * same device addresses being put into the dma-debug
> +		 * hash multiple times too. This can result in false
> +		 * positives being reported. Therfore we implement a
> +		 * best-fit algorithm here which returns the entry from
> +		 * the hash which fits best to the reference value
> +		 * instead of the first-fit.
> +		 */
> +		matches += 1;
> +		match_lvl = 0;
> +		entry->size      == ref->size      ? ++match_lvl : match_lvl;
> +		entry->type      == ref->type      ? ++match_lvl : match_lvl;
> +		entry->direction == ref->direction ? ++match_lvl : match_lvl;
> +
> +		if (match_lvl == 3) {
> +			/* perfect-fit - return the result */
>  			return entry;
> +		} else if (match_lvl > last_lvl) {
> +			/*
> +			 * We found an entry that fits better then the
> +			 * previous one
> +			 */
> +			last_lvl = match_lvl;
> +			ret      = entry;
> +		}
>  	}
>  
> -	return NULL;
> +	/*
> +	 * If we have multiple matches but no perfect-fit, just return
> +	 * NULL.
> +	 */
> +	ret = (matches == 1) ? ret : NULL;
> +
> +	return ret;
>  }
>  
>  /*
> -- 
> 1.6.3.1
> 
> 
> 
> 
> -- 
>            | Advanced Micro Devices GmbH
>  Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
>  System    | 
>  Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
>  Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
>            | Registergericht München, HRB Nr. 43632
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-05 11:38     ` FUJITA Tomonori
@ 2009-06-05 12:44       ` Joerg Roedel
  -1 siblings, 0 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-05 12:44 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: torvalds, mingo, lethal, just.for.lkml, hancockrwd, jens.axboe,
	bharrosh, linux-kernel, linux-scsi

On Fri, Jun 05, 2009 at 08:38:22PM +0900, FUJITA Tomonori wrote:
> On Fri, 5 Jun 2009 12:41:33 +0200
> Joerg Roedel <joerg.roedel@amd.com> wrote:

> I thought about something like this fix however we can do better; with
> this fix, we could overlook a bad hardware IOMMU bug.

I don't buy this yet. Can you explain what kind of hardware IOMMU bug we
can miss here? For the match in my patch it is still strictly necessary
that the dma address matches.

> I think that the better fix can handle both cases per device:
> 
> - multiple identical dma addresses should not happen (with devices
>   behind hardware IOMMU)
> - multiple identical dma addresses could happen
> 
> 
> It's better to use a strict-fit algorithm (as we do now) with the
> former. It's pretty difficult to do something better than a best-fit
> algorithm with the latter though.

When we understand the same under 'strict-fit' this will not work I
think. The idea behind dma-debug is to find the cases where the
dma_unmap parameters are not equal to the dma_map parameters. If we use
strict-fit we risk to oversee some sets of these cases because there is
no dma_debug_entry which strictly matches the reference value.

Regards,

Joerg


-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
@ 2009-06-05 12:44       ` Joerg Roedel
  0 siblings, 0 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-05 12:44 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: torvalds, mingo, lethal, just.for.lkml, hancockrwd, jens.axboe,
	bharrosh, linux-kernel, linux-scsi

On Fri, Jun 05, 2009 at 08:38:22PM +0900, FUJITA Tomonori wrote:
> On Fri, 5 Jun 2009 12:41:33 +0200
> Joerg Roedel <joerg.roedel@amd.com> wrote:

> I thought about something like this fix however we can do better; with
> this fix, we could overlook a bad hardware IOMMU bug.

I don't buy this yet. Can you explain what kind of hardware IOMMU bug we
can miss here? For the match in my patch it is still strictly necessary
that the dma address matches.

> I think that the better fix can handle both cases per device:
> 
> - multiple identical dma addresses should not happen (with devices
>   behind hardware IOMMU)
> - multiple identical dma addresses could happen
> 
> 
> It's better to use a strict-fit algorithm (as we do now) with the
> former. It's pretty difficult to do something better than a best-fit
> algorithm with the latter though.

When we understand the same under 'strict-fit' this will not work I
think. The idea behind dma-debug is to find the cases where the
dma_unmap parameters are not equal to the dma_map parameters. If we use
strict-fit we risk to oversee some sets of these cases because there is
no dma_debug_entry which strictly matches the reference value.

Regards,

Joerg


-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-05 11:38     ` FUJITA Tomonori
  (?)
  (?)
@ 2009-06-05 14:57     ` Arnd Bergmann
  -1 siblings, 0 replies; 30+ messages in thread
From: Arnd Bergmann @ 2009-06-05 14:57 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: joerg.roedel, torvalds, mingo, lethal, just.for.lkml, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

On Friday 05 June 2009, FUJITA Tomonori wrote:
> I think that the better fix can handle both cases per device:
> 
> - multiple identical dma addresses should not happen (with devices
>   behind hardware IOMMU)
> - multiple identical dma addresses could happen

I guess you could also have the case where for a given range
of addresses, you use a linear mapping and the dma addresses
can be identical, while for other physical addresses you would
rely on address translation.

For example on PowerPC/Cell with infiniband adapters, you can get
linear mapping behavior for DMA_ATTR_WEAK_ORDERING but IOMMU
translation without that flag, for the same device and same
physical address.

	Arnd <><

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-05 10:41   ` Joerg Roedel
@ 2009-06-05 15:52     ` Torsten Kaiser
  -1 siblings, 0 replies; 30+ messages in thread
From: Torsten Kaiser @ 2009-06-05 15:52 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: FUJITA Tomonori, Linus Torvalds, mingo, lethal, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

On Fri, Jun 5, 2009 at 12:41 PM, Joerg Roedel <joerg.roedel@amd.com> wrote:
> From c0baa4d43a674c9cc321af3b23a6d5c9b46d65bd Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <joerg.roedel@amd.com>
> Date: Fri, 5 Jun 2009 12:01:35 +0200
> Subject: [PATCH] dma-debug: change hash_bucket_find from first-fit to best-fit
>
> Some device drivers map the same physical address multiple times to a
> dma address. Without an IOMMU this results in the same dma address being
> put into the dma-debug hash multiple times. With a first-fit match in
> hash_bucket_find() this function may return the wrong dma_debug_entry.
> This can result in false positive warnings. This patch fixes it by
> changing the first-fit behavior of hash_bucket_find() into a best-fit
> algorithm.
>
> Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
> Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
> ---
>  lib/dma-debug.c |   43 +++++++++++++++++++++++++++++++++++++++----
>  1 files changed, 39 insertions(+), 4 deletions(-)
>
> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
> index 69da09a..2b16536 100644
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -185,15 +185,50 @@ static void put_hash_bucket(struct hash_bucket *bucket,
>  static struct dma_debug_entry *hash_bucket_find(struct hash_bucket *bucket,
>                                                struct dma_debug_entry *ref)
>  {
> -       struct dma_debug_entry *entry;
> +       struct dma_debug_entry *entry, *ret = NULL;
> +       int matches = 0, match_lvl, last_lvl = 0;
>
>        list_for_each_entry(entry, &bucket->list, list) {
> -               if ((entry->dev_addr == ref->dev_addr) &&
> -                   (entry->dev == ref->dev))
> +               if ((entry->dev_addr != ref->dev_addr) ||
> +                   (entry->dev != ref->dev))
> +                       continue;
> +
> +               /*
> +                * Some drivers map the same physical address multiple
> +                * times. Without a hardware IOMMU this results in the
> +                * same device addresses being put into the dma-debug
> +                * hash multiple times too. This can result in false
> +                * positives being reported. Therfore we implement a
> +                * best-fit algorithm here which returns the entry from
> +                * the hash which fits best to the reference value
> +                * instead of the first-fit.
> +                */
> +               matches += 1;
> +               match_lvl = 0;
> +               entry->size      == ref->size      ? ++match_lvl : match_lvl;
> +               entry->type      == ref->type      ? ++match_lvl : match_lvl;
> +               entry->direction == ref->direction ? ++match_lvl : match_lvl;
> +
> +               if (match_lvl == 3) {
> +                       /* perfect-fit - return the result */
>                        return entry;
> +               } else if (match_lvl > last_lvl) {
> +                       /*
> +                        * We found an entry that fits better then the
> +                        * previous one
> +                        */
> +                       last_lvl = match_lvl;
> +                       ret      = entry;
> +               }
>        }
>
> -       return NULL;
> +       /*
> +        * If we have multiple matches but no perfect-fit, just return
> +        * NULL.
> +        */
> +       ret = (matches == 1) ? ret : NULL;

This doesn't look right to me.
The comment above says "returns the entry from the hash which fits
best", but this will always return NULL, if there are are multiple
entrys, but no perfect match.

On a wrong unmap that would result in "DMA-API: device driver tries to
free DMA memory it has not allocated" instead of a report what kind of
mismatches happend.

Or did you mean to return NULL if there are more then one matches at
the last_lvl? Then a matches=1; is missing from the "found an entry"
block.

Should there be a warning if more then one possible match were found?

* driver maps address 'a' with size 1
* driver maps same address 'a' with size 2
* driver wrongly unmaps the second allocation with size 1
  -> no warning, because the first allocation is returned
* driver wants to correctly unmap the first allocation
  -> wrong warning about this unmap because size mismatch

Also what about sg_call_ents and sg_mapped_ents?
Could it be possible to get the same address/size with different sg-counts.

As in my case most of the warnings where about a wrong unmap count and
only a few about the memory size, that seems possible.

For reference, here part of the warnings from my system:

Wrong count:
Jun  5 03:32:58 treogen [   51.469720] sata_sil24 0000:04:00.0:
DMA-API: device driver frees DMA sg list with different entry count
[map count=1] [unmap count=2]
Jun  5 03:32:58 treogen [   51.469725] Modules linked in: msp3400
tuner tea5767 tda8290 tuner_xc2028 xc
5000 tda9887 tuner_simple tuner_types mt20xx tea5761 bttv ir_common
v4l2_common videodev v4l1_compat v4
l2_compat_ioctl32 videobuf_dma_sg videobuf_core btcx_risc tveeprom sg pata_amd
Jun  5 03:32:58 treogen [   51.469771] Pid: 0, comm: swapper Not
tainted 2.6.30-rc7 #1
Jun  5 03:32:58 treogen [   51.469775] Call Trace:
Jun  5 03:32:58 treogen [   51.469779]  <IRQ>  [<ffffffff8041755e>] ?
check_unmap+0x65e/0x6a0
Jun  5 03:32:58 treogen [   51.469797]  [<ffffffff802432d8>]
warn_slowpath_common+0x78/0xd0
Jun  5 03:32:58 treogen [   51.469804]  [<ffffffff802433b4>]
warn_slowpath_fmt+0x64/0x70
Jun  5 03:32:58 treogen [   51.469816]  [<ffffffff8028dd42>] ?
mempool_free_slab+0x12/0x20
Jun  5 03:32:58 treogen [   51.469828]  [<ffffffff8068d74d>] ?
_spin_lock_irqsave+0x1d/0x40
Jun  5 03:32:58 treogen [   51.469835]  [<ffffffff8041755e>]
check_unmap+0x65e/0x6a0
Jun  5 03:32:58 treogen [   51.469842]  [<ffffffff804176ae>]
debug_dma_unmap_sg+0x10e/0x1b0

Wrong size:
Jun  3 06:50:33 treogen [  276.421432] sata_sil24 0000:04:00.0:
DMA-API: device driver frees DMA memory with different size [device
address=0x000000007b590000] [map size=16384 bytes] [unmap size=12288
bytes]
                                     Jun  3 06:50:33 treogen [
276.421438] Modules linked in: msp3400 tuner tea5767 tda8290
tuner_xc2028 xc5000 tda9887 tuner_simple tuner_types mt20xx tea5761
bttv ir_common v4l2_common videodev v4l1_compat v4l2_compat_ioctl32
videobuf_dma_sg videobuf_core btcx_risc sg pata_amd tveeprom
Jun  3 06:50:33 treogen [  276.421480] Pid: 1301, comm: kcryptd Not
tainted 2.6.30-rc7 #1
Jun  3 06:50:33 treogen [  276.421485] Call Trace:
Jun  3 06:50:33 treogen [  276.421490]  <IRQ>  [<ffffffff80417355>] ?
check_unmap+0x455/0x6a0
Jun  3 06:50:33 treogen [  276.421507]  [<ffffffff802432d8>]
warn_slowpath_common+0x78/0xd0
Jun  3 06:50:33 treogen [  276.421514]  [<ffffffff802433b4>]
warn_slowpath_fmt+0x64/0x70
Jun  3 06:50:33 treogen [  276.421524]  [<ffffffff8028dd42>] ?
mempool_free_slab+0x12/0x20
Jun  3 06:50:33 treogen [  276.421535]  [<ffffffff8068d74d>] ?
_spin_lock_irqsave+0x1d/0x40
Jun  3 06:50:33 treogen [  276.421542]  [<ffffffff80417355>]
check_unmap+0x455/0x6a0
Jun  3 06:50:33 treogen [  276.421549]  [<ffffffff804176ae>]
debug_dma_unmap_sg+0x10e/0x1b0

> +
> +       return ret;
>  }
>
>  /*
> --
> 1.6.3.1
>
>
>
>
> --
>           | Advanced Micro Devices GmbH
>  Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
>  System    |
>  Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
>  Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
>           | Registergericht München, HRB Nr. 43632
>
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
@ 2009-06-05 15:52     ` Torsten Kaiser
  0 siblings, 0 replies; 30+ messages in thread
From: Torsten Kaiser @ 2009-06-05 15:52 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: FUJITA Tomonori, Linus Torvalds, mingo, lethal, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

On Fri, Jun 5, 2009 at 12:41 PM, Joerg Roedel <joerg.roedel@amd.com> wrote:
> From c0baa4d43a674c9cc321af3b23a6d5c9b46d65bd Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <joerg.roedel@amd.com>
> Date: Fri, 5 Jun 2009 12:01:35 +0200
> Subject: [PATCH] dma-debug: change hash_bucket_find from first-fit to best-fit
>
> Some device drivers map the same physical address multiple times to a
> dma address. Without an IOMMU this results in the same dma address being
> put into the dma-debug hash multiple times. With a first-fit match in
> hash_bucket_find() this function may return the wrong dma_debug_entry.
> This can result in false positive warnings. This patch fixes it by
> changing the first-fit behavior of hash_bucket_find() into a best-fit
> algorithm.
>
> Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
> Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
> ---
>  lib/dma-debug.c |   43 +++++++++++++++++++++++++++++++++++++++----
>  1 files changed, 39 insertions(+), 4 deletions(-)
>
> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
> index 69da09a..2b16536 100644
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -185,15 +185,50 @@ static void put_hash_bucket(struct hash_bucket *bucket,
>  static struct dma_debug_entry *hash_bucket_find(struct hash_bucket *bucket,
>                                                struct dma_debug_entry *ref)
>  {
> -       struct dma_debug_entry *entry;
> +       struct dma_debug_entry *entry, *ret = NULL;
> +       int matches = 0, match_lvl, last_lvl = 0;
>
>        list_for_each_entry(entry, &bucket->list, list) {
> -               if ((entry->dev_addr == ref->dev_addr) &&
> -                   (entry->dev == ref->dev))
> +               if ((entry->dev_addr != ref->dev_addr) ||
> +                   (entry->dev != ref->dev))
> +                       continue;
> +
> +               /*
> +                * Some drivers map the same physical address multiple
> +                * times. Without a hardware IOMMU this results in the
> +                * same device addresses being put into the dma-debug
> +                * hash multiple times too. This can result in false
> +                * positives being reported. Therfore we implement a
> +                * best-fit algorithm here which returns the entry from
> +                * the hash which fits best to the reference value
> +                * instead of the first-fit.
> +                */
> +               matches += 1;
> +               match_lvl = 0;
> +               entry->size      == ref->size      ? ++match_lvl : match_lvl;
> +               entry->type      == ref->type      ? ++match_lvl : match_lvl;
> +               entry->direction == ref->direction ? ++match_lvl : match_lvl;
> +
> +               if (match_lvl == 3) {
> +                       /* perfect-fit - return the result */
>                        return entry;
> +               } else if (match_lvl > last_lvl) {
> +                       /*
> +                        * We found an entry that fits better then the
> +                        * previous one
> +                        */
> +                       last_lvl = match_lvl;
> +                       ret      = entry;
> +               }
>        }
>
> -       return NULL;
> +       /*
> +        * If we have multiple matches but no perfect-fit, just return
> +        * NULL.
> +        */
> +       ret = (matches == 1) ? ret : NULL;

This doesn't look right to me.
The comment above says "returns the entry from the hash which fits
best", but this will always return NULL, if there are are multiple
entrys, but no perfect match.

On a wrong unmap that would result in "DMA-API: device driver tries to
free DMA memory it has not allocated" instead of a report what kind of
mismatches happend.

Or did you mean to return NULL if there are more then one matches at
the last_lvl? Then a matches=1; is missing from the "found an entry"
block.

Should there be a warning if more then one possible match were found?

* driver maps address 'a' with size 1
* driver maps same address 'a' with size 2
* driver wrongly unmaps the second allocation with size 1
  -> no warning, because the first allocation is returned
* driver wants to correctly unmap the first allocation
  -> wrong warning about this unmap because size mismatch

Also what about sg_call_ents and sg_mapped_ents?
Could it be possible to get the same address/size with different sg-counts.

As in my case most of the warnings where about a wrong unmap count and
only a few about the memory size, that seems possible.

For reference, here part of the warnings from my system:

Wrong count:
Jun  5 03:32:58 treogen [   51.469720] sata_sil24 0000:04:00.0:
DMA-API: device driver frees DMA sg list with different entry count
[map count=1] [unmap count=2]
Jun  5 03:32:58 treogen [   51.469725] Modules linked in: msp3400
tuner tea5767 tda8290 tuner_xc2028 xc
5000 tda9887 tuner_simple tuner_types mt20xx tea5761 bttv ir_common
v4l2_common videodev v4l1_compat v4
l2_compat_ioctl32 videobuf_dma_sg videobuf_core btcx_risc tveeprom sg pata_amd
Jun  5 03:32:58 treogen [   51.469771] Pid: 0, comm: swapper Not
tainted 2.6.30-rc7 #1
Jun  5 03:32:58 treogen [   51.469775] Call Trace:
Jun  5 03:32:58 treogen [   51.469779]  <IRQ>  [<ffffffff8041755e>] ?
check_unmap+0x65e/0x6a0
Jun  5 03:32:58 treogen [   51.469797]  [<ffffffff802432d8>]
warn_slowpath_common+0x78/0xd0
Jun  5 03:32:58 treogen [   51.469804]  [<ffffffff802433b4>]
warn_slowpath_fmt+0x64/0x70
Jun  5 03:32:58 treogen [   51.469816]  [<ffffffff8028dd42>] ?
mempool_free_slab+0x12/0x20
Jun  5 03:32:58 treogen [   51.469828]  [<ffffffff8068d74d>] ?
_spin_lock_irqsave+0x1d/0x40
Jun  5 03:32:58 treogen [   51.469835]  [<ffffffff8041755e>]
check_unmap+0x65e/0x6a0
Jun  5 03:32:58 treogen [   51.469842]  [<ffffffff804176ae>]
debug_dma_unmap_sg+0x10e/0x1b0

Wrong size:
Jun  3 06:50:33 treogen [  276.421432] sata_sil24 0000:04:00.0:
DMA-API: device driver frees DMA memory with different size [device
address=0x000000007b590000] [map size=16384 bytes] [unmap size=12288
bytes]
                                     Jun  3 06:50:33 treogen [
276.421438] Modules linked in: msp3400 tuner tea5767 tda8290
tuner_xc2028 xc5000 tda9887 tuner_simple tuner_types mt20xx tea5761
bttv ir_common v4l2_common videodev v4l1_compat v4l2_compat_ioctl32
videobuf_dma_sg videobuf_core btcx_risc sg pata_amd tveeprom
Jun  3 06:50:33 treogen [  276.421480] Pid: 1301, comm: kcryptd Not
tainted 2.6.30-rc7 #1
Jun  3 06:50:33 treogen [  276.421485] Call Trace:
Jun  3 06:50:33 treogen [  276.421490]  <IRQ>  [<ffffffff80417355>] ?
check_unmap+0x455/0x6a0
Jun  3 06:50:33 treogen [  276.421507]  [<ffffffff802432d8>]
warn_slowpath_common+0x78/0xd0
Jun  3 06:50:33 treogen [  276.421514]  [<ffffffff802433b4>]
warn_slowpath_fmt+0x64/0x70
Jun  3 06:50:33 treogen [  276.421524]  [<ffffffff8028dd42>] ?
mempool_free_slab+0x12/0x20
Jun  3 06:50:33 treogen [  276.421535]  [<ffffffff8068d74d>] ?
_spin_lock_irqsave+0x1d/0x40
Jun  3 06:50:33 treogen [  276.421542]  [<ffffffff80417355>]
check_unmap+0x455/0x6a0
Jun  3 06:50:33 treogen [  276.421549]  [<ffffffff804176ae>]
debug_dma_unmap_sg+0x10e/0x1b0

> +
> +       return ret;
>  }
>
>  /*
> --
> 1.6.3.1
>
>
>
>
> --
>           | Advanced Micro Devices GmbH
>  Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
>  System    |
>  Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
>  Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
>           | Registergericht München, HRB Nr. 43632
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-05 15:52     ` Torsten Kaiser
  (?)
@ 2009-06-05 18:20     ` Joerg Roedel
  2009-06-05 20:25       ` Torsten Kaiser
  2009-06-05 22:11       ` Torsten Kaiser
  -1 siblings, 2 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-05 18:20 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: Joerg Roedel, FUJITA Tomonori, Linus Torvalds, mingo, lethal,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi

On Fri, Jun 05, 2009 at 05:52:27PM +0200, Torsten Kaiser wrote:
> 
> This doesn't look right to me.
> The comment above says "returns the entry from the hash which fits
> best", but this will always return NULL, if there are are multiple
> entrys, but no perfect match.

This is intentional. If there is no perfect-fit there is no way to tell
which entry was meant. So we potentially report wrong errors with a
wrong mapping backtrace which confuses even more than the wrong
"DMA-API: device driver tries to free DMA memory it has not allocated".

> Should there be a warning if more then one possible match were found?

True. That would be better. But I tried to keep the code change as small
as possible without disabling the feature completly.

> * driver maps address 'a' with size 1
> * driver maps same address 'a' with size 2
> * driver wrongly unmaps the second allocation with size 1
>   -> no warning, because the first allocation is returned

Hmm, I am not sure if we can handle this situation correctly in the
dma-debug code. There is no unique key to identify a mapping request
which allows to assign an unmap request to it. Currently dma-debug uses
device and dma-address. But that seems not to be sufficient. The
best-fit algorithm is also a but fuzzy of course.

> * driver wants to correctly unmap the first allocation
>   -> wrong warning about this unmap because size mismatch

Ok, at least we get a warning about a bug. Not very useful because it
reports the wrong bug. Is this the situation which triggered the
original bug report?

> Also what about sg_call_ents and sg_mapped_ents?
> Could it be possible to get the same address/size with different sg-counts.

It looks not forbidden in the API. So I guess this can happen too.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-05 18:20     ` Joerg Roedel
@ 2009-06-05 20:25       ` Torsten Kaiser
  2009-06-05 22:11       ` Torsten Kaiser
  1 sibling, 0 replies; 30+ messages in thread
From: Torsten Kaiser @ 2009-06-05 20:25 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Joerg Roedel, FUJITA Tomonori, Linus Torvalds, mingo, lethal,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi

On Fri, Jun 5, 2009 at 8:20 PM, Joerg Roedel <joro@8bytes.org> wrote:
> On Fri, Jun 05, 2009 at 05:52:27PM +0200, Torsten Kaiser wrote:
>>
>> This doesn't look right to me.
>> The comment above says "returns the entry from the hash which fits
>> best", but this will always return NULL, if there are are multiple
>> entrys, but no perfect match.
>
> This is intentional. If there is no perfect-fit there is no way to tell
> which entry was meant. So we potentially report wrong errors with a
> wrong mapping backtrace which confuses even more than the wrong
> "DMA-API: device driver tries to free DMA memory it has not allocated".

Ah, OK.
I thought that reporting a (maybe) wrong difference in map/unmap would
be better that a definitely wrong message "freed not allocated
memory". And if the fuzzy matcher would warn about the fuzzy match, a
developer would know that this difference might be wrong.

>> Should there be a warning if more then one possible match were found?
>
> True. That would be better. But I tried to keep the code change as small
> as possible without disabling the feature completly.

OK, it just looked like the code did something else the comment said.

>> * driver maps address 'a' with size 1
>> * driver maps same address 'a' with size 2
>> * driver wrongly unmaps the second allocation with size 1
>>   -> no warning, because the first allocation is returned
>
> Hmm, I am not sure if we can handle this situation correctly in the
> dma-debug code. There is no unique key to identify a mapping request
> which allows to assign an unmap request to it. Currently dma-debug uses
> device and dma-address. But that seems not to be sufficient. The
> best-fit algorithm is also a but fuzzy of course.

Yes, I just read pci-gart_64.c to see if there is a better key.
The API always seems to include size and direction too.

Would it work to to use all device, address, size and direction as
key, as the proposes exact matcher does?
This would obvious not work with the current diagnostics in
check_unmap, as not even single near matches would be returned.
But if you would move the diagnostics from check_unmap into
hash_bucket_find it could work:
If hash_bucket_find does not find a perfect match, it could output:
* no match -> tries to free DMA memory it has not allocated
* 1 match -> warn about any differences (size, type, cpu-address, sg
list count, direction)
* 2+ matches, with perfect match -> no warning; set flag in other matches.
* 2+ matches, without perfect match -> list differences from all
matches; set flag in other matches after returning the best

If differences from a flagged match are output, add a note, that this
warning is unreliable because of these possible map/unmap mismatches.

>> * driver wants to correctly unmap the first allocation
>>   -> wrong warning about this unmap because size mismatch
>
> Ok, at least we get a warning about a bug. Not very useful because it
> reports the wrong bug. Is this the situation which triggered the
> original bug report?

No, the original report was with a correctly working driver that just
confused the dma-debugging code into issuing wrong warnings:
http://marc.info/?l=linux-kernel&m=124336530609750&w=2

This constructed case was just me trying to find more evil cases that
are currently not covered.

Your patch would blame the correctly programmed second unmap for
"freeing unallocated memory",
my proposal would blame it for the size difference, but with a note
that there was a concurrent second map at the same address.

And in a case where the code maps the same address twice but wants to
unmap it wrong, it would correctly output both possible, but
mismatching maps, instead of the wrong "unallocated memory" warning.

>> Also what about sg_call_ents and sg_mapped_ents?
>> Could it be possible to get the same address/size with different sg-counts.
>
> It looks not forbidden in the API. So I guess this can happen too.

Should it then be added to the fuzzy matcher?
Otherwise I would except the warnings like "DMA-API: device driver
frees DMA sg list with different entry count [map count=2] [unmap
count=1]" to still trigger. (I didn't have the time to test your patch
yet)

Torsten

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-05 18:20     ` Joerg Roedel
  2009-06-05 20:25       ` Torsten Kaiser
@ 2009-06-05 22:11       ` Torsten Kaiser
  1 sibling, 0 replies; 30+ messages in thread
From: Torsten Kaiser @ 2009-06-05 22:11 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Joerg Roedel, FUJITA Tomonori, Linus Torvalds, mingo, lethal,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi

On Fri, Jun 5, 2009 at 8:20 PM, Joerg Roedel <joro@8bytes.org> wrote:
> On Fri, Jun 05, 2009 at 05:52:27PM +0200, Torsten Kaiser wrote:
>> Should there be a warning if more then one possible match were found?
>
> True. That would be better. But I tried to keep the code change as small
> as possible without disabling the feature completly.
>
>> * driver maps address 'a' with size 1
>> * driver maps same address 'a' with size 2
>> * driver wrongly unmaps the second allocation with size 1
>>   -> no warning, because the first allocation is returned
>
> Hmm, I am not sure if we can handle this situation correctly in the
> dma-debug code. There is no unique key to identify a mapping request
> which allows to assign an unmap request to it. Currently dma-debug uses
> device and dma-address. But that seems not to be sufficient. The
> best-fit algorithm is also a but fuzzy of course.

Maybe we just shouldn't try to handle it at all:

static void add_dma_entry(struct dma_debug_entry *entry)
{
        struct hash_bucket *bucket;
        unsigned long flags;

        bucket = get_hash_bucket(entry, &flags);
        if(hash_bucket_find(bucket, entry)) {
                printk(KERN_ERR "DMA-API: device mapped same address twice, "
                  "this use case cannot be handled currently -
disabling debugging\n");
                global_disable = true;
        }
        hash_bucket_add(bucket, entry);
        put_hash_bucket(bucket, &flags);
}

This would allow this feature to remain for most cased, but would also
prevent all false warnings.

Torsten

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-05 10:41   ` Joerg Roedel
                     ` (2 preceding siblings ...)
  (?)
@ 2009-06-07  8:13   ` Ingo Molnar
  2009-06-07  8:22     ` Torsten Kaiser
  2009-06-07 10:45     ` FUJITA Tomonori
  -1 siblings, 2 replies; 30+ messages in thread
From: Ingo Molnar @ 2009-06-07  8:13 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: FUJITA Tomonori, Linus Torvalds, lethal, just.for.lkml,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi


* Joerg Roedel <joerg.roedel@amd.com> wrote:

> 
> On Fri, Jun 05, 2009 at 05:33:14PM +0900, FUJITA Tomonori wrote:
> > dma-debugs wrongly assumes that no multiple DMA transfers are
> > performed against the same dma address on one device at the same
> > time. However it's true only with hardware IOMMUs. For example, an
> > application can easily send the same buffer twice with different
> > lengths to one device by using DIO and AIO. If these requests are not
> > unmapped in the same order in which they were mapped,
> > hash_bucket_find() finds a wrong entry and gives a false warning.
> > 
> > We should fix this before 2.6.30 release. Seems that there is no
> > easy way to fix it. I think that it's better to just disable
> > dma-debug for now.
> > 
> > Torsten Kaiser found this bug with the RAID1 configuration:
> > 
> > http://marc.info/?t=124336541900008&r=1&w=2
> > 
> 
> Argh, I never thought that this can happen. But its not explicitly 
> forbidden so we have to handle this situation. Thanks for tracking 
> down the bug to this issue.
>
> However, I think there is a somehow simple fix for the issue. 
> Patch is attached. Its the least intrusive way I can think of to 
> fix this problem.
>
> But its up to Linus/Ingo to decide if it can be accepted at this 
> very late point in the cycle. Since dma-debug is new with 2.6.30 
> it will at least not introduce any regression. [...]

I think it's too late for v2.6.30 to do any of the changes - and the 
DMA debug facility is off by default.

Also, i think such DMA patterns, while 'allowed' can be quite 
dangerous as its such a rare usage combination really. AIO and DIO 
are crazy to begin with, mixing AIO and DIO for the same buffer is 
madness square two. (It can result in 3 agents for the same memory 
address: CPU, dma1 and dma2. How many interesting chipset erratums 
could there be related to such scenarios?)

But it is certainly not the task of a debug facility to restrict 
existing user-visible ABIs, so fixing the false positive is correct.

So i've applied your fix to the iommu branch for v2.6.31 and marked 
it for -stable backporting, that way 2.6.30.1 will be able to pick 
the patch up (if it remains problem-free in testing).

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-07  8:13   ` Ingo Molnar
@ 2009-06-07  8:22     ` Torsten Kaiser
  2009-06-07 10:45     ` FUJITA Tomonori
  1 sibling, 0 replies; 30+ messages in thread
From: Torsten Kaiser @ 2009-06-07  8:22 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Joerg Roedel, FUJITA Tomonori, Linus Torvalds, lethal,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi

On Sun, Jun 7, 2009 at 10:13 AM, Ingo Molnar <mingo@elte.hu> wrote:
> * Joerg Roedel <joerg.roedel@amd.com> wrote:
>> But its up to Linus/Ingo to decide if it can be accepted at this
>> very late point in the cycle. Since dma-debug is new with 2.6.30
>> it will at least not introduce any regression. [...]
>
> I think it's too late for v2.6.30 to do any of the changes - and the
> DMA debug facility is off by default.
>
> Also, i think such DMA patterns, while 'allowed' can be quite
> dangerous as its such a rare usage combination really. AIO and DIO
> are crazy to begin with, mixing AIO and DIO for the same buffer is
> madness square two. (It can result in 3 agents for the same memory
> address: CPU, dma1 and dma2. How many interesting chipset erratums
> could there be related to such scenarios?)

I think in my case the cause is somewhat simpler: RAID1
My root filesystem is on a RAID1 that consists of two disks, both
connected to the sata_sil24 controller.
So it is only natural for the md driver to issue two dma read requests
for the same address: one for each drive.

Torsten

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [tip:core/iommu] dma-debug: change hash_bucket_find from first-fit to best-fit
  2009-06-05 10:41   ` Joerg Roedel
                     ` (3 preceding siblings ...)
  (?)
@ 2009-06-07 10:22   ` tip-bot for Joerg Roedel
  -1 siblings, 0 replies; 30+ messages in thread
From: tip-bot for Joerg Roedel @ 2009-06-07 10:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, just.for.lkml, torvalds, joerg.roedel,
	fujita.tomonori, stable, tglx, mingo

Commit-ID:  7caf6a49bb17d0377210693af5737563b31aa5ee
Gitweb:     http://git.kernel.org/tip/7caf6a49bb17d0377210693af5737563b31aa5ee
Author:     Joerg Roedel <joerg.roedel@amd.com>
AuthorDate: Fri, 5 Jun 2009 12:01:35 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 7 Jun 2009 10:04:53 +0200

dma-debug: change hash_bucket_find from first-fit to best-fit

Some device drivers map the same physical address multiple times to a
dma address. Without an IOMMU this results in the same dma address being
put into the dma-debug hash multiple times. With a first-fit match in
hash_bucket_find() this function may return the wrong dma_debug_entry.

This can result in false positive warnings. This patch fixes it by
changing the first-fit behavior of hash_bucket_find() into a best-fit
algorithm.

Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: lethal@linux-sh.org
Cc: just.for.lkml@googlemail.com
Cc: hancockrwd@gmail.com
Cc: jens.axboe@oracle.com
Cc: bharrosh@panasas.com
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
LKML-Reference: <20090605104132.GE24836@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 lib/dma-debug.c |   43 +++++++++++++++++++++++++++++++++++++++----
 1 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index cdd205d..8fcc09c 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -186,15 +186,50 @@ static void put_hash_bucket(struct hash_bucket *bucket,
 static struct dma_debug_entry *hash_bucket_find(struct hash_bucket *bucket,
 						struct dma_debug_entry *ref)
 {
-	struct dma_debug_entry *entry;
+	struct dma_debug_entry *entry, *ret = NULL;
+	int matches = 0, match_lvl, last_lvl = 0;
 
 	list_for_each_entry(entry, &bucket->list, list) {
-		if ((entry->dev_addr == ref->dev_addr) &&
-		    (entry->dev == ref->dev))
+		if ((entry->dev_addr != ref->dev_addr) ||
+		    (entry->dev != ref->dev))
+			continue;
+
+		/*
+		 * Some drivers map the same physical address multiple
+		 * times. Without a hardware IOMMU this results in the
+		 * same device addresses being put into the dma-debug
+		 * hash multiple times too. This can result in false
+		 * positives being reported. Therfore we implement a
+		 * best-fit algorithm here which returns the entry from
+		 * the hash which fits best to the reference value
+		 * instead of the first-fit.
+		 */
+		matches += 1;
+		match_lvl = 0;
+		entry->size      == ref->size      ? ++match_lvl : match_lvl;
+		entry->type      == ref->type      ? ++match_lvl : match_lvl;
+		entry->direction == ref->direction ? ++match_lvl : match_lvl;
+
+		if (match_lvl == 3) {
+			/* perfect-fit - return the result */
 			return entry;
+		} else if (match_lvl > last_lvl) {
+			/*
+			 * We found an entry that fits better then the
+			 * previous one
+			 */
+			last_lvl = match_lvl;
+			ret      = entry;
+		}
 	}
 
-	return NULL;
+	/*
+	 * If we have multiple matches but no perfect-fit, just return
+	 * NULL.
+	 */
+	ret = (matches == 1) ? ret : NULL;
+
+	return ret;
 }
 
 /*

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-07  8:13   ` Ingo Molnar
  2009-06-07  8:22     ` Torsten Kaiser
@ 2009-06-07 10:45     ` FUJITA Tomonori
  1 sibling, 0 replies; 30+ messages in thread
From: FUJITA Tomonori @ 2009-06-07 10:45 UTC (permalink / raw)
  To: mingo
  Cc: joerg.roedel, fujita.tomonori, torvalds, lethal, just.for.lkml,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi

On Sun, 7 Jun 2009 10:13:05 +0200
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Joerg Roedel <joerg.roedel@amd.com> wrote:
> 
> > 
> > On Fri, Jun 05, 2009 at 05:33:14PM +0900, FUJITA Tomonori wrote:
> > > dma-debugs wrongly assumes that no multiple DMA transfers are
> > > performed against the same dma address on one device at the same
> > > time. However it's true only with hardware IOMMUs. For example, an
> > > application can easily send the same buffer twice with different
> > > lengths to one device by using DIO and AIO. If these requests are not
> > > unmapped in the same order in which they were mapped,
> > > hash_bucket_find() finds a wrong entry and gives a false warning.
> > > 
> > > We should fix this before 2.6.30 release. Seems that there is no
> > > easy way to fix it. I think that it's better to just disable
> > > dma-debug for now.
> > > 
> > > Torsten Kaiser found this bug with the RAID1 configuration:
> > > 
> > > http://marc.info/?t=124336541900008&r=1&w=2
> > > 
> > 
> > Argh, I never thought that this can happen. But its not explicitly 
> > forbidden so we have to handle this situation. Thanks for tracking 
> > down the bug to this issue.
> >
> > However, I think there is a somehow simple fix for the issue. 
> > Patch is attached. Its the least intrusive way I can think of to 
> > fix this problem.
> >
> > But its up to Linus/Ingo to decide if it can be accepted at this 
> > very late point in the cycle. Since dma-debug is new with 2.6.30 
> > it will at least not introduce any regression. [...]
> 
> I think it's too late for v2.6.30 to do any of the changes - and the 
> DMA debug facility is off by default.

Fedora enables it by default and seems that we got some bogus bug
reports. I like not to see any bogus bug reports about v2.6.30. So I
prefer to disable dma-debug feature.


> Also, i think such DMA patterns, while 'allowed' can be quite 
> dangerous as its such a rare usage combination really. AIO and DIO 
> are crazy to begin with, mixing AIO and DIO for the same buffer is 
> madness square two. (It can result in 3 agents for the same memory 
> address: CPU, dma1 and dma2. How many interesting chipset erratums 
> could there be related to such scenarios?)

As Torsten pointed out, this bug happens on common and sane
configurations. And if you want to write the same contents to two
places on disk, AIO and DIO is a reasonable solution, I think.


> But it is certainly not the task of a debug facility to restrict 
> existing user-visible ABIs, so fixing the false positive is correct.
> 
> So i've applied your fix to the iommu branch for v2.6.31 and marked 
> it for -stable backporting, that way 2.6.30.1 will be able to pick 
> the patch up (if it remains problem-free in testing).
> 

Joerg patch weakens the checking capability. IMHO, it's the wrong
direction.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-05 10:41   ` Joerg Roedel
                     ` (4 preceding siblings ...)
  (?)
@ 2009-06-10 20:41   ` Torsten Kaiser
  2009-06-11  8:10       ` Joerg Roedel
  -1 siblings, 1 reply; 30+ messages in thread
From: Torsten Kaiser @ 2009-06-10 20:41 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: FUJITA Tomonori, Linus Torvalds, mingo, lethal, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

On Fri, Jun 5, 2009 at 12:41 PM, Joerg Roedel<joerg.roedel@amd.com> wrote:
> From c0baa4d43a674c9cc321af3b23a6d5c9b46d65bd Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <joerg.roedel@amd.com>
> Date: Fri, 5 Jun 2009 12:01:35 +0200
> Subject: [PATCH] dma-debug: change hash_bucket_find from first-fit to best-fit
>
> Some device drivers map the same physical address multiple times to a
> dma address. Without an IOMMU this results in the same dma address being
> put into the dma-debug hash multiple times. With a first-fit match in
> hash_bucket_find() this function may return the wrong dma_debug_entry.
> This can result in false positive warnings. This patch fixes it by
> changing the first-fit behavior of hash_bucket_find() into a best-fit
> algorithm.
>
> Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
> Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
> ---
>  lib/dma-debug.c |   43 +++++++++++++++++++++++++++++++++++++++----
>  1 files changed, 39 insertions(+), 4 deletions(-)

I applied this patch to the just released 2.6.30, but it does not fix
the false warning on my System.

Jun 10 21:10:14 treogen [ 2611.715341] ------------[ cut here ]------------
Jun 10 21:10:14 treogen [ 2611.715359] WARNING: at lib/dma-debug.c:565
check_unmap+0x536/0x620()
Jun 10 21:10:14 treogen [ 2611.715363] Hardware name: KFN5-D SLI
Jun 10 21:10:14 treogen [ 2611.715369] sata_sil24 0000:04:00.0:
DMA-API: device driver frees DMA sg list with different entry count
[map count=2] [unmap count=1]
Jun 10 21:10:14 treogen [ 2611.715374] Modules linked in: msp3400
tuner tea5767 tda8290 tuner_xc2028 xc5000 tda9887 tuner_simple
tuner_types mt20xx tea5761 bttv ir_common v4l2_common videodev
v4l1_compat v4l2_compat_ioctl32 videobuf_dma_sg videobuf_core
btcx_risc sg pata_amd tveeprom
Jun 10 21:10:14 treogen [ 2611.715416] Pid: 0, comm: swapper Not
tainted 2.6.30 #1
Jun 10 21:10:14 treogen [ 2611.715420] Call Trace:
Jun 10 21:10:14 treogen [ 2611.715424]  <IRQ>  [<ffffffff804175c6>] ?
check_unmap+0x536/0x620
Jun 10 21:10:14 treogen [ 2611.715441]  [<ffffffff80243308>]
warn_slowpath_common+0x78/0xd0
Jun 10 21:10:14 treogen [ 2611.715448]  [<ffffffff802433e4>]
warn_slowpath_fmt+0x64/0x70
Jun 10 21:10:14 treogen [ 2611.715458]  [<ffffffff8057c5d8>] ?
dec_pending+0x88/0x170
Jun 10 21:10:14 treogen [ 2611.715465]  [<ffffffff8057c8bf>] ?
clone_endio+0x9f/0xd0
Jun 10 21:10:14 treogen [ 2611.715475]  [<ffffffff8068e52d>] ?
_spin_lock_irqsave+0x1d/0x40
Jun 10 21:10:14 treogen [ 2611.715482]  [<ffffffff804175c6>]
check_unmap+0x536/0x620
Jun 10 21:10:14 treogen [ 2611.715493]  [<ffffffff8028de5a>] ?
mempool_free+0x8a/0xa0
Jun 10 21:10:14 treogen [ 2611.715501]  [<ffffffff804177bd>]
debug_dma_unmap_sg+0x10d/0x190
Jun 10 21:10:14 treogen [ 2611.715510]  [<ffffffff802c0e55>] ?
__slab_free+0x185/0x340
Jun 10 21:10:14 treogen [ 2611.715519]  [<ffffffff804c1121>] ?
__scsi_put_command+0x61/0xa0
Jun 10 21:10:14 treogen [ 2611.715528]  [<ffffffff804d4258>]
ata_sg_clean+0x78/0xf0
Jun 10 21:10:14 treogen [ 2611.715535]  [<ffffffff804d4305>]
__ata_qc_complete+0x35/0x110
Jun 10 21:10:14 treogen [ 2611.715544]  [<ffffffff804c7b88>] ?
scsi_io_completion+0x398/0x530
Jun 10 21:10:14 treogen [ 2611.715551]  [<ffffffff804d449d>]
ata_qc_complete+0xbd/0x250
Jun 10 21:10:14 treogen [ 2611.715559]  [<ffffffff804d49db>]
ata_qc_complete_multiple+0xab/0xf0
Jun 10 21:10:14 treogen [ 2611.715568]  [<ffffffff804ea289>]
sil24_interrupt+0xb9/0x5b0
Jun 10 21:10:14 treogen [ 2611.715576]  [<ffffffff802614cc>] ?
getnstimeofday+0x5c/0xf0
Jun 10 21:10:14 treogen [ 2611.715584]  [<ffffffff8025d349>] ?
ktime_get_ts+0x59/0x60
Jun 10 21:10:14 treogen [ 2611.715593]  [<ffffffff802730b0>]
handle_IRQ_event+0x70/0x180
Jun 10 21:10:14 treogen [ 2611.715601]  [<ffffffff802753bd>]
handle_fasteoi_irq+0x6d/0xe0
Jun 10 21:10:14 treogen [ 2611.715609]  [<ffffffff8020e42f>]
handle_irq+0x1f/0x30
Jun 10 21:10:14 treogen [ 2611.715614]  [<ffffffff8020db7a>] do_IRQ+0x6a/0xf0
Jun 10 21:10:14 treogen [ 2611.715623]  [<ffffffff8020be53>]
ret_from_intr+0x0/0xf
Jun 10 21:10:14 treogen [ 2611.715627]  <EOI>  [<ffffffff80213657>] ?
default_idle+0x77/0xe0
Jun 10 21:10:14 treogen [ 2611.715640]  [<ffffffff80213655>] ?
default_idle+0x75/0xe0
Jun 10 21:10:14 treogen [ 2611.715648]  [<ffffffff8025e3bf>] ?
notifier_call_chain+0x3f/0x80
Jun 10 21:10:14 treogen [ 2611.715655]  [<ffffffff802136f8>] ?
c1e_idle+0x38/0x110
Jun 10 21:10:14 treogen [ 2611.715663]  [<ffffffff8020a71e>] ?
cpu_idle+0x6e/0xd0
Jun 10 21:10:14 treogen [ 2611.715672]  [<ffffffff8067bb8d>] ?
rest_init+0x6d/0x80
Jun 10 21:10:14 treogen [ 2611.715682]  [<ffffffff80929cc5>] ?
start_kernel+0x35a/0x422
Jun 10 21:10:14 treogen [ 2611.715690]  [<ffffffff80929289>] ?
x86_64_start_reservations+0x99/0xb9
Jun 10 21:10:14 treogen [ 2611.715697]  [<ffffffff80929389>] ?
x86_64_start_kernel+0xe0/0xf2
Jun 10 21:10:14 treogen [ 2611.715702] ---[ end trace db81115dbc8b11c5 ]---
Jun 10 21:10:14 treogen [ 2611.715706] Mapped at:
Jun 10 21:10:14 treogen [ 2611.715709]  [<ffffffff80416ef9>]
debug_dma_map_sg+0x159/0x180
Jun 10 21:10:14 treogen [ 2611.715717]  [<ffffffff804d47cc>]
ata_qc_issue+0x19c/0x300
Jun 10 21:10:14 treogen [ 2611.715724]  [<ffffffff804da6c8>]
ata_scsi_translate+0xa8/0x180
Jun 10 21:10:14 treogen [ 2611.715733]  [<ffffffff804dd261>]
ata_scsi_queuecmd+0xb1/0x2d0
Jun 10 21:10:14 treogen [ 2611.715739]  [<ffffffff804c0f23>]
scsi_dispatch_cmd+0xe3/0x220

So I implemented a check that would turn of the DMA-API debugging, if
such a double mapping was observed.
The resulting place where this was triggered also verified that is
seems to be the RAID1 that is causing these mappings:

Jun 10 21:39:30 treogen [    9.251818] md: Autodetecting RAID arrays.
Jun 10 21:39:30 treogen [    9.334044] md: Scanned 5 and added 5 devices.
Jun 10 21:39:30 treogen [    9.334048] md: autorun ...
Jun 10 21:39:30 treogen [    9.334052] md: considering sdb3 ...
Jun 10 21:39:30 treogen [    9.334065] md:  adding sdb3 ...
Jun 10 21:39:30 treogen [    9.334073] md: sdb1 has different UUID to sdb3
Jun 10 21:39:30 treogen [    9.334081] md: sdc2 has different UUID to sdb3
Jun 10 21:39:30 treogen [    9.334091] md:  adding sda3 ...
Jun 10 21:39:30 treogen [    9.334101] md: sda1 has different UUID to sdb3
Jun 10 21:39:30 treogen [    9.334111] md: created md3
Jun 10 21:39:30 treogen [    9.334115] md: bind<sda3>
Jun 10 21:39:30 treogen [    9.334149] md: bind<sdb3>
Jun 10 21:39:30 treogen [    9.334166] md: running: <sdb3><sda3>
Jun 10 21:39:30 treogen [    9.346111] raid1: raid set md3 active with
2 out of 2 mirrors
Jun 10 21:39:30 treogen [    9.354255] md3: bitmap initialized from
disk: read 10/10 pages, set 24 bits
Jun 10 21:39:30 treogen [    9.354260] created bitmap (145 pages) for device md3
Jun 10 21:39:30 treogen [    9.354399] DMA-API: device mapped same
address twice, this use case cannot be handled currently - disabling
debugging
Jun 10 21:39:30 treogen [    9.359858] md: considering sdb1 ...
Jun 10 21:39:30 treogen [    9.359874] md:  adding sdb1 ...
Jun 10 21:39:30 treogen [    9.359882] md: sdc2 has different UUID to sdb1

Here is the patch, I was using. I hope Gmail does mangle it to badly...

From: Torsten Kaiser <just.for.lkml@googlemail.com>

The DMA-API allowes a device to map the same addresse multiple times.
If this happens (for example with md-raid1) the DMA-API debugging code can no
longer associate the map and unmap request with 100% confidence, as the keys
collide.
If the wrong entry gets returned on unmap, this can trigger bogus warnings about
mismatching parameters. This was visible on my system.
As such double mappings seem to be rather uncommon, only disable the DMA-API
if such a mapping is observed. This way all wrong warnings can be
prevented without
removing this debugging tool.

Possible enhancement: If the returned entry is 100% identical to the
entry that needs
to be added, it would be irrelevant which entry will be returned on
unmap. So could
allow identical entries into the hash list without turning itself off.

Signed-off-by: Torsten Kaiser <just.for.lkml@googlemail.com
---
--- lib/dma-debug.c.orig	2009-06-10 21:22:53.371813622 +0200
+++ lib/dma-debug.c	2009-06-10 21:26:40.892282983 +0200
@@ -253,6 +253,12 @@ static void add_dma_entry(struct dma_deb
 	unsigned long flags;

 	bucket = get_hash_bucket(entry, &flags);
+	if (hash_bucket_find(bucket, entry)) {
+		printk(KERN_ERR "DMA-API: device mapped same address twice, "
+			"this use case cannot be handled currently "
+			"- disabling debugging\n");
+		global_disable = true;
+	}
 	hash_bucket_add(bucket, entry);
 	put_hash_bucket(bucket, &flags);
 }

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-10 20:41   ` [PATCH] dma-debug: disable DMA_API_DEBUG for now Torsten Kaiser
@ 2009-06-11  8:10       ` Joerg Roedel
  0 siblings, 0 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-11  8:10 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: FUJITA Tomonori, Linus Torvalds, mingo, lethal, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

On Wed, Jun 10, 2009 at 10:41:53PM +0200, Torsten Kaiser wrote:
> I applied this patch to the just released 2.6.30, but it does not fix
> the false warning on my System.
> 
> Jun 10 21:10:14 treogen [ 2611.715341] ------------[ cut here ]------------
> Jun 10 21:10:14 treogen [ 2611.715359] WARNING: at lib/dma-debug.c:565
> check_unmap+0x536/0x620()
> Jun 10 21:10:14 treogen [ 2611.715363] Hardware name: KFN5-D SLI
> Jun 10 21:10:14 treogen [ 2611.715369] sata_sil24 0000:04:00.0:
> DMA-API: device driver frees DMA sg list with different entry count [map count=2] [unmap count=1]

Ok, thats because we need also to check for sg_call_ents in the best-fit
checks. Can you please test if you can reproduce it with the attached
patch?

>From 1e83c7eab546314ad9dbe08602d243bb83e93b50 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <joerg.roedel@amd.com>
Date: Thu, 11 Jun 2009 10:03:42 +0200
Subject: [PATCH] dma-debug: check for sg_call_ents in best-fit algorithm too

If we don't check for sg_call_ents the hash_bucket_find function might
still return the wrong dma_debug_entry for sg mappings.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 lib/dma-debug.c |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index ad65fc0..eb86ec3 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -262,11 +262,12 @@ static struct dma_debug_entry *hash_bucket_find(struct hash_bucket *bucket,
 		 */
 		matches += 1;
 		match_lvl = 0;
-		entry->size      == ref->size      ? ++match_lvl : match_lvl;
-		entry->type      == ref->type      ? ++match_lvl : match_lvl;
-		entry->direction == ref->direction ? ++match_lvl : match_lvl;
+		entry->size         == ref->size         ? ++match_lvl : 0;
+		entry->type         == ref->type         ? ++match_lvl : 0;
+		entry->direction    == ref->direction    ? ++match_lvl : 0;
+		entry->sg_call_ents == ref->sg_call_ents ? ++match_lvl : 0;
 
-		if (match_lvl == 3) {
+		if (match_lvl == 4) {
 			/* perfect-fit - return the result */
 			return entry;
 		} else if (match_lvl > last_lvl) {
-- 
1.6.3.1


-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
@ 2009-06-11  8:10       ` Joerg Roedel
  0 siblings, 0 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-11  8:10 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: FUJITA Tomonori, Linus Torvalds, mingo, lethal, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

On Wed, Jun 10, 2009 at 10:41:53PM +0200, Torsten Kaiser wrote:
> I applied this patch to the just released 2.6.30, but it does not fix
> the false warning on my System.
> 
> Jun 10 21:10:14 treogen [ 2611.715341] ------------[ cut here ]------------
> Jun 10 21:10:14 treogen [ 2611.715359] WARNING: at lib/dma-debug.c:565
> check_unmap+0x536/0x620()
> Jun 10 21:10:14 treogen [ 2611.715363] Hardware name: KFN5-D SLI
> Jun 10 21:10:14 treogen [ 2611.715369] sata_sil24 0000:04:00.0:
> DMA-API: device driver frees DMA sg list with different entry count [map count=2] [unmap count=1]

Ok, thats because we need also to check for sg_call_ents in the best-fit
checks. Can you please test if you can reproduce it with the attached
patch?

From 1e83c7eab546314ad9dbe08602d243bb83e93b50 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <joerg.roedel@amd.com>
Date: Thu, 11 Jun 2009 10:03:42 +0200
Subject: [PATCH] dma-debug: check for sg_call_ents in best-fit algorithm too

If we don't check for sg_call_ents the hash_bucket_find function might
still return the wrong dma_debug_entry for sg mappings.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 lib/dma-debug.c |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index ad65fc0..eb86ec3 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -262,11 +262,12 @@ static struct dma_debug_entry *hash_bucket_find(struct hash_bucket *bucket,
 		 */
 		matches += 1;
 		match_lvl = 0;
-		entry->size      == ref->size      ? ++match_lvl : match_lvl;
-		entry->type      == ref->type      ? ++match_lvl : match_lvl;
-		entry->direction == ref->direction ? ++match_lvl : match_lvl;
+		entry->size         == ref->size         ? ++match_lvl : 0;
+		entry->type         == ref->type         ? ++match_lvl : 0;
+		entry->direction    == ref->direction    ? ++match_lvl : 0;
+		entry->sg_call_ents == ref->sg_call_ents ? ++match_lvl : 0;
 
-		if (match_lvl == 3) {
+		if (match_lvl == 4) {
 			/* perfect-fit - return the result */
 			return entry;
 		} else if (match_lvl > last_lvl) {
-- 
1.6.3.1


-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-11  8:10       ` Joerg Roedel
  (?)
@ 2009-06-11 17:38       ` Torsten Kaiser
  2009-06-12  7:50           ` Joerg Roedel
  2009-06-12 14:13         ` Joerg Roedel
  -1 siblings, 2 replies; 30+ messages in thread
From: Torsten Kaiser @ 2009-06-11 17:38 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: FUJITA Tomonori, Linus Torvalds, mingo, lethal, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

On Thu, Jun 11, 2009 at 10:10 AM, Joerg Roedel<joerg.roedel@amd.com> wrote:
> On Wed, Jun 10, 2009 at 10:41:53PM +0200, Torsten Kaiser wrote:
>> I applied this patch to the just released 2.6.30, but it does not fix
>> the false warning on my System.
>>
>> Jun 10 21:10:14 treogen [ 2611.715341] ------------[ cut here ]------------
>> Jun 10 21:10:14 treogen [ 2611.715359] WARNING: at lib/dma-debug.c:565
>> check_unmap+0x536/0x620()
>> Jun 10 21:10:14 treogen [ 2611.715363] Hardware name: KFN5-D SLI
>> Jun 10 21:10:14 treogen [ 2611.715369] sata_sil24 0000:04:00.0:
>> DMA-API: device driver frees DMA sg list with different entry count [map count=2] [unmap count=1]
>
> Ok, thats because we need also to check for sg_call_ents in the best-fit
> checks. Can you please test if you can reproduce it with the attached
> patch?
>
> From 1e83c7eab546314ad9dbe08602d243bb83e93b50 Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <joerg.roedel@amd.com>
> Date: Thu, 11 Jun 2009 10:03:42 +0200
> Subject: [PATCH] dma-debug: check for sg_call_ents in best-fit algorithm too
>
> If we don't check for sg_call_ents the hash_bucket_find function might
> still return the wrong dma_debug_entry for sg mappings.
>
> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
> ---
>  lib/dma-debug.c |    9 +++++----
>  1 files changed, 5 insertions(+), 4 deletions(-)

I tried this patch, but I still get a wrong warning from the DMA-API:
Jun 11 19:24:57 treogen [    9.451064] raid1: raid set md2 active with
2 out of 2 mirrors
Jun 11 19:24:57 treogen [    9.479278] md2: bitmap initialized from
disk: read 10/10 pages, set 0 bits
Jun 11 19:24:57 treogen [    9.479282] created bitmap (153 pages) for device md2
Jun 11 19:24:57 treogen [    9.513544] md: ... autorun DONE.
Jun 11 19:24:57 treogen [    9.517590]  md3: unknown partition table
Jun 11 19:24:57 treogen [   20.718608] XFS mounting filesystem dm-0
Jun 11 19:24:57 treogen [   21.220452] ------------[ cut here ]------------
Jun 11 19:24:57 treogen [   21.220477] WARNING: at lib/dma-debug.c:532
check_unmap+0x49e/0x620()
Jun 11 19:24:57 treogen [   21.220482] Hardware name: KFN5-D SLI
Jun 11 19:24:57 treogen [   21.220488] sata_sil24 0000:04:00.0:
DMA-API: device driver tries to free DM
A memory it has not allocated [device address=0x000000011e4c3000]
[size=4096 bytes]
Jun 11 19:24:57 treogen [   21.220494] Modules linked in:
Jun 11 19:24:57 treogen [   21.220502] Pid: 1301, comm: kcryptd Not
tainted 2.6.30 #3
Jun 11 19:24:57 treogen [   21.220507] Call Trace:
Jun 11 19:24:57 treogen [   21.220510]  <IRQ>  [<ffffffff8041753e>] ?
check_unmap+0x49e/0x620
Jun 11 19:24:57 treogen [   21.220528]  [<ffffffff80243308>]
warn_slowpath_common+0x78/0xd0
Jun 11 19:24:57 treogen [   21.220535]  [<ffffffff802433e4>]
warn_slowpath_fmt+0x64/0x70
Jun 11 19:24:57 treogen [   21.220545]  [<ffffffff802c10b4>] ?
kmem_cache_free+0xa4/0x130
Jun 11 19:24:57 treogen [   21.220555]  [<ffffffff8028ddc2>] ?
mempool_free_slab+0x12/0x20
Jun 11 19:24:57 treogen [   21.220562]  [<ffffffff8028de5a>] ?
mempool_free+0x8a/0xa0
Jun 11 19:24:57 treogen [   21.220573]  [<ffffffff8068e53d>] ?
_spin_lock_irqsave+0x1d/0x40
Jun 11 19:24:57 treogen [   21.220580]  [<ffffffff8041753e>]
check_unmap+0x49e/0x620
Jun 11 19:24:57 treogen [   21.220587]  [<ffffffff8028ddc2>] ?
mempool_free_slab+0x12/0x20
Jun 11 19:24:57 treogen [   21.220597]  [<ffffffff803f17bc>] ?
__freed_request+0x10c/0x160
Jun 11 19:24:57 treogen [   21.220605]  [<ffffffff804177cd>]
debug_dma_unmap_sg+0x10d/0x190
Jun 11 19:24:57 treogen [   21.220614]  [<ffffffff804c1131>] ?
__scsi_put_command+0x61/0xa0
Jun 11 19:24:57 treogen [   21.220624]  [<ffffffff804d4268>]
ata_sg_clean+0x78/0xf0
Jun 11 19:24:57 treogen [   21.220631]  [<ffffffff804d4315>]
__ata_qc_complete+0x35/0x110
Jun 11 19:24:57 treogen [   21.220640]  [<ffffffff804c7b98>] ?
scsi_io_completion+0x398/0x530
Jun 11 19:24:57 treogen [   21.220647]  [<ffffffff804d44ad>]
ata_qc_complete+0xbd/0x250
Jun 11 19:24:57 treogen [   21.220654]  [<ffffffff804d49eb>]
ata_qc_complete_multiple+0xab/0xf0
Jun 11 19:24:57 treogen [   21.220664]  [<ffffffff804ea299>]
sil24_interrupt+0xb9/0x5b0
Jun 11 19:24:57 treogen [   21.220673]  [<ffffffff802730b0>]
handle_IRQ_event+0x70/0x180
Jun 11 19:24:57 treogen [   21.220681]  [<ffffffff802753bd>]
handle_fasteoi_irq+0x6d/0xe0
Jun 11 19:24:57 treogen [   21.220689]  [<ffffffff8020e42f>]
handle_irq+0x1f/0x30
Jun 11 19:24:57 treogen [   21.220695]  [<ffffffff8020db7a>] do_IRQ+0x6a/0xf0
Jun 11 19:24:57 treogen [   21.220704]  [<ffffffff8020be53>]
ret_from_intr+0x0/0xf
Jun 11 19:24:57 treogen [   21.220707]  <EOI>  [<ffffffff80409aab>] ?
memcpy_c+0xb/0x20
Jun 11 19:24:57 treogen [   21.220722]  [<ffffffff803e5bc8>] ?
crypto_cbc_encrypt+0xd8/0x1b0
Jun 11 19:24:57 treogen [   21.220729]  [<ffffffff8022f320>] ?
twofish_encrypt+0x0/0x10
Jun 11 19:24:57 treogen [   21.220739]  [<ffffffff803dcd68>] ?
async_encrypt+0x38/0x40
Jun 11 19:24:57 treogen [   21.220749]  [<ffffffff8058447a>] ?
crypt_convert+0x20a/0x2a0
Jun 11 19:24:57 treogen [   21.220757]  [<ffffffff805847dd>] ?
kcryptd_crypt+0x2cd/0x500
Jun 11 19:24:57 treogen [   21.220765]  [<ffffffff80584510>] ?
kcryptd_crypt+0x0/0x500
Jun 11 19:24:57 treogen [   21.220774]  [<ffffffff80255cf7>] ?
worker_thread+0x137/0x1f0
Jun 11 19:24:57 treogen [   21.220782]  [<ffffffff80259d20>] ?
autoremove_wake_function+0x0/0x40
Jun 11 19:24:57 treogen [   21.220790]  [<ffffffff80255bc0>] ?
worker_thread+0x0/0x1f0
Jun 11 19:24:57 treogen [   21.220797]  [<ffffffff80255bc0>] ?
worker_thread+0x0/0x1f0
Jun 11 19:24:57 treogen [   21.220804]  [<ffffffff80259886>] ? kthread+0x56/0x90
Jun 11 19:24:57 treogen [   21.220812]  [<ffffffff8020c4aa>] ?
child_rip+0xa/0x20
Jun 11 19:24:57 treogen [   21.220819]  [<ffffffff8020bea9>] ?
restore_args+0x0/0x30
Jun 11 19:24:57 treogen [   21.220825]  [<ffffffff80259830>] ? kthread+0x0/0x90
Jun 11 19:24:57 treogen [   21.220832]  [<ffffffff8020c4a0>] ?
child_rip+0x0/0x20
Jun 11 19:24:57 treogen [   21.220836] ---[ end trace f020083379b5b162 ]---
Jun 11 19:24:57 treogen [   21.339740] Ending clean XFS mount for
filesystem: dm-0

Torsten

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-11 17:38       ` Torsten Kaiser
@ 2009-06-12  7:50           ` Joerg Roedel
  2009-06-12 14:13         ` Joerg Roedel
  1 sibling, 0 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-12  7:50 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: FUJITA Tomonori, Linus Torvalds, mingo, lethal, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

On Thu, Jun 11, 2009 at 07:38:47PM +0200, Torsten Kaiser wrote:
> I tried this patch, but I still get a wrong warning from the DMA-API:
> Jun 11 19:24:57 treogen [    9.451064] raid1: raid set md2 active with
> 2 out of 2 mirrors
> Jun 11 19:24:57 treogen [    9.479278] md2: bitmap initialized from
> disk: read 10/10 pages, set 0 bits
> Jun 11 19:24:57 treogen [    9.479282] created bitmap (153 pages) for device md2
> Jun 11 19:24:57 treogen [    9.513544] md: ... autorun DONE.
> Jun 11 19:24:57 treogen [    9.517590]  md3: unknown partition table
> Jun 11 19:24:57 treogen [   20.718608] XFS mounting filesystem dm-0
> Jun 11 19:24:57 treogen [   21.220452] ------------[ cut here ]------------
> Jun 11 19:24:57 treogen [   21.220477] WARNING: at lib/dma-debug.c:532
> check_unmap+0x49e/0x620()
> Jun 11 19:24:57 treogen [   21.220482] Hardware name: KFN5-D SLI
> Jun 11 19:24:57 treogen [   21.220488] sata_sil24 0000:04:00.0:
> DMA-API: device driver tries to free DM A memory it has not allocated [device address=0x000000011e4c3000]

This happens with the best-fit if there is no perfect fit but multiple
matches. This warning needs to be adjusted to also cover this case. I
have this on my list and will fix it soon.

Joerg

-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
@ 2009-06-12  7:50           ` Joerg Roedel
  0 siblings, 0 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-12  7:50 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: FUJITA Tomonori, Linus Torvalds, mingo, lethal, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

On Thu, Jun 11, 2009 at 07:38:47PM +0200, Torsten Kaiser wrote:
> I tried this patch, but I still get a wrong warning from the DMA-API:
> Jun 11 19:24:57 treogen [    9.451064] raid1: raid set md2 active with
> 2 out of 2 mirrors
> Jun 11 19:24:57 treogen [    9.479278] md2: bitmap initialized from
> disk: read 10/10 pages, set 0 bits
> Jun 11 19:24:57 treogen [    9.479282] created bitmap (153 pages) for device md2
> Jun 11 19:24:57 treogen [    9.513544] md: ... autorun DONE.
> Jun 11 19:24:57 treogen [    9.517590]  md3: unknown partition table
> Jun 11 19:24:57 treogen [   20.718608] XFS mounting filesystem dm-0
> Jun 11 19:24:57 treogen [   21.220452] ------------[ cut here ]------------
> Jun 11 19:24:57 treogen [   21.220477] WARNING: at lib/dma-debug.c:532
> check_unmap+0x49e/0x620()
> Jun 11 19:24:57 treogen [   21.220482] Hardware name: KFN5-D SLI
> Jun 11 19:24:57 treogen [   21.220488] sata_sil24 0000:04:00.0:
> DMA-API: device driver tries to free DM A memory it has not allocated [device address=0x000000011e4c3000]

This happens with the best-fit if there is no perfect fit but multiple
matches. This warning needs to be adjusted to also cover this case. I
have this on my list and will fix it soon.

Joerg

-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-11 17:38       ` Torsten Kaiser
  2009-06-12  7:50           ` Joerg Roedel
@ 2009-06-12 14:13         ` Joerg Roedel
  2009-06-12 14:51           ` Torsten Kaiser
  2009-06-13 17:08           ` Torsten Kaiser
  1 sibling, 2 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-12 14:13 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: FUJITA Tomonori, Linus Torvalds, mingo, lethal, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 808 bytes --]

On Thu, Jun 11, 2009 at 07:38:47PM +0200, Torsten Kaiser wrote:
> DMA-API: device driver tries to free DM
> A memory it has not allocated [device address=0x000000011e4c3000]
> [size=4096 bytes]

Hmm, looking again over the code I've seen that the ref
dma_debug_entries are not alway filled with all necessary information
for best-fit. Can you please try if you still get false warnings when
you apply the two patches attached instead of the one I sent yesterday?

Thanks,

	Joerg

-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632

[-- Attachment #2: 0001-dma-debug-check-for-sg_call_ents-in-best-fit-algorit.patch --]
[-- Type: text/x-diff, Size: 1833 bytes --]

>From 4620c534541fb4d28c0c52553274ef94a43813ab Mon Sep 17 00:00:00 2001
From: Joerg Roedel <joerg.roedel@amd.com>
Date: Thu, 11 Jun 2009 10:03:42 +0200
Subject: [PATCH 1/2] dma-debug: check for sg_call_ents in best-fit algorithm too

If we don't check for sg_call_ents the hash_bucket_find function might
still return the wrong dma_debug_entry for sg mappings.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 lib/dma-debug.c |   15 +++++++--------
 1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index ad65fc0..c71e2dd 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -262,11 +262,12 @@ static struct dma_debug_entry *hash_bucket_find(struct hash_bucket *bucket,
 		 */
 		matches += 1;
 		match_lvl = 0;
-		entry->size      == ref->size      ? ++match_lvl : match_lvl;
-		entry->type      == ref->type      ? ++match_lvl : match_lvl;
-		entry->direction == ref->direction ? ++match_lvl : match_lvl;
+		entry->size         == ref->size         ? ++match_lvl : 0;
+		entry->type         == ref->type         ? ++match_lvl : 0;
+		entry->direction    == ref->direction    ? ++match_lvl : 0;
+		entry->sg_call_ents == ref->sg_call_ents ? ++match_lvl : 0;
 
-		if (match_lvl == 3) {
+		if (match_lvl == 4) {
 			/* perfect-fit - return the result */
 			return entry;
 		} else if (match_lvl > last_lvl) {
@@ -1076,16 +1077,14 @@ void debug_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 			.dev_addr       = sg_dma_address(s),
 			.size           = sg_dma_len(s),
 			.direction      = dir,
-			.sg_call_ents   = 0,
+			.sg_call_ents   = nelems,
 		};
 
 		if (mapped_ents && i >= mapped_ents)
 			break;
 
-		if (!i) {
-			ref.sg_call_ents = nelems;
+		if (!i)
 			mapped_ents = get_nr_mapped_entries(dev, s);
-		}
 
 		check_unmap(&ref);
 	}
-- 
1.6.3.1


[-- Attachment #3: 0002-dma-debug-be-more-careful-when-building-reference-en.patch --]
[-- Type: text/x-diff, Size: 8816 bytes --]

>From 15aad0cfd5061c6c479666234278d1473445c473 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <joerg.roedel@amd.com>
Date: Fri, 12 Jun 2009 15:25:06 +0200
Subject: [PATCH 2/2] dma-debug: be more careful when building reference entries

The current code is not very careful when it builds reference
dma_debug_entries which get passed to hash_bucket_find(). But since this
function changed to a best-fit algorithm these entries have to be more
acurate. This patch adds this higher level of accuracy.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 lib/dma-debug.c |  134 +++++++++++++++++++++++++++++++++++++------------------
 1 files changed, 91 insertions(+), 43 deletions(-)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index c71e2dd..3b93129 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -874,72 +874,68 @@ static void check_for_illegal_area(struct device *dev, void *addr, u64 size)
 				"[addr=%p] [size=%llu]\n", addr, size);
 }
 
-static void check_sync(struct device *dev, dma_addr_t addr,
-		       u64 size, u64 offset, int direction, bool to_cpu)
+static void check_sync(struct device *dev,
+		       struct dma_debug_entry *ref,
+		       bool to_cpu)
 {
-	struct dma_debug_entry ref = {
-		.dev            = dev,
-		.dev_addr       = addr,
-		.size           = size,
-		.direction      = direction,
-	};
 	struct dma_debug_entry *entry;
 	struct hash_bucket *bucket;
 	unsigned long flags;
 
-	bucket = get_hash_bucket(&ref, &flags);
+	bucket = get_hash_bucket(ref, &flags);
 
-	entry = hash_bucket_find(bucket, &ref);
+	entry = hash_bucket_find(bucket, ref);
 
 	if (!entry) {
 		err_printk(dev, NULL, "DMA-API: device driver tries "
 				"to sync DMA memory it has not allocated "
 				"[device address=0x%016llx] [size=%llu bytes]\n",
-				(unsigned long long)addr, size);
+				(unsigned long long)ref->dev_addr, ref->size);
 		goto out;
 	}
 
-	if ((offset + size) > entry->size) {
+	if (ref->size > entry->size) {
 		err_printk(dev, entry, "DMA-API: device driver syncs"
 				" DMA memory outside allocated range "
 				"[device address=0x%016llx] "
-				"[allocation size=%llu bytes] [sync offset=%llu] "
-				"[sync size=%llu]\n", entry->dev_addr, entry->size,
-				offset, size);
+				"[allocation size=%llu bytes] "
+				"[sync offset+size=%llu]\n",
+				entry->dev_addr, entry->size,
+				ref->size);
 	}
 
-	if (direction != entry->direction) {
+	if (ref->direction != entry->direction) {
 		err_printk(dev, entry, "DMA-API: device driver syncs "
 				"DMA memory with different direction "
 				"[device address=0x%016llx] [size=%llu bytes] "
 				"[mapped with %s] [synced with %s]\n",
-				(unsigned long long)addr, entry->size,
+				(unsigned long long)ref->dev_addr, entry->size,
 				dir2name[entry->direction],
-				dir2name[direction]);
+				dir2name[ref->direction]);
 	}
 
 	if (entry->direction == DMA_BIDIRECTIONAL)
 		goto out;
 
 	if (to_cpu && !(entry->direction == DMA_FROM_DEVICE) &&
-		      !(direction == DMA_TO_DEVICE))
+		      !(ref->direction == DMA_TO_DEVICE))
 		err_printk(dev, entry, "DMA-API: device driver syncs "
 				"device read-only DMA memory for cpu "
 				"[device address=0x%016llx] [size=%llu bytes] "
 				"[mapped with %s] [synced with %s]\n",
-				(unsigned long long)addr, entry->size,
+				(unsigned long long)ref->dev_addr, entry->size,
 				dir2name[entry->direction],
-				dir2name[direction]);
+				dir2name[ref->direction]);
 
 	if (!to_cpu && !(entry->direction == DMA_TO_DEVICE) &&
-		       !(direction == DMA_FROM_DEVICE))
+		       !(ref->direction == DMA_FROM_DEVICE))
 		err_printk(dev, entry, "DMA-API: device driver syncs "
 				"device write-only DMA memory to device "
 				"[device address=0x%016llx] [size=%llu bytes] "
 				"[mapped with %s] [synced with %s]\n",
-				(unsigned long long)addr, entry->size,
+				(unsigned long long)ref->dev_addr, entry->size,
 				dir2name[entry->direction],
-				dir2name[direction]);
+				dir2name[ref->direction]);
 
 out:
 	put_hash_bucket(bucket, &flags);
@@ -1037,19 +1033,16 @@ void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
 }
 EXPORT_SYMBOL(debug_dma_map_sg);
 
-static int get_nr_mapped_entries(struct device *dev, struct scatterlist *s)
+static int get_nr_mapped_entries(struct device *dev,
+				 struct dma_debug_entry *ref)
 {
-	struct dma_debug_entry *entry, ref;
+	struct dma_debug_entry *entry;
 	struct hash_bucket *bucket;
 	unsigned long flags;
 	int mapped_ents;
 
-	ref.dev      = dev;
-	ref.dev_addr = sg_dma_address(s);
-	ref.size     = sg_dma_len(s),
-
-	bucket       = get_hash_bucket(&ref, &flags);
-	entry        = hash_bucket_find(bucket, &ref);
+	bucket       = get_hash_bucket(ref, &flags);
+	entry        = hash_bucket_find(bucket, ref);
 	mapped_ents  = 0;
 
 	if (entry)
@@ -1084,7 +1077,7 @@ void debug_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 			break;
 
 		if (!i)
-			mapped_ents = get_nr_mapped_entries(dev, s);
+			mapped_ents = get_nr_mapped_entries(dev, &ref);
 
 		check_unmap(&ref);
 	}
@@ -1139,10 +1132,19 @@ EXPORT_SYMBOL(debug_dma_free_coherent);
 void debug_dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle,
 				   size_t size, int direction)
 {
+	struct dma_debug_entry ref;
+
 	if (unlikely(global_disable))
 		return;
 
-	check_sync(dev, dma_handle, size, 0, direction, true);
+	ref.type         = dma_debug_single;
+	ref.dev          = dev;
+	ref.dev_addr     = dma_handle;
+	ref.size         = size;
+	ref.direction    = direction;
+	ref.sg_call_ents = 0;
+
+	check_sync(dev, &ref, true);
 }
 EXPORT_SYMBOL(debug_dma_sync_single_for_cpu);
 
@@ -1150,10 +1152,19 @@ void debug_dma_sync_single_for_device(struct device *dev,
 				      dma_addr_t dma_handle, size_t size,
 				      int direction)
 {
+	struct dma_debug_entry ref;
+
 	if (unlikely(global_disable))
 		return;
 
-	check_sync(dev, dma_handle, size, 0, direction, false);
+	ref.type         = dma_debug_single;
+	ref.dev          = dev;
+	ref.dev_addr     = dma_handle;
+	ref.size         = size;
+	ref.direction    = direction;
+	ref.sg_call_ents = 0;
+
+	check_sync(dev, &ref, false);
 }
 EXPORT_SYMBOL(debug_dma_sync_single_for_device);
 
@@ -1162,10 +1173,19 @@ void debug_dma_sync_single_range_for_cpu(struct device *dev,
 					 unsigned long offset, size_t size,
 					 int direction)
 {
+	struct dma_debug_entry ref;
+
 	if (unlikely(global_disable))
 		return;
 
-	check_sync(dev, dma_handle, size, offset, direction, true);
+	ref.type         = dma_debug_single;
+	ref.dev          = dev;
+	ref.dev_addr     = dma_handle;
+	ref.size         = offset + size;
+	ref.direction    = direction;
+	ref.sg_call_ents = 0;
+
+	check_sync(dev, &ref, true);
 }
 EXPORT_SYMBOL(debug_dma_sync_single_range_for_cpu);
 
@@ -1174,10 +1194,19 @@ void debug_dma_sync_single_range_for_device(struct device *dev,
 					    unsigned long offset,
 					    size_t size, int direction)
 {
+	struct dma_debug_entry ref;
+
 	if (unlikely(global_disable))
 		return;
 
-	check_sync(dev, dma_handle, size, offset, direction, false);
+	ref.type         = dma_debug_single;
+	ref.dev          = dev;
+	ref.dev_addr     = dma_handle;
+	ref.size         = offset + size;
+	ref.direction    = direction;
+	ref.sg_call_ents = 0;
+
+	check_sync(dev, &ref, false);
 }
 EXPORT_SYMBOL(debug_dma_sync_single_range_for_device);
 
@@ -1191,14 +1220,24 @@ void debug_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 		return;
 
 	for_each_sg(sg, s, nelems, i) {
+
+		struct dma_debug_entry ref = {
+			.type           = dma_debug_sg,
+			.dev            = dev,
+			.paddr          = sg_phys(s),
+			.dev_addr       = sg_dma_address(s),
+			.size           = sg_dma_len(s),
+			.direction      = direction,
+			.sg_call_ents   = nelems,
+		};
+
 		if (!i)
-			mapped_ents = get_nr_mapped_entries(dev, s);
+			mapped_ents = get_nr_mapped_entries(dev, &ref);
 
 		if (i >= mapped_ents)
 			break;
 
-		check_sync(dev, sg_dma_address(s), sg_dma_len(s), 0,
-			   direction, true);
+		check_sync(dev, &ref, true);
 	}
 }
 EXPORT_SYMBOL(debug_dma_sync_sg_for_cpu);
@@ -1213,14 +1252,23 @@ void debug_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 		return;
 
 	for_each_sg(sg, s, nelems, i) {
+
+		struct dma_debug_entry ref = {
+			.type           = dma_debug_sg,
+			.dev            = dev,
+			.paddr          = sg_phys(s),
+			.dev_addr       = sg_dma_address(s),
+			.size           = sg_dma_len(s),
+			.direction      = direction,
+			.sg_call_ents   = nelems,
+		};
 		if (!i)
-			mapped_ents = get_nr_mapped_entries(dev, s);
+			mapped_ents = get_nr_mapped_entries(dev, &ref);
 
 		if (i >= mapped_ents)
 			break;
 
-		check_sync(dev, sg_dma_address(s), sg_dma_len(s), 0,
-			   direction, false);
+		check_sync(dev, &ref, false);
 	}
 }
 EXPORT_SYMBOL(debug_dma_sync_sg_for_device);
-- 
1.6.3.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-12 14:13         ` Joerg Roedel
@ 2009-06-12 14:51           ` Torsten Kaiser
  2009-06-13 11:10             ` Joerg Roedel
  2009-06-13 17:08           ` Torsten Kaiser
  1 sibling, 1 reply; 30+ messages in thread
From: Torsten Kaiser @ 2009-06-12 14:51 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: FUJITA Tomonori, Linus Torvalds, mingo, lethal, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

On Fri, Jun 12, 2009 at 4:13 PM, Joerg Roedel<joerg.roedel@amd.com> wrote:
> On Thu, Jun 11, 2009 at 07:38:47PM +0200, Torsten Kaiser wrote:
>> DMA-API: device driver tries to free DM
>> A memory it has not allocated [device address=0x000000011e4c3000]
>> [size=4096 bytes]
>
> Hmm, looking again over the code I've seen that the ref
> dma_debug_entries are not alway filled with all necessary information
> for best-fit. Can you please try if you still get false warnings when
> you apply the two patches attached instead of the one I sent yesterday?

Hmm... There isn't a get_nr_mapped_entries() in 2.6.30.
Are these patches again the current linus tree?

For unfilled information I wanted to complain that for non-sg mapping
sg_call_ents never gets filled. But as dma_entry_alloc() does a
memset() to clean the entrys it should not matter.
And what about dma_debug_entry.paddr? Should this also be compared?

Torsten

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-12 14:51           ` Torsten Kaiser
@ 2009-06-13 11:10             ` Joerg Roedel
  2009-06-13 14:26               ` Torsten Kaiser
  0 siblings, 1 reply; 30+ messages in thread
From: Joerg Roedel @ 2009-06-13 11:10 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: Joerg Roedel, FUJITA Tomonori, Linus Torvalds, mingo, lethal,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi

On Fri, Jun 12, 2009 at 04:51:01PM +0200, Torsten Kaiser wrote:
> On Fri, Jun 12, 2009 at 4:13 PM, Joerg Roedel<joerg.roedel@amd.com> wrote:
> > On Thu, Jun 11, 2009 at 07:38:47PM +0200, Torsten Kaiser wrote:
> >> DMA-API: device driver tries to free DM
> >> A memory it has not allocated [device address=0x000000011e4c3000]
> >> [size=4096 bytes]
> >
> > Hmm, looking again over the code I've seen that the ref
> > dma_debug_entries are not alway filled with all necessary information
> > for best-fit. Can you please try if you still get false warnings when
> > you apply the two patches attached instead of the one I sent yesterday?
> 
> Hmm... There isn't a get_nr_mapped_entries() in 2.6.30.
> Are these patches again the current linus tree?

Indeed. These patches are against tip/core/iommu but should also apply
against current linus/master too.

> For unfilled information I wanted to complain that for non-sg mapping
> sg_call_ents never gets filled. But as dma_entry_alloc() does a
> memset() to clean the entrys it should not matter.
> And what about dma_debug_entry.paddr? Should this also be compared?

paddr shouldn't matter. We also can't compare against it because it is
not a parameter we get in the unmap path (not always).

Joerg


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-13 11:10             ` Joerg Roedel
@ 2009-06-13 14:26               ` Torsten Kaiser
  2009-06-13 17:34                 ` Joerg Roedel
  0 siblings, 1 reply; 30+ messages in thread
From: Torsten Kaiser @ 2009-06-13 14:26 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Joerg Roedel, FUJITA Tomonori, Linus Torvalds, mingo, lethal,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi

On Sat, Jun 13, 2009 at 1:10 PM, Joerg Roedel<joro@8bytes.org> wrote:
> On Fri, Jun 12, 2009 at 04:51:01PM +0200, Torsten Kaiser wrote:
>> On Fri, Jun 12, 2009 at 4:13 PM, Joerg Roedel<joerg.roedel@amd.com> wrote:
>> > On Thu, Jun 11, 2009 at 07:38:47PM +0200, Torsten Kaiser wrote:
>> >> DMA-API: device driver tries to free DM
>> >> A memory it has not allocated [device address=0x000000011e4c3000]
>> >> [size=4096 bytes]
>> >
>> > Hmm, looking again over the code I've seen that the ref
>> > dma_debug_entries are not alway filled with all necessary information
>> > for best-fit. Can you please try if you still get false warnings when
>> > you apply the two patches attached instead of the one I sent yesterday?
>>
>> Hmm... There isn't a get_nr_mapped_entries() in 2.6.30.
>> Are these patches again the current linus tree?
>
> Indeed. These patches are against tip/core/iommu but should also apply
> against current linus/master too.

OK, I will try 2.6.30-git6 with and without these patches.

>> For unfilled information I wanted to complain that for non-sg mapping
>> sg_call_ents never gets filled. But as dma_entry_alloc() does a
>> memset() to clean the entrys it should not matter.
>> And what about dma_debug_entry.paddr? Should this also be compared?
>
> paddr shouldn't matter. We also can't compare against it because it is
> not a parameter we get in the unmap path (not always).

check_unmap() checks it and sg_call_ents are also optional.

Torsten

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-12 14:13         ` Joerg Roedel
  2009-06-12 14:51           ` Torsten Kaiser
@ 2009-06-13 17:08           ` Torsten Kaiser
  2009-06-15  7:46             ` Joerg Roedel
  1 sibling, 1 reply; 30+ messages in thread
From: Torsten Kaiser @ 2009-06-13 17:08 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: FUJITA Tomonori, Linus Torvalds, mingo, lethal, hancockrwd,
	jens.axboe, bharrosh, linux-kernel, linux-scsi

On Fri, Jun 12, 2009 at 4:13 PM, Joerg Roedel<joerg.roedel@amd.com> wrote:
> On Thu, Jun 11, 2009 at 07:38:47PM +0200, Torsten Kaiser wrote:
>> DMA-API: device driver tries to free DM
>> A memory it has not allocated [device address=0x000000011e4c3000]
>> [size=4096 bytes]
>
> Hmm, looking again over the code I've seen that the ref
> dma_debug_entries are not alway filled with all necessary information
> for best-fit. Can you please try if you still get false warnings when
> you apply the two patches attached instead of the one I sent yesterday?

I tested these patches
(0001-dma-debug-check-for-sg_call_ents-in-best-fit-algorit.patch and
0002-dma-debug-be-more-careful-when-building-reference-en.patch)
against 2.6.30-git6 and did not see any warnings.

I can't be 100% sure about the fix, because I do not have a reliable
trigger, but it looks quite good.

Torsten


Just for reference, with an unpatched 2.6.30-git6 the warning was still there:
Jun 13 16:54:56 treogen [  168.201493] ------------[ cut here ]------------
Jun 13 16:54:56 treogen [  168.201512] WARNING: at lib/dma-debug.c:827
check_unmap+0x593/0x670()
Jun 13 16:54:56 treogen [  168.201517] Hardware name: KFN5-D SLI
Jun 13 16:54:56 treogen [  168.201522] sata_sil24 0000:04:00.0:
DMA-API: device driver frees DMA sg lis
t with different entry count [map count=1] [unmap count=2]
Jun 13 16:54:56 treogen [  168.201526] Modules linked in: msp3400
tuner tea5767 tda8290 tuner_xc2028 xc
5000 tda9887 tuner_simple tuner_types mt20xx tea5761 bttv ir_common
v4l2_common videodev v4l1_compat v4
l2_compat_ioctl32 videobuf_dma_sg videobuf_core pata_amd btcx_risc tveeprom sg
Jun 13 16:54:56 treogen [  168.201562] Pid: 0, comm: swapper Not
tainted 2.6.30-git6 #1
Jun 13 16:54:56 treogen [  168.201566] Call Trace:
Jun 13 16:54:56 treogen [  168.201569]  <IRQ>  [<ffffffff8120bb63>] ?
check_unmap+0x593/0x670
Jun 13 16:54:56 treogen [  168.201585]  [<ffffffff81044138>]
warn_slowpath_common+0x78/0xd0
Jun 13 16:54:56 treogen [  168.201591]  [<ffffffff81044214>]
warn_slowpath_fmt+0x64/0x70
Jun 13 16:54:56 treogen [  168.201599]  [<ffffffff810b3bf5>] ?
__slab_free+0x185/0x340
Jun 13 16:54:56 treogen [  168.201606]  [<ffffffff8120bb63>]
check_unmap+0x593/0x670
Jun 13 16:54:56 treogen [  168.201615]  [<ffffffff810e532d>] ?
bio_free+0x4d/0x60
Jun 13 16:54:56 treogen [  168.201622]  [<ffffffff810e5350>] ?
bio_fs_destructor+0x10/0x20
Jun 13 16:54:56 treogen [  168.201628]  [<ffffffff8120bd3e>]
debug_dma_unmap_sg+0xfe/0x140
Jun 13 16:54:56 treogen [  168.201636]  [<ffffffff812ab102>] ?
put_device+0x12/0x20
Jun 13 16:54:56 treogen [  168.201645]  [<ffffffff812c9248>]
ata_sg_clean+0x78/0xf0
Jun 13 16:54:56 treogen [  168.201651]  [<ffffffff812c92f5>]
__ata_qc_complete+0x35/0x110
Jun 13 16:54:56 treogen [  168.201658]  [<ffffffff812c948d>]
ata_qc_complete+0xbd/0x250
Jun 13 16:54:56 treogen [  168.201665]  [<ffffffff812c99bc>]
ata_qc_complete_multiple+0x9c/0xf0
Jun 13 16:54:56 treogen [  168.201674]  [<ffffffff812df219>]
sil24_interrupt+0xb9/0x5b0
Jun 13 16:54:56 treogen [  168.201681]  [<ffffffff81061c5c>] ?
getnstimeofday+0x5c/0xf0
Jun 13 16:54:56 treogen [  168.201688]  [<ffffffff8105dba9>] ?
ktime_get_ts+0x59/0x60
Jun 13 16:54:56 treogen [  168.201696]  [<ffffffff81074598>]
handle_IRQ_event+0x68/0x160
Jun 13 16:54:56 treogen [  168.201703]  [<ffffffff8107663d>]
handle_fasteoi_irq+0x6d/0xe0
Jun 13 16:54:56 treogen [  168.201710]  [<ffffffff8100e33f>]
handle_irq+0x1f/0x30
Jun 13 16:54:56 treogen [  168.201715]  [<ffffffff8100d9ba>] do_IRQ+0x6a/0xe0
Jun 13 16:54:56 treogen [  168.201722]  [<ffffffff8100bd53>]
ret_from_intr+0x0/0xa
Jun 13 16:54:56 treogen [  168.201725]  <EOI>  [<ffffffff8101340a>] ?
default_idle+0x6a/0xd0
Jun 13 16:54:56 treogen [  168.201738]  [<ffffffff8105ebff>] ?
notifier_call_chain+0x3f/0x80
Jun 13 16:54:56 treogen [  168.201744]  [<ffffffff810134a8>] ?
c1e_idle+0x38/0x100
Jun 13 16:54:56 treogen [  168.201751]  [<ffffffff8105ec65>] ?
atomic_notifier_call_chain+0x15/0x20
Jun 13 16:54:56 treogen [  168.201757]  [<ffffffff8100a662>] ?
cpu_idle+0x62/0xb0
Jun 13 16:54:56 treogen [  168.201766]  [<ffffffff8147190d>] ?
rest_init+0x6d/0x80
Jun 13 16:54:56 treogen [  168.201774]  [<ffffffff8171dcad>] ?
start_kernel+0x342/0x405
Jun 13 16:54:56 treogen [  168.201781]  [<ffffffff8171d289>] ?
x86_64_start_reservations+0x99/0xb9
Jun 13 16:54:56 treogen [  168.201787]  [<ffffffff8171d389>] ?
x86_64_start_kernel+0xe0/0xf2
Jun 13 16:54:56 treogen [  168.201792] ---[ end trace 7ab4a9443c6c6e69 ]---
Jun 13 16:54:56 treogen [  168.201795] Mapped at:
Jun 13 16:54:56 treogen [  168.201797]  [<ffffffff8120c609>]
debug_dma_map_sg+0x159/0x180
Jun 13 16:54:56 treogen [  168.201805]  [<ffffffff812c97bc>]
ata_qc_issue+0x19c/0x300
Jun 13 16:54:56 treogen [  168.201812]  [<ffffffff812cf6b8>]
ata_scsi_translate+0xa8/0x180
Jun 13 16:54:56 treogen [  168.201818]  [<ffffffff812d2251>]
ata_scsi_queuecmd+0xb1/0x2d0
Jun 13 16:54:56 treogen [  168.201823]  [<ffffffff812b5fb3>]
scsi_dispatch_cmd+0xe3/0x220

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-13 14:26               ` Torsten Kaiser
@ 2009-06-13 17:34                 ` Joerg Roedel
  0 siblings, 0 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-13 17:34 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: Joerg Roedel, FUJITA Tomonori, Linus Torvalds, mingo, lethal,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi

On Sat, Jun 13, 2009 at 04:26:06PM +0200, Torsten Kaiser wrote:
> On Sat, Jun 13, 2009 at 1:10 PM, Joerg Roedel<joro@8bytes.org> wrote:
> >
> > paddr shouldn't matter. We also can't compare against it because it is
> > not a parameter we get in the unmap path (not always).
> 
> check_unmap() checks it and sg_call_ents are also optional.

check_unmap() only checks it for coherent mappings (where it makes
sense). But for a best-fit search it does not make sense because we
already check for the type and multiple coherent mappings to the same
dma_addr for the same device are not possible (because for coherent
mappings the memory is allocated by the dma_ops backend).

Joerg

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] dma-debug: disable DMA_API_DEBUG for now
  2009-06-13 17:08           ` Torsten Kaiser
@ 2009-06-15  7:46             ` Joerg Roedel
  0 siblings, 0 replies; 30+ messages in thread
From: Joerg Roedel @ 2009-06-15  7:46 UTC (permalink / raw)
  To: Torsten Kaiser
  Cc: Joerg Roedel, FUJITA Tomonori, Linus Torvalds, mingo, lethal,
	hancockrwd, jens.axboe, bharrosh, linux-kernel, linux-scsi

On Sat, Jun 13, 2009 at 07:08:01PM +0200, Torsten Kaiser wrote:
> On Fri, Jun 12, 2009 at 4:13 PM, Joerg Roedel<joerg.roedel@amd.com> wrote:
> > On Thu, Jun 11, 2009 at 07:38:47PM +0200, Torsten Kaiser wrote:
> >> DMA-API: device driver tries to free DM
> >> A memory it has not allocated [device address=0x000000011e4c3000]
> >> [size=4096 bytes]
> >
> > Hmm, looking again over the code I've seen that the ref
> > dma_debug_entries are not alway filled with all necessary information
> > for best-fit. Can you please try if you still get false warnings when
> > you apply the two patches attached instead of the one I sent yesterday?
> 
> I tested these patches
> (0001-dma-debug-check-for-sg_call_ents-in-best-fit-algorit.patch and
> 0002-dma-debug-be-more-careful-when-building-reference-en.patch)
> against 2.6.30-git6 and did not see any warnings.
> 
> I can't be 100% sure about the fix, because I do not have a reliable
> trigger, but it looks quite good.

Ok cool, I will push these two patches upstream then and will send them
to -stable too. So we get this fixed in a 30.x release too. Thanks a lot
for your testing.

	Joerg


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2009-06-15  7:46 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-05  8:33 [PATCH] dma-debug: disable DMA_API_DEBUG for now FUJITA Tomonori
2009-06-05 10:41 ` Joerg Roedel
2009-06-05 10:41   ` Joerg Roedel
2009-06-05 11:38   ` FUJITA Tomonori
2009-06-05 11:38     ` FUJITA Tomonori
2009-06-05 12:44     ` Joerg Roedel
2009-06-05 12:44       ` Joerg Roedel
2009-06-05 14:57     ` Arnd Bergmann
2009-06-05 15:52   ` Torsten Kaiser
2009-06-05 15:52     ` Torsten Kaiser
2009-06-05 18:20     ` Joerg Roedel
2009-06-05 20:25       ` Torsten Kaiser
2009-06-05 22:11       ` Torsten Kaiser
2009-06-07  8:13   ` Ingo Molnar
2009-06-07  8:22     ` Torsten Kaiser
2009-06-07 10:45     ` FUJITA Tomonori
2009-06-07 10:22   ` [tip:core/iommu] dma-debug: change hash_bucket_find from first-fit to best-fit tip-bot for Joerg Roedel
2009-06-10 20:41   ` [PATCH] dma-debug: disable DMA_API_DEBUG for now Torsten Kaiser
2009-06-11  8:10     ` Joerg Roedel
2009-06-11  8:10       ` Joerg Roedel
2009-06-11 17:38       ` Torsten Kaiser
2009-06-12  7:50         ` Joerg Roedel
2009-06-12  7:50           ` Joerg Roedel
2009-06-12 14:13         ` Joerg Roedel
2009-06-12 14:51           ` Torsten Kaiser
2009-06-13 11:10             ` Joerg Roedel
2009-06-13 14:26               ` Torsten Kaiser
2009-06-13 17:34                 ` Joerg Roedel
2009-06-13 17:08           ` Torsten Kaiser
2009-06-15  7:46             ` Joerg Roedel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.