All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
@ 2016-08-01 22:59 ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-01 22:59 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-nvme, corbet, akpm, rmk+kernel,
	keith.busch, axboe, benh, mpe
  Cc: k.kozlowski

This patchset introduces dma_attr DMA_ATTR_NO_WARN (just like __GFP_NOWARN),
which tells the DMA-mapping subsystem to suppress allocation failure reports.

On some architectures allocation failures are reported with error messages
to the system logs.  Although this can help to identify and debug problems,
drivers which handle failures (eg, retry later) have no problems with them,
and can actually flood the system logs with error messages that aren't any
problem at all, depending on the implementation of the retry mechanism.

So, this provides a way for drivers to avoid those error messages on calls
where allocation failures are not a problem, and shouldn't bother the logs.

 - Patch 1/3 introduces the dma_attr DMA_ATTR_NO_WARN.

 - Patch 2/3 implements support for it on powerpc arch (where this problem
             was observed;  it's possible to extend support for more archs)

 - Patch 3/3 implements it on the nvme driver (which might repeatedly trip
             on allocation failures due to high load, flooding system logs
             with error messages at least on powerpc: "iommu_alloc failed")

Changelog:
 v4:
  - rebase for commit 53a4b60 dma-mapping: use unsigned long for dma_attrs.
  - reorder patches 2/3 and 3/3.
 v3:
  - nvme: use DMA_ATTR_NO_WARN when ret = BLK_MQ_RQ_QUEUE_BUSY (io will be
    requeued) but not when ret = BLK_MQ_RQ_QUEUE_ERROR (io will be failed).
    thanks: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
 v2:
  - all: address warnings from checkpatch.pl (line wrapping and typos)

Tested on next-20160801.

Mauricio Faria de Oliveira (3):
  dma-mapping: introduce the DMA_ATTR_NO_WARN attribute
  powerpc: implement the DMA_ATTR_NO_WARN attribute
  nvme: use the DMA_ATTR_NO_WARN attribute

 Documentation/DMA-attributes.txt | 17 +++++++++++++++++
 arch/powerpc/kernel/iommu.c      |  6 ++++--
 drivers/nvme/host/pci.c          |  3 ++-
 include/linux/dma-mapping.h      |  5 +++++
 4 files changed, 28 insertions(+), 3 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
@ 2016-08-01 22:59 ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-01 22:59 UTC (permalink / raw)


This patchset introduces dma_attr DMA_ATTR_NO_WARN (just like __GFP_NOWARN),
which tells the DMA-mapping subsystem to suppress allocation failure reports.

On some architectures allocation failures are reported with error messages
to the system logs.  Although this can help to identify and debug problems,
drivers which handle failures (eg, retry later) have no problems with them,
and can actually flood the system logs with error messages that aren't any
problem at all, depending on the implementation of the retry mechanism.

So, this provides a way for drivers to avoid those error messages on calls
where allocation failures are not a problem, and shouldn't bother the logs.

 - Patch 1/3 introduces the dma_attr DMA_ATTR_NO_WARN.

 - Patch 2/3 implements support for it on powerpc arch (where this problem
             was observed;  it's possible to extend support for more archs)

 - Patch 3/3 implements it on the nvme driver (which might repeatedly trip
             on allocation failures due to high load, flooding system logs
             with error messages at least on powerpc: "iommu_alloc failed")

Changelog:
 v4:
  - rebase for commit 53a4b60 dma-mapping: use unsigned long for dma_attrs.
  - reorder patches 2/3 and 3/3.
 v3:
  - nvme: use DMA_ATTR_NO_WARN when ret = BLK_MQ_RQ_QUEUE_BUSY (io will be
    requeued) but not when ret = BLK_MQ_RQ_QUEUE_ERROR (io will be failed).
    thanks: Masayoshi Mizuma <m.mizuma at jp.fujitsu.com>
 v2:
  - all: address warnings from checkpatch.pl (line wrapping and typos)

Tested on next-20160801.

Mauricio Faria de Oliveira (3):
  dma-mapping: introduce the DMA_ATTR_NO_WARN attribute
  powerpc: implement the DMA_ATTR_NO_WARN attribute
  nvme: use the DMA_ATTR_NO_WARN attribute

 Documentation/DMA-attributes.txt | 17 +++++++++++++++++
 arch/powerpc/kernel/iommu.c      |  6 ++++--
 drivers/nvme/host/pci.c          |  3 ++-
 include/linux/dma-mapping.h      |  5 +++++
 4 files changed, 28 insertions(+), 3 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/3] dma-mapping: introduce the DMA_ATTR_NO_WARN attribute
  2016-08-01 22:59 ` Mauricio Faria de Oliveira
@ 2016-08-01 22:59   ` Mauricio Faria de Oliveira
  -1 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-01 22:59 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-nvme, corbet, akpm, rmk+kernel,
	keith.busch, axboe, benh, mpe
  Cc: k.kozlowski

Introduce the DMA_ATTR_NO_WARN attribute, and document it.

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
 Documentation/DMA-attributes.txt | 17 +++++++++++++++++
 include/linux/dma-mapping.h      |  5 +++++
 2 files changed, 22 insertions(+)

diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
index 2d455a5..98bf7ac 100644
--- a/Documentation/DMA-attributes.txt
+++ b/Documentation/DMA-attributes.txt
@@ -126,3 +126,20 @@ means that we won't try quite as hard to get them.
 
 NOTE: At the moment DMA_ATTR_ALLOC_SINGLE_PAGES is only implemented on ARM,
 though ARM64 patches will likely be posted soon.
+
+DMA_ATTR_NO_WARN
+----------------
+
+This tells the DMA-mapping subsystem to suppress allocation failure reports
+(similarly to __GFP_NOWARN).
+
+On some architectures allocation failures are reported with error messages
+to the system logs.  Although this can help to identify and debug problems,
+drivers which handle failures (eg, retry later) have no problems with them,
+and can actually flood the system logs with error messages that aren't any
+problem at all, depending on the implementation of the retry mechanism.
+
+So, this provides a way for drivers to avoid those error messages on calls
+where allocation failures are not a problem, and shouldn't bother the logs.
+
+NOTE: At the moment DMA_ATTR_NO_WARN is only implemented on PowerPC.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 66533e1..6efbd27 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -56,6 +56,11 @@
  * that gives better TLB efficiency.
  */
 #define DMA_ATTR_ALLOC_SINGLE_PAGES	(1UL << 7)
+/*
+ * DMA_ATTR_NO_WARN: This tells the DMA-mapping subsystem to suppress
+ * allocation failure reports (similarly to __GFP_NOWARN).
+ */
+#define DMA_ATTR_NO_WARN	(1UL << 8)
 
 /*
  * A dma_addr_t can hold any valid DMA or bus address for the platform.
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 1/3] dma-mapping: introduce the DMA_ATTR_NO_WARN attribute
@ 2016-08-01 22:59   ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-01 22:59 UTC (permalink / raw)


Introduce the DMA_ATTR_NO_WARN attribute, and document it.

Signed-off-by: Mauricio Faria de Oliveira <mauricfo at linux.vnet.ibm.com>
---
 Documentation/DMA-attributes.txt | 17 +++++++++++++++++
 include/linux/dma-mapping.h      |  5 +++++
 2 files changed, 22 insertions(+)

diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
index 2d455a5..98bf7ac 100644
--- a/Documentation/DMA-attributes.txt
+++ b/Documentation/DMA-attributes.txt
@@ -126,3 +126,20 @@ means that we won't try quite as hard to get them.
 
 NOTE: At the moment DMA_ATTR_ALLOC_SINGLE_PAGES is only implemented on ARM,
 though ARM64 patches will likely be posted soon.
+
+DMA_ATTR_NO_WARN
+----------------
+
+This tells the DMA-mapping subsystem to suppress allocation failure reports
+(similarly to __GFP_NOWARN).
+
+On some architectures allocation failures are reported with error messages
+to the system logs.  Although this can help to identify and debug problems,
+drivers which handle failures (eg, retry later) have no problems with them,
+and can actually flood the system logs with error messages that aren't any
+problem at all, depending on the implementation of the retry mechanism.
+
+So, this provides a way for drivers to avoid those error messages on calls
+where allocation failures are not a problem, and shouldn't bother the logs.
+
+NOTE: At the moment DMA_ATTR_NO_WARN is only implemented on PowerPC.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 66533e1..6efbd27 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -56,6 +56,11 @@
  * that gives better TLB efficiency.
  */
 #define DMA_ATTR_ALLOC_SINGLE_PAGES	(1UL << 7)
+/*
+ * DMA_ATTR_NO_WARN: This tells the DMA-mapping subsystem to suppress
+ * allocation failure reports (similarly to __GFP_NOWARN).
+ */
+#define DMA_ATTR_NO_WARN	(1UL << 8)
 
 /*
  * A dma_addr_t can hold any valid DMA or bus address for the platform.
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/3] powerpc: implement the DMA_ATTR_NO_WARN attribute
  2016-08-01 22:59 ` Mauricio Faria de Oliveira
@ 2016-08-01 22:59   ` Mauricio Faria de Oliveira
  -1 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-01 22:59 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-nvme, corbet, akpm, rmk+kernel,
	keith.busch, axboe, benh, mpe
  Cc: k.kozlowski

Add support for the DMA_ATTR_NO_WARN attribute on powerpc iommu code.

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/iommu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 37d6e74..5f202a5 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -479,7 +479,8 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table *tbl,
 
 		/* Handle failure */
 		if (unlikely(entry == DMA_ERROR_CODE)) {
-			if (printk_ratelimit())
+			if (!(attrs & DMA_ATTR_NO_WARN) &&
+			    printk_ratelimit())
 				dev_info(dev, "iommu_alloc failed, tbl %p "
 					 "vaddr %lx npages %lu\n", tbl, vaddr,
 					 npages);
@@ -776,7 +777,8 @@ dma_addr_t iommu_map_page(struct device *dev, struct iommu_table *tbl,
 					 mask >> tbl->it_page_shift, align,
 					 attrs);
 		if (dma_handle == DMA_ERROR_CODE) {
-			if (printk_ratelimit())  {
+			if (!(attrs & DMA_ATTR_NO_WARN) &&
+			    printk_ratelimit())  {
 				dev_info(dev, "iommu_alloc failed, tbl %p "
 					 "vaddr %p npages %d\n", tbl, vaddr,
 					 npages);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/3] powerpc: implement the DMA_ATTR_NO_WARN attribute
@ 2016-08-01 22:59   ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-01 22:59 UTC (permalink / raw)


Add support for the DMA_ATTR_NO_WARN attribute on powerpc iommu code.

Signed-off-by: Mauricio Faria de Oliveira <mauricfo at linux.vnet.ibm.com>
---
 arch/powerpc/kernel/iommu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 37d6e74..5f202a5 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -479,7 +479,8 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table *tbl,
 
 		/* Handle failure */
 		if (unlikely(entry == DMA_ERROR_CODE)) {
-			if (printk_ratelimit())
+			if (!(attrs & DMA_ATTR_NO_WARN) &&
+			    printk_ratelimit())
 				dev_info(dev, "iommu_alloc failed, tbl %p "
 					 "vaddr %lx npages %lu\n", tbl, vaddr,
 					 npages);
@@ -776,7 +777,8 @@ dma_addr_t iommu_map_page(struct device *dev, struct iommu_table *tbl,
 					 mask >> tbl->it_page_shift, align,
 					 attrs);
 		if (dma_handle == DMA_ERROR_CODE) {
-			if (printk_ratelimit())  {
+			if (!(attrs & DMA_ATTR_NO_WARN) &&
+			    printk_ratelimit())  {
 				dev_info(dev, "iommu_alloc failed, tbl %p "
 					 "vaddr %p npages %d\n", tbl, vaddr,
 					 npages);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/3] nvme: use the DMA_ATTR_NO_WARN attribute
  2016-08-01 22:59 ` Mauricio Faria de Oliveira
@ 2016-08-01 22:59   ` Mauricio Faria de Oliveira
  -1 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-01 22:59 UTC (permalink / raw)
  To: linux-doc, linux-kernel, linux-nvme, corbet, akpm, rmk+kernel,
	keith.busch, axboe, benh, mpe
  Cc: k.kozlowski

Use the DMA_ATTR_NO_WARN attribute for the dma_map_sg() call of the nvme
driver that returns BLK_MQ_RQ_QUEUE_BUSY (not for BLK_MQ_RQ_QUEUE_ERROR).

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
---
 drivers/nvme/host/pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d7c33f9..eeaeae0 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -503,7 +503,8 @@ static int nvme_map_data(struct nvme_dev *dev, struct request *req,
 		goto out;
 
 	ret = BLK_MQ_RQ_QUEUE_BUSY;
-	if (!dma_map_sg(dev->dev, iod->sg, iod->nents, dma_dir))
+	if (!dma_map_sg_attrs(dev->dev, iod->sg, iod->nents, dma_dir,
+				DMA_ATTR_NO_WARN))
 		goto out;
 
 	if (!nvme_setup_prps(dev, req, size))
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/3] nvme: use the DMA_ATTR_NO_WARN attribute
@ 2016-08-01 22:59   ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-01 22:59 UTC (permalink / raw)


Use the DMA_ATTR_NO_WARN attribute for the dma_map_sg() call of the nvme
driver that returns BLK_MQ_RQ_QUEUE_BUSY (not for BLK_MQ_RQ_QUEUE_ERROR).

Signed-off-by: Mauricio Faria de Oliveira <mauricfo at linux.vnet.ibm.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman at linux.vnet.ibm.com>
---
 drivers/nvme/host/pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d7c33f9..eeaeae0 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -503,7 +503,8 @@ static int nvme_map_data(struct nvme_dev *dev, struct request *req,
 		goto out;
 
 	ret = BLK_MQ_RQ_QUEUE_BUSY;
-	if (!dma_map_sg(dev->dev, iod->sg, iod->nents, dma_dir))
+	if (!dma_map_sg_attrs(dev->dev, iod->sg, iod->nents, dma_dir,
+				DMA_ATTR_NO_WARN))
 		goto out;
 
 	if (!nvme_setup_prps(dev, req, size))
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
  2016-08-01 22:59 ` Mauricio Faria de Oliveira
@ 2016-08-04 22:01   ` Andrew Morton
  -1 siblings, 0 replies; 21+ messages in thread
From: Andrew Morton @ 2016-08-04 22:01 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira
  Cc: linux-doc, linux-kernel, linux-nvme, corbet, rmk+kernel,
	keith.busch, axboe, benh, mpe, k.kozlowski

On Mon,  1 Aug 2016 19:59:47 -0300 Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> wrote:

> This patchset introduces dma_attr DMA_ATTR_NO_WARN (just like __GFP_NOWARN),
> which tells the DMA-mapping subsystem to suppress allocation failure reports.
> 
> On some architectures allocation failures are reported with error messages
> to the system logs.  Although this can help to identify and debug problems,
> drivers which handle failures (eg, retry later) have no problems with them,
> and can actually flood the system logs with error messages that aren't any
> problem at all, depending on the implementation of the retry mechanism.

It would help to have seen an example of the error message - please
always quote such things when fixing bugs.

I assume the warnings are coming via nvme_map_data()'s call to
blk_rq_map_sg()?  An alternative (and more idiomatic) fix would be to
change the blk_rq_map_sg() interface to permit passing down some
foo_NOWARN flag and propagating that down the stack into
ppc_iommu_map_sg().  Was this approach evaluated?  I suspect it might
be messy.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
@ 2016-08-04 22:01   ` Andrew Morton
  0 siblings, 0 replies; 21+ messages in thread
From: Andrew Morton @ 2016-08-04 22:01 UTC (permalink / raw)


On Mon,  1 Aug 2016 19:59:47 -0300 Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> wrote:

> This patchset introduces dma_attr DMA_ATTR_NO_WARN (just like __GFP_NOWARN),
> which tells the DMA-mapping subsystem to suppress allocation failure reports.
> 
> On some architectures allocation failures are reported with error messages
> to the system logs.  Although this can help to identify and debug problems,
> drivers which handle failures (eg, retry later) have no problems with them,
> and can actually flood the system logs with error messages that aren't any
> problem at all, depending on the implementation of the retry mechanism.

It would help to have seen an example of the error message - please
always quote such things when fixing bugs.

I assume the warnings are coming via nvme_map_data()'s call to
blk_rq_map_sg()?  An alternative (and more idiomatic) fix would be to
change the blk_rq_map_sg() interface to permit passing down some
foo_NOWARN flag and propagating that down the stack into
ppc_iommu_map_sg().  Was this approach evaluated?  I suspect it might
be messy.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
  2016-08-04 22:01   ` Andrew Morton
@ 2016-08-05  0:17     ` Mauricio Faria de Oliveira
  -1 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-05  0:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-doc, linux-kernel, linux-nvme, corbet, rmk+kernel,
	keith.busch, axboe, benh, mpe, k.kozlowski

Andrew,

On 08/04/2016 07:01 PM, Andrew Morton wrote:
> It would help to have seen an example of the error message - please
> always quote such things when fixing bugs.

Indeed; okay.

The error messages are several blocks like this one:

     ppc_iommu_map_sg: 11784 callbacks suppressed
     nvme 0001:01:00.0: iommu_alloc failed, tbl c00001965c5ca400 vaddr 
c000018faa7b0000 npages 16
     nvme 0001:01:00.0: iommu_alloc failed, tbl c00001965c5ca400 vaddr 
c000018faa9b0000 npages 16
     <repeat>

> I assume the warnings are coming via nvme_map_data()'s call to
> blk_rq_map_sg()?  [snip]

If I understand the point in the question correctly -- actually not,
the warnings are coming via:

   nvme_map_data()
   -> dma_map_sg[_attrs]()
      -> dma_map_ops.map_sg()
        (dma_map_ops = dma_iommu_ops @ arch/powerpc/kernel/iommu.c)
         -> dma_iommu_map_sg()
            -> ppc_iommu_map_sg() /* as seen above */

And from what I could observe, the blk_rq_map_sg() path doesn't end
up in there.

> [snip] An alternative (and more idiomatic) fix would be to
> change the blk_rq_map_sg() interface to permit passing down some
> foo_NOWARN flag and propagating that down the stack into
> ppc_iommu_map_sg().  Was this approach evaluated?  I suspect it might
> be messy.

I see; I haven't evaluated that, but agree with you it might be messy.

As far as I can see, in order to pass something to blk_rq_map_sg() and
have it eventually make into ppc_iommu_map_sg(), that something should
be present in the scatterlist -- which seems to be what's common/passed
to both blk_rq_map_sg() (the interface point proposed) and dma_map_sg() 
(which is the function which reaches ppc_iommu_map_sg() down the chain).

It seems a bit hidden, and (if I got the suggestion right), it doesn't
seem to be in the scope of scatterlist to contain such a flag.

One point of the patches is make the attribute visible/explicit; I see
it can be inconvenient sometimes, but it allows for a clear / evident
difference between dma_map_sg() calls which are (not) OK with failures.

(for example, the 2 calls in nvme_map_data() - they can return either
BLK_MQ_RQ_QUEUE_BUSY or BLK_MQ_RQ_QUEUE_ERROR - so the former is OK.)

Does that make sense?

Thanks for the review.

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
@ 2016-08-05  0:17     ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-05  0:17 UTC (permalink / raw)


Andrew,

On 08/04/2016 07:01 PM, Andrew Morton wrote:
> It would help to have seen an example of the error message - please
> always quote such things when fixing bugs.

Indeed; okay.

The error messages are several blocks like this one:

     ppc_iommu_map_sg: 11784 callbacks suppressed
     nvme 0001:01:00.0: iommu_alloc failed, tbl c00001965c5ca400 vaddr 
c000018faa7b0000 npages 16
     nvme 0001:01:00.0: iommu_alloc failed, tbl c00001965c5ca400 vaddr 
c000018faa9b0000 npages 16
     <repeat>

> I assume the warnings are coming via nvme_map_data()'s call to
> blk_rq_map_sg()?  [snip]

If I understand the point in the question correctly -- actually not,
the warnings are coming via:

   nvme_map_data()
   -> dma_map_sg[_attrs]()
      -> dma_map_ops.map_sg()
        (dma_map_ops = dma_iommu_ops @ arch/powerpc/kernel/iommu.c)
         -> dma_iommu_map_sg()
            -> ppc_iommu_map_sg() /* as seen above */

And from what I could observe, the blk_rq_map_sg() path doesn't end
up in there.

> [snip] An alternative (and more idiomatic) fix would be to
> change the blk_rq_map_sg() interface to permit passing down some
> foo_NOWARN flag and propagating that down the stack into
> ppc_iommu_map_sg().  Was this approach evaluated?  I suspect it might
> be messy.

I see; I haven't evaluated that, but agree with you it might be messy.

As far as I can see, in order to pass something to blk_rq_map_sg() and
have it eventually make into ppc_iommu_map_sg(), that something should
be present in the scatterlist -- which seems to be what's common/passed
to both blk_rq_map_sg() (the interface point proposed) and dma_map_sg() 
(which is the function which reaches ppc_iommu_map_sg() down the chain).

It seems a bit hidden, and (if I got the suggestion right), it doesn't
seem to be in the scope of scatterlist to contain such a flag.

One point of the patches is make the attribute visible/explicit; I see
it can be inconvenient sometimes, but it allows for a clear / evident
difference between dma_map_sg() calls which are (not) OK with failures.

(for example, the 2 calls in nvme_map_data() - they can return either
BLK_MQ_RQ_QUEUE_BUSY or BLK_MQ_RQ_QUEUE_ERROR - so the former is OK.)

Does that make sense?

Thanks for the review.

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
  2016-08-05  0:17     ` Mauricio Faria de Oliveira
@ 2016-08-05  1:05       ` Andrew Morton
  -1 siblings, 0 replies; 21+ messages in thread
From: Andrew Morton @ 2016-08-05  1:05 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira
  Cc: linux-doc, linux-kernel, linux-nvme, corbet, rmk+kernel,
	keith.busch, axboe, benh, mpe, k.kozlowski

On Thu, 4 Aug 2016 21:17:27 -0300 Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> wrote:

> > [snip] An alternative (and more idiomatic) fix would be to
> > change the blk_rq_map_sg() interface to permit passing down some
> > foo_NOWARN flag and propagating that down the stack into
> > ppc_iommu_map_sg().  Was this approach evaluated?  I suspect it might
> > be messy.
> 
> I see; I haven't evaluated that, but agree with you it might be messy.
> 
> As far as I can see, in order to pass something to blk_rq_map_sg() and
> have it eventually make into ppc_iommu_map_sg(), that something should
> be present in the scatterlist -- which seems to be what's common/passed
> to both blk_rq_map_sg() (the interface point proposed) and dma_map_sg() 
> (which is the function which reaches ppc_iommu_map_sg() down the chain).
> 
> It seems a bit hidden, and (if I got the suggestion right), it doesn't
> seem to be in the scope of scatterlist to contain such a flag.
> 
> One point of the patches is make the attribute visible/explicit; I see
> it can be inconvenient sometimes, but it allows for a clear / evident
> difference between dma_map_sg() calls which are (not) OK with failures.
> 
> (for example, the 2 calls in nvme_map_data() - they can return either
> BLK_MQ_RQ_QUEUE_BUSY or BLK_MQ_RQ_QUEUE_ERROR - so the former is OK.)

Of course, the alternative is to just delete the damn warnings from
ppc_iommu_map_sg().  Imagine that!  Have they ever been of any use to
anyone?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
@ 2016-08-05  1:05       ` Andrew Morton
  0 siblings, 0 replies; 21+ messages in thread
From: Andrew Morton @ 2016-08-05  1:05 UTC (permalink / raw)


On Thu, 4 Aug 2016 21:17:27 -0300 Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> wrote:

> > [snip] An alternative (and more idiomatic) fix would be to
> > change the blk_rq_map_sg() interface to permit passing down some
> > foo_NOWARN flag and propagating that down the stack into
> > ppc_iommu_map_sg().  Was this approach evaluated?  I suspect it might
> > be messy.
> 
> I see; I haven't evaluated that, but agree with you it might be messy.
> 
> As far as I can see, in order to pass something to blk_rq_map_sg() and
> have it eventually make into ppc_iommu_map_sg(), that something should
> be present in the scatterlist -- which seems to be what's common/passed
> to both blk_rq_map_sg() (the interface point proposed) and dma_map_sg() 
> (which is the function which reaches ppc_iommu_map_sg() down the chain).
> 
> It seems a bit hidden, and (if I got the suggestion right), it doesn't
> seem to be in the scope of scatterlist to contain such a flag.
> 
> One point of the patches is make the attribute visible/explicit; I see
> it can be inconvenient sometimes, but it allows for a clear / evident
> difference between dma_map_sg() calls which are (not) OK with failures.
> 
> (for example, the 2 calls in nvme_map_data() - they can return either
> BLK_MQ_RQ_QUEUE_BUSY or BLK_MQ_RQ_QUEUE_ERROR - so the former is OK.)

Of course, the alternative is to just delete the damn warnings from
ppc_iommu_map_sg().  Imagine that!  Have they ever been of any use to
anyone?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
  2016-08-05  1:05       ` Andrew Morton
@ 2016-08-05 12:34         ` Mauricio Faria de Oliveira
  -1 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-05 12:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-doc, linux-kernel, linux-nvme, corbet, rmk+kernel,
	keith.busch, axboe, benh, mpe, k.kozlowski

On 08/04/2016 10:05 PM, Andrew Morton wrote:
> Of course, the alternative is to just delete the damn warnings from
> ppc_iommu_map_sg().  Imagine that!  Have they ever been of any use to
> anyone?

Sure. I submitted a patch to convert it to dynamic debug (so it would
still be available if one wanted to), but it wasn't accepted [1]; so
I guess it apparently is, in some cases.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-June/144196.html

thanks

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
@ 2016-08-05 12:34         ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-05 12:34 UTC (permalink / raw)


On 08/04/2016 10:05 PM, Andrew Morton wrote:
> Of course, the alternative is to just delete the damn warnings from
> ppc_iommu_map_sg().  Imagine that!  Have they ever been of any use to
> anyone?

Sure. I submitted a patch to convert it to dynamic debug (so it would
still be available if one wanted to), but it wasn't accepted [1]; so
I guess it apparently is, in some cases.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-June/144196.html

thanks

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
  2016-08-05 12:34         ` Mauricio Faria de Oliveira
@ 2016-08-05 17:01           ` Andrew Morton
  -1 siblings, 0 replies; 21+ messages in thread
From: Andrew Morton @ 2016-08-05 17:01 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira
  Cc: linux-doc, linux-kernel, linux-nvme, corbet, rmk+kernel,
	keith.busch, axboe, benh, mpe, k.kozlowski

On Fri, 5 Aug 2016 09:34:20 -0300 Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> wrote:

> On 08/04/2016 10:05 PM, Andrew Morton wrote:
> > Of course, the alternative is to just delete the damn warnings from
> > ppc_iommu_map_sg().  Imagine that!  Have they ever been of any use to
> > anyone?
> 
> Sure. I submitted a patch to convert it to dynamic debug (so it would
> still be available if one wanted to), but it wasn't accepted [1]; so
> I guess it apparently is, in some cases.
> 
> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-June/144196.html
> 

Bah.  What about WARN__ON_ONCE()?  Or much heavier ratelimiting?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
@ 2016-08-05 17:01           ` Andrew Morton
  0 siblings, 0 replies; 21+ messages in thread
From: Andrew Morton @ 2016-08-05 17:01 UTC (permalink / raw)


On Fri, 5 Aug 2016 09:34:20 -0300 Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> wrote:

> On 08/04/2016 10:05 PM, Andrew Morton wrote:
> > Of course, the alternative is to just delete the damn warnings from
> > ppc_iommu_map_sg().  Imagine that!  Have they ever been of any use to
> > anyone?
> 
> Sure. I submitted a patch to convert it to dynamic debug (so it would
> still be available if one wanted to), but it wasn't accepted [1]; so
> I guess it apparently is, in some cases.
> 
> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-June/144196.html
> 

Bah.  What about WARN__ON_ONCE()?  Or much heavier ratelimiting?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
  2016-08-05 17:01           ` Andrew Morton
@ 2016-08-08 13:38             ` Mauricio Faria de Oliveira
  -1 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-08 13:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-doc, linux-kernel, linux-nvme, corbet, rmk+kernel,
	keith.busch, axboe, benh, mpe, k.kozlowski

On 08/05/2016 02:01 PM, Andrew Morton wrote:
> Bah.  What about WARN__ON_ONCE()?  Or much heavier ratelimiting?

The problem w/ the WARN_ON_ONCE macro for this case is that it applies
once per _function_ (because of its static variable), not per _device_.

So, after the message happens for a particular device, it won't happen
for any other devices.  This would still conflict with the reasons the
other patch (use dynamic debug) was rejected [1].

The other problem w/ WARN_ON_ONCE (even if it could be used per device)
is that it masks/hides some events:  if you reach it with requests that
are OK to fail (i.e., retry), it will be silent for requests that are
not OK to fail (e.g., the blk_integrity-related call on nvme) -- and it
could be helpful for debugging.

I believe that ratelimiting would fall into the same category as above
(masks events since it hits the threshold with the OK-to-fail requests,
and then misses the not-OK-to-fail requests).

The other problem with ratelimiting would be, in case of continuously
working on high loads with the devices (e.g., nvme drive) - which is
acceptable - the messages would still eventually be printed, and since
it will happen every time the threshold is OK to print it again, it's
only a repetition with a greater period.



... the difficult point with the problem this patch addresses is not
to be heavy handed;  it's interesting to find some balance to still
reach the message when it might be useful, and provide some form of
control to the device driver (for when it knows it is OK for requests
to fail).  Given that the driver can have multiple callsites (with
different or no failure handling), unfortunately I found it hard to
make this arch-specific (but that is what I really aimed for first).

If you see other approaches that could help with it, I'd be happy to
write something simpler/more discreet.

Thanks for the suggestions and discussions.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-June/144196.html

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute
@ 2016-08-08 13:38             ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-08 13:38 UTC (permalink / raw)


On 08/05/2016 02:01 PM, Andrew Morton wrote:
> Bah.  What about WARN__ON_ONCE()?  Or much heavier ratelimiting?

The problem w/ the WARN_ON_ONCE macro for this case is that it applies
once per _function_ (because of its static variable), not per _device_.

So, after the message happens for a particular device, it won't happen
for any other devices.  This would still conflict with the reasons the
other patch (use dynamic debug) was rejected [1].

The other problem w/ WARN_ON_ONCE (even if it could be used per device)
is that it masks/hides some events:  if you reach it with requests that
are OK to fail (i.e., retry), it will be silent for requests that are
not OK to fail (e.g., the blk_integrity-related call on nvme) -- and it
could be helpful for debugging.

I believe that ratelimiting would fall into the same category as above
(masks events since it hits the threshold with the OK-to-fail requests,
and then misses the not-OK-to-fail requests).

The other problem with ratelimiting would be, in case of continuously
working on high loads with the devices (e.g., nvme drive) - which is
acceptable - the messages would still eventually be printed, and since
it will happen every time the threshold is OK to print it again, it's
only a repetition with a greater period.



... the difficult point with the problem this patch addresses is not
to be heavy handed;  it's interesting to find some balance to still
reach the message when it might be useful, and provide some form of
control to the device driver (for when it knows it is OK for requests
to fail).  Given that the driver can have multiple callsites (with
different or no failure handling), unfortunately I found it hard to
make this arch-specific (but that is what I really aimed for first).

If you see other approaches that could help with it, I'd be happy to
write something simpler/more discreet.

Thanks for the suggestions and discussions.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-June/144196.html

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 2/3] powerpc: implement the DMA_ATTR_NO_WARN attribute
  2016-08-01 23:05 Mauricio Faria de Oliveira
@ 2016-08-01 23:05 ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 21+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-01 23:05 UTC (permalink / raw)
  To: linuxppc-dev

Add support for the DMA_ATTR_NO_WARN attribute on powerpc iommu code.

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/iommu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 37d6e74..5f202a5 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -479,7 +479,8 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table *tbl,
 
 		/* Handle failure */
 		if (unlikely(entry == DMA_ERROR_CODE)) {
-			if (printk_ratelimit())
+			if (!(attrs & DMA_ATTR_NO_WARN) &&
+			    printk_ratelimit())
 				dev_info(dev, "iommu_alloc failed, tbl %p "
 					 "vaddr %lx npages %lu\n", tbl, vaddr,
 					 npages);
@@ -776,7 +777,8 @@ dma_addr_t iommu_map_page(struct device *dev, struct iommu_table *tbl,
 					 mask >> tbl->it_page_shift, align,
 					 attrs);
 		if (dma_handle == DMA_ERROR_CODE) {
-			if (printk_ratelimit())  {
+			if (!(attrs & DMA_ATTR_NO_WARN) &&
+			    printk_ratelimit())  {
 				dev_info(dev, "iommu_alloc failed, tbl %p "
 					 "vaddr %p npages %d\n", tbl, vaddr,
 					 npages);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2016-08-08 13:38 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-01 22:59 [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute Mauricio Faria de Oliveira
2016-08-01 22:59 ` Mauricio Faria de Oliveira
2016-08-01 22:59 ` [PATCH 1/3] dma-mapping: " Mauricio Faria de Oliveira
2016-08-01 22:59   ` Mauricio Faria de Oliveira
2016-08-01 22:59 ` [PATCH 2/3] powerpc: implement " Mauricio Faria de Oliveira
2016-08-01 22:59   ` Mauricio Faria de Oliveira
2016-08-01 22:59 ` [PATCH 3/3] nvme: use " Mauricio Faria de Oliveira
2016-08-01 22:59   ` Mauricio Faria de Oliveira
2016-08-04 22:01 ` [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce " Andrew Morton
2016-08-04 22:01   ` Andrew Morton
2016-08-05  0:17   ` Mauricio Faria de Oliveira
2016-08-05  0:17     ` Mauricio Faria de Oliveira
2016-08-05  1:05     ` Andrew Morton
2016-08-05  1:05       ` Andrew Morton
2016-08-05 12:34       ` Mauricio Faria de Oliveira
2016-08-05 12:34         ` Mauricio Faria de Oliveira
2016-08-05 17:01         ` Andrew Morton
2016-08-05 17:01           ` Andrew Morton
2016-08-08 13:38           ` Mauricio Faria de Oliveira
2016-08-08 13:38             ` Mauricio Faria de Oliveira
2016-08-01 23:05 Mauricio Faria de Oliveira
2016-08-01 23:05 ` [PATCH 2/3] powerpc: implement " Mauricio Faria de Oliveira

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.