All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
@ 2017-01-12 21:30 Tony Lindgren
       [not found] ` <20170112213016.19367-1-tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Tony Lindgren @ 2017-01-12 21:30 UTC (permalink / raw)
  To: Dan Williams, Vinod Koul
  Cc: Bin Liu, Daniel Mack, Felipe Balbi, George Cherian, Johan Hovold,
	Peter Ujfalusi, Sekhar Nori, Sebastian Andrzej Siewior,
	dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko,
	Grygorii Strashko, Kevin Hilman, Patrick Titiano,
	Sergei Shtylyov

Commit fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
together with recent MUSB changes allowed USB and DMA on BeagleBone to idle
when no cable is connected. But looks like few corner case issues still
remain.

Looks like just by repluging USB cable about ten or so times on BeagleBone
when configured in USB peripheral mode we can get warnings and eventually
trigger an oops in cppi41 DMA:

WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+
x28/0x38 [cppi41]
...

WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452
push_desc_queue+0x94/0x9c [cppi41]
...

Unable to handle kernel NULL pointer dereference at virtual
address 00000104
pgd = c0004000
[00000104] *pgd=00000000
Internal error: Oops: 805 [#1] SMP ARM
...
[<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>]
(__rpm_callback+0xc0/0x214)
[<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80)
[<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c)
[<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8)
[<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808)

This is because of a race with PM runtime and cppi41_dma_issue_pending()
as reported by Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> in earlier
set of patches. Based on mailing list discussions we however came to the
conclusion that a different fix from Alexandre's fix is needed in order
to guarantee that DMA is really active when we try to use it.

To fix the issue, we need to add a driver specific flag as we otherwise
can have -EINPROGRESS state set by PM runtime and can't rely on
pm_runtime_active() to tell us when we can use the DMA.

For reference, this is also documented in Documentation/power/runtime_pm.txt
in the example at the end of the file as pointed out by Grygorii Strashko
<grygorii.strashko-l0cyMroinI0@public.gmane.org>.

So let's fix the issue by introducing atomic_t active that gets toggled
in runtime_suspend() and runtime_resume(). And then let's replace the
pm_runtime_active() checks with atomic_read().

Note that we may also get transactions queued while in -EINPROGRESS
state. So we need to check for new elements in cppi41_runtime_resume()
by replacing list_for_each_entry_safe() with the commonly used test for
while (!list_empty(&cdd->pending)).

Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
Cc: Andy Shevchenko <andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Bin Liu <b-liu-l0cyMroinI0@public.gmane.org>
Cc: Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org>
Cc: Kevin Hilman <khilman-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
Cc: Patrick Titiano <ptitiano-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
Cc: Sergei Shtylyov <sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
Reported-by: Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
---
 drivers/dma/cppi41.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
--- a/drivers/dma/cppi41.c
+++ b/drivers/dma/cppi41.c
@@ -1,3 +1,4 @@
+#include <linux/atomic.h>
 #include <linux/delay.h>
 #include <linux/dmaengine.h>
 #include <linux/dma-mapping.h>
@@ -153,6 +154,8 @@ struct cppi41_dd {
 
 	/* context for suspend/resume */
 	unsigned int dma_tdfdq;
+
+	atomic_t active;
 };
 
 #define FIST_COMPLETION_QUEUE	93
@@ -320,7 +323,8 @@ static irqreturn_t cppi41_irq(int irq, void *data)
 			int error;
 
 			error = pm_runtime_get(cdd->ddev.dev);
-			if (error < 0)
+			if (((error != -EINPROGRESS) && error < 0) ||
+			    !atomic_read(&cdd->active))
 				dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
 					__func__, error);
 
@@ -482,7 +486,7 @@ static void cppi41_dma_issue_pending(struct dma_chan *chan)
 		return;
 	}
 
-	if (likely(pm_runtime_active(cdd->ddev.dev)))
+	if (likely(atomic_read(&cdd->active)))
 		push_desc_queue(c);
 	else
 		pending_desc(c);
@@ -1003,6 +1007,7 @@ static int cppi41_dma_probe(struct platform_device *pdev)
 	cdd->ddev.dst_addr_widths = CPPI41_DMA_BUSWIDTHS;
 	cdd->ddev.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
 	cdd->ddev.dev = dev;
+	atomic_set(&cdd->active, 0);
 	INIT_LIST_HEAD(&cdd->ddev.channels);
 	cpp41_dma_info.dma_cap = cdd->ddev.cap_mask;
 
@@ -1151,6 +1156,7 @@ static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
 {
 	struct cppi41_dd *cdd = dev_get_drvdata(dev);
 
+	atomic_set(&cdd->active, 0);
 	WARN_ON(!list_empty(&cdd->pending));
 
 	return 0;
@@ -1159,13 +1165,20 @@ static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
 static int __maybe_unused cppi41_runtime_resume(struct device *dev)
 {
 	struct cppi41_dd *cdd = dev_get_drvdata(dev);
-	struct cppi41_channel *c, *_c;
+	struct cppi41_channel *c;
 	unsigned long flags;
 
+	atomic_set(&cdd->active, 1);
+
 	spin_lock_irqsave(&cdd->lock, flags);
-	list_for_each_entry_safe(c, _c, &cdd->pending, node) {
-		push_desc_queue(c);
+	while (!list_empty(&cdd->pending)) {
+		c = list_first_entry(&cdd->pending,
+				     struct cppi41_channel,
+				     node);
 		list_del(&c->node);
+		spin_unlock_irqrestore(&cdd->lock, flags);
+		push_desc_queue(c);
+		spin_lock_irqsave(&cdd->lock, flags);
 	}
 	spin_unlock_irqrestore(&cdd->lock, flags);
 
-- 
2.11.0
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found] ` <20170112213016.19367-1-tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-12 21:53   ` Grygorii Strashko
       [not found]     ` <927792da-2e90-b2ae-1206-8fcb504d7551-l0cyMroinI0@public.gmane.org>
  2017-01-13  9:24   ` Sergei Shtylyov
  1 sibling, 1 reply; 13+ messages in thread
From: Grygorii Strashko @ 2017-01-12 21:53 UTC (permalink / raw)
  To: Tony Lindgren, Dan Williams, Vinod Koul
  Cc: Bin Liu, Daniel Mack, Felipe Balbi, George Cherian, Johan Hovold,
	Peter Ujfalusi, Sekhar Nori, Sebastian Andrzej Siewior,
	dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov



On 01/12/2017 03:30 PM, Tony Lindgren wrote:
> Commit fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
> together with recent MUSB changes allowed USB and DMA on BeagleBone to idle
> when no cable is connected. But looks like few corner case issues still
> remain.
> 
> Looks like just by repluging USB cable about ten or so times on BeagleBone
> when configured in USB peripheral mode we can get warnings and eventually
> trigger an oops in cppi41 DMA:
> 
> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+
> x28/0x38 [cppi41]
> ...
> 
> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452
> push_desc_queue+0x94/0x9c [cppi41]
> ...
> 
> Unable to handle kernel NULL pointer dereference at virtual
> address 00000104
> pgd = c0004000
> [00000104] *pgd=00000000
> Internal error: Oops: 805 [#1] SMP ARM
> ...
> [<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>]
> (__rpm_callback+0xc0/0x214)
> [<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80)
> [<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c)
> [<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8)
> [<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808)
> 
> This is because of a race with PM runtime and cppi41_dma_issue_pending()
> as reported by Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> in earlier
> set of patches. Based on mailing list discussions we however came to the
> conclusion that a different fix from Alexandre's fix is needed in order
> to guarantee that DMA is really active when we try to use it.
> 
> To fix the issue, we need to add a driver specific flag as we otherwise
> can have -EINPROGRESS state set by PM runtime and can't rely on
> pm_runtime_active() to tell us when we can use the DMA.
> 
> For reference, this is also documented in Documentation/power/runtime_pm.txt
> in the example at the end of the file as pointed out by Grygorii Strashko
> <grygorii.strashko-l0cyMroinI0@public.gmane.org>.
> 
> So let's fix the issue by introducing atomic_t active that gets toggled
> in runtime_suspend() and runtime_resume(). And then let's replace the
> pm_runtime_active() checks with atomic_read().
> 
> Note that we may also get transactions queued while in -EINPROGRESS
> state. So we need to check for new elements in cppi41_runtime_resume()
> by replacing list_for_each_entry_safe() with the commonly used test for
> while (!list_empty(&cdd->pending)).
> 
> Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
> Cc: Andy Shevchenko <andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Bin Liu <b-liu-l0cyMroinI0@public.gmane.org>
> Cc: Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org>
> Cc: Kevin Hilman <khilman-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> Cc: Patrick Titiano <ptitiano-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> Cc: Sergei Shtylyov <sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
> Reported-by: Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
> ---
>  drivers/dma/cppi41.c | 23 ++++++++++++++++++-----
>  1 file changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
> --- a/drivers/dma/cppi41.c
> +++ b/drivers/dma/cppi41.c
> @@ -1,3 +1,4 @@
> +#include <linux/atomic.h>
>  #include <linux/delay.h>
>  #include <linux/dmaengine.h>
>  #include <linux/dma-mapping.h>
> @@ -153,6 +154,8 @@ struct cppi41_dd {
>  
>  	/* context for suspend/resume */
>  	unsigned int dma_tdfdq;
> +
> +	atomic_t active;
>  };
>  
>  #define FIST_COMPLETION_QUEUE	93
> @@ -320,7 +323,8 @@ static irqreturn_t cppi41_irq(int irq, void *data)
>  			int error;
>  
>  			error = pm_runtime_get(cdd->ddev.dev);
> -			if (error < 0)
> +			if (((error != -EINPROGRESS) && error < 0) ||
> +			    !atomic_read(&cdd->active))
>  				dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
>  					__func__, error);
>  
> @@ -482,7 +486,7 @@ static void cppi41_dma_issue_pending(struct dma_chan *chan)
>  		return;
>  	}
>  

Sry, but even if it looks better it might still race with PM runtime :(

> -	if (likely(pm_runtime_active(cdd->ddev.dev)))
> +	if (likely(atomic_read(&cdd->active)))
>  		push_desc_queue(c);
>  	else


- CPU is here (-EINPROGRESS and active == 0) and then preempted 
- PM runtime will finish cppi41_runtime_resume and clean-up pending descs
- CPU return here and adds desc to the pending queue
- oops

Am I wrong?

>  		pending_desc(c);
> @@ -1003,6 +1007,7 @@ static int cppi41_dma_probe(struct platform_device *pdev)
>  	cdd->ddev.dst_addr_widths = CPPI41_DMA_BUSWIDTHS;
>  	cdd->ddev.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
>  	cdd->ddev.dev = dev;
> +	atomic_set(&cdd->active, 0);
>  	INIT_LIST_HEAD(&cdd->ddev.channels);
>  	cpp41_dma_info.dma_cap = cdd->ddev.cap_mask;
>  
> @@ -1151,6 +1156,7 @@ static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
>  {
>  	struct cppi41_dd *cdd = dev_get_drvdata(dev);
>  
> +	atomic_set(&cdd->active, 0);
>  	WARN_ON(!list_empty(&cdd->pending));
>  
>  	return 0;
> @@ -1159,13 +1165,20 @@ static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
>  static int __maybe_unused cppi41_runtime_resume(struct device *dev)
>  {
>  	struct cppi41_dd *cdd = dev_get_drvdata(dev);
> -	struct cppi41_channel *c, *_c;
> +	struct cppi41_channel *c;
>  	unsigned long flags;
>  
> +	atomic_set(&cdd->active, 1);
> +
>  	spin_lock_irqsave(&cdd->lock, flags);
> -	list_for_each_entry_safe(c, _c, &cdd->pending, node) {
> -		push_desc_queue(c);
> +	while (!list_empty(&cdd->pending)) {
> +		c = list_first_entry(&cdd->pending,
> +				     struct cppi41_channel,
> +				     node);
>  		list_del(&c->node);
> +		spin_unlock_irqrestore(&cdd->lock, flags);
> +		push_desc_queue(c);
> +		spin_lock_irqsave(&cdd->lock, flags);
>  	}
>  	spin_unlock_irqrestore(&cdd->lock, flags);
>  
> 

-- 
regards,
-grygorii
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]     ` <927792da-2e90-b2ae-1206-8fcb504d7551-l0cyMroinI0@public.gmane.org>
@ 2017-01-12 22:19       ` Tony Lindgren
       [not found]         ` <20170112221933.GM2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Tony Lindgren @ 2017-01-12 22:19 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170112 13:54]:
> On 01/12/2017 03:30 PM, Tony Lindgren wrote:
>
> Sry, but even if it looks better it might still race with PM runtime :(
> 
> > -	if (likely(pm_runtime_active(cdd->ddev.dev)))
> > +	if (likely(atomic_read(&cdd->active)))
> >  		push_desc_queue(c);
> >  	else
> 
> 
> - CPU is here (-EINPROGRESS and active == 0) and then preempted 
> - PM runtime will finish cppi41_runtime_resume and clean-up pending descs
> - CPU return here and adds desc to the pending queue
> - oops
> 
> Am I wrong?

We had cppi41_dma_issue_pending() getting called from atomic contex and
cppi41_runtime_resume() getting preempted where cppi41_dma_issue_pending()
would add to the queue.

If what you're saying can happen in some cases, then we again see the
WARN_ON(!list_empty(&cdd->pending)) in cppi41_runtime_suspend(). At least
so far I have no longer been able to make happen today.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]         ` <20170112221933.GM2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-12 22:52           ` Grygorii Strashko
       [not found]             ` <1c8967b7-d59b-e53d-feeb-80c71464fb94-l0cyMroinI0@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Grygorii Strashko @ 2017-01-12 22:52 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov



On 01/12/2017 04:19 PM, Tony Lindgren wrote:
> * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170112 13:54]:
>> On 01/12/2017 03:30 PM, Tony Lindgren wrote:
>>
>> Sry, but even if it looks better it might still race with PM runtime :(
>>
>>> -	if (likely(pm_runtime_active(cdd->ddev.dev)))
>>> +	if (likely(atomic_read(&cdd->active)))
>>>  		push_desc_queue(c);
>>>  	else
>>
>>
>> - CPU is here (-EINPROGRESS and active == 0) and then preempted 
>> - PM runtime will finish cppi41_runtime_resume and clean-up pending descs
>> - CPU return here and adds desc to the pending queue
>> - oops
>>
>> Am I wrong?
> 
> We had cppi41_dma_issue_pending() getting called from atomic contex and
> cppi41_runtime_resume() getting preempted where cppi41_dma_issue_pending()
> would add to the queue.

Again, I can be mistaken but cppi41_configure_channel() seems not atomic.
cppi41_configure_channel()->dma_async_issue_pending()
+ documentation says "This function can be called in an interrupt context"

And definitely it will be preemptive on RT :(

> 
> If what you're saying can happen in some cases, then we again see the
> WARN_ON(!list_empty(&cdd->pending)) in cppi41_runtime_suspend(). At least
> so far I have no longer been able to make happen today.

-- 
regards,
-grygorii
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]             ` <1c8967b7-d59b-e53d-feeb-80c71464fb94-l0cyMroinI0@public.gmane.org>
@ 2017-01-12 23:04               ` Tony Lindgren
       [not found]                 ` <20170112230456.GS2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Tony Lindgren @ 2017-01-12 23:04 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170112 14:53]:
> 
> 
> On 01/12/2017 04:19 PM, Tony Lindgren wrote:
> > * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170112 13:54]:
> >> On 01/12/2017 03:30 PM, Tony Lindgren wrote:
> >>
> >> Sry, but even if it looks better it might still race with PM runtime :(
> >>
> >>> -	if (likely(pm_runtime_active(cdd->ddev.dev)))
> >>> +	if (likely(atomic_read(&cdd->active)))
> >>>  		push_desc_queue(c);
> >>>  	else
> >>
> >>
> >> - CPU is here (-EINPROGRESS and active == 0) and then preempted 
> >> - PM runtime will finish cppi41_runtime_resume and clean-up pending descs
> >> - CPU return here and adds desc to the pending queue
> >> - oops
> >>
> >> Am I wrong?
> > 
> > We had cppi41_dma_issue_pending() getting called from atomic contex and
> > cppi41_runtime_resume() getting preempted where cppi41_dma_issue_pending()
> > would add to the queue.
> 
> Again, I can be mistaken but cppi41_configure_channel() seems not atomic.
> cppi41_configure_channel()->dma_async_issue_pending()
> + documentation says "This function can be called in an interrupt context"
> 
> And definitely it will be preemptive on RT :(

Hmm OK. So are you thinking we should add a spinlock around the
test in cppi41_dma_issue_pending() and when modifying cdd->active?

Something like:

spin_lock_irqsave(&cdd->lock, flags);
if (likely(cdd->active))
	push_desc_queue(c);
else
	list_add_tail(&c->node, &cdd->pending);
spin_unlock_irqrestore(&cdd->lock, flags);

Or do you have something better in mind?

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                 ` <20170112230456.GS2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-12 23:42                   ` Grygorii Strashko
       [not found]                     ` <aafe7afb-6d8a-2fef-acdd-bd331151d16a-l0cyMroinI0@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Grygorii Strashko @ 2017-01-12 23:42 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov



On 01/12/2017 05:04 PM, Tony Lindgren wrote:
> * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170112 14:53]:
>>
>>
>> On 01/12/2017 04:19 PM, Tony Lindgren wrote:
>>> * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170112 13:54]:
>>>> On 01/12/2017 03:30 PM, Tony Lindgren wrote:
>>>>
>>>> Sry, but even if it looks better it might still race with PM runtime :(
>>>>
>>>>> -	if (likely(pm_runtime_active(cdd->ddev.dev)))
>>>>> +	if (likely(atomic_read(&cdd->active)))
>>>>>  		push_desc_queue(c);
>>>>>  	else
>>>>
>>>>
>>>> - CPU is here (-EINPROGRESS and active == 0) and then preempted 
>>>> - PM runtime will finish cppi41_runtime_resume and clean-up pending descs
>>>> - CPU return here and adds desc to the pending queue
>>>> - oops
>>>>
>>>> Am I wrong?
>>>
>>> We had cppi41_dma_issue_pending() getting called from atomic contex and
>>> cppi41_runtime_resume() getting preempted where cppi41_dma_issue_pending()
>>> would add to the queue.
>>
>> Again, I can be mistaken but cppi41_configure_channel() seems not atomic.
>> cppi41_configure_channel()->dma_async_issue_pending()
>> + documentation says "This function can be called in an interrupt context"
>>
>> And definitely it will be preemptive on RT :(
> 
> Hmm OK. So are you thinking we should add a spinlock around the
> test in cppi41_dma_issue_pending() and when modifying cdd->active?

in general yes.

> 
> Something like:
> 
> spin_lock_irqsave(&cdd->lock, flags);
> if (likely(cdd->active))
> 	push_desc_queue(c);
> else
> 	list_add_tail(&c->node, &cdd->pending);
> spin_unlock_irqrestore(&cdd->lock, flags);
> 
> Or do you have something better in mind?

I've come up with smth like below. *Not tested*
But may be it can be done more simpler.

diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
index d5ba43a..41a8768 100644
--- a/drivers/dma/cppi41.c
+++ b/drivers/dma/cppi41.c
@@ -153,6 +153,7 @@ struct cppi41_dd {
 
        /* context for suspend/resume */
        unsigned int dma_tdfdq;
+       int is_suspended;
 };
 
 #define FIST_COMPLETION_QUEUE  93
@@ -317,12 +318,6 @@ static irqreturn_t cppi41_irq(int irq, void *data)
 
                while (val) {
                        u32 desc, len;
-                       int error;
-
-                       error = pm_runtime_get(cdd->ddev.dev);
-                       if (error < 0)
-                               dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
-                                       __func__, error);
 
                        q_num = __fls(val);
                        val &= ~(1 << q_num);
@@ -343,9 +338,6 @@ static irqreturn_t cppi41_irq(int irq, void *data)
                        c->residue = pd_trans_len(c->desc->pd6) - len;
                        dma_cookie_complete(&c->txd);
                        dmaengine_desc_get_callback_invoke(&c->txd, NULL);
-
-                       pm_runtime_mark_last_busy(cdd->ddev.dev);
-                       pm_runtime_put_autosuspend(cdd->ddev.dev);
                }
        }
        return IRQ_HANDLED;
@@ -457,38 +449,36 @@ static void push_desc_queue(struct cppi41_channel *c)
        cppi_writel(reg, cdd->qmgr_mem + QMGR_QUEUE_D(c->q_num));
 }
 
-static void pending_desc(struct cppi41_channel *c)
+static void cppi41_dma_issue_pending(struct dma_chan *chan)
 {
+       struct cppi41_channel *c = to_cpp41_chan(chan);
        struct cppi41_dd *cdd = c->cdd;
+       int error;
+       struct cppi41_channel *_c;
        unsigned long flags;
 
        spin_lock_irqsave(&cdd->lock, flags);
        list_add_tail(&c->node, &cdd->pending);
-       spin_unlock_irqrestore(&cdd->lock, flags);
-}
-
-static void cppi41_dma_issue_pending(struct dma_chan *chan)
-{
-       struct cppi41_channel *c = to_cpp41_chan(chan);
-       struct cppi41_dd *cdd = c->cdd;
-       int error;
 
        error = pm_runtime_get(cdd->ddev.dev);
-       if ((error != -EINPROGRESS) && error < 0) {
+       if (error < 0) {
                pm_runtime_put_noidle(cdd->ddev.dev);
                dev_err(cdd->ddev.dev, "Failed to pm_runtime_get: %i\n",
                        error);
-
+               spin_unlock_irqrestore(&cdd->lock, flags);
                return;
        }
 
-       if (likely(pm_runtime_active(cdd->ddev.dev)))
-               push_desc_queue(c);
-       else
-               pending_desc(c);
+       if (!cdd->is_suspended) {
+               list_for_each_entry_safe(c, _c, &cdd->pending, node) {
+                       push_desc_queue(c);
+                       list_del(&c->node);
+               };
+       }
 
        pm_runtime_mark_last_busy(cdd->ddev.dev);
        pm_runtime_put_autosuspend(cdd->ddev.dev);
+       spin_lock_irqsave(&cdd->lock, flags);
 }
 
 static u32 get_host_pd0(u32 length)
@@ -1150,10 +1140,20 @@ static int __maybe_unused cppi41_resume(struct device *dev)
 static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
 {
        struct cppi41_dd *cdd = dev_get_drvdata(dev);
+       int ret;
+       unsigned long flags;
 
-       WARN_ON(!list_empty(&cdd->pending));
+       spin_lock_irqsave(&cdd->lock, flags);
+       if (!list_empty(&cdd->pending)) {
+               WARN_ON(!list_empty(&cdd->pending));
+               ret = -EBUSY;
+       } else {
+               /* ... suspend the device ... */
+               cdd->is_suspended = 1;
+       }
+       spin_unlock_irqrestore(&cdd->lock, flags);
 
-       return 0;
+       return ret;
 }
 
 static int __maybe_unused cppi41_runtime_resume(struct device *dev)
@@ -1163,6 +1163,8 @@ static int __maybe_unused cppi41_runtime_resume(struct device *dev)
        unsigned long flags;
 
        spin_lock_irqsave(&cdd->lock, flags);
+       cdd->is_suspended = 0;
+
        list_for_each_entry_safe(c, _c, &cdd->pending, node) {
                push_desc_queue(c);
                list_del(&c->node);


-- 
regards,
-grygorii
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                     ` <aafe7afb-6d8a-2fef-acdd-bd331151d16a-l0cyMroinI0@public.gmane.org>
@ 2017-01-13  0:03                       ` Tony Lindgren
       [not found]                         ` <20170113000314.GU2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Tony Lindgren @ 2017-01-13  0:03 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170112 15:43]:
> @@ -457,38 +449,36 @@ static void push_desc_queue(struct cppi41_channel *c)
>         cppi_writel(reg, cdd->qmgr_mem + QMGR_QUEUE_D(c->q_num));
>  }
>  
> -static void pending_desc(struct cppi41_channel *c)
> +static void cppi41_dma_issue_pending(struct dma_chan *chan)
>  {
> +       struct cppi41_channel *c = to_cpp41_chan(chan);
>         struct cppi41_dd *cdd = c->cdd;
> +       int error;
> +       struct cppi41_channel *_c;
>         unsigned long flags;
>  
>         spin_lock_irqsave(&cdd->lock, flags);
>         list_add_tail(&c->node, &cdd->pending);
> -       spin_unlock_irqrestore(&cdd->lock, flags);
> -}
> -
> -static void cppi41_dma_issue_pending(struct dma_chan *chan)
> -{
> -       struct cppi41_channel *c = to_cpp41_chan(chan);
> -       struct cppi41_dd *cdd = c->cdd;
> -       int error;
>  
>         error = pm_runtime_get(cdd->ddev.dev);
> -       if ((error != -EINPROGRESS) && error < 0) {
> +       if (error < 0) {
>                 pm_runtime_put_noidle(cdd->ddev.dev);
>                 dev_err(cdd->ddev.dev, "Failed to pm_runtime_get: %i\n",
>                         error);
> -
> +               spin_unlock_irqrestore(&cdd->lock, flags);
>                 return;
>         }
>  
> -       if (likely(pm_runtime_active(cdd->ddev.dev)))
> -               push_desc_queue(c);
> -       else
> -               pending_desc(c);
> +       if (!cdd->is_suspended) {
> +               list_for_each_entry_safe(c, _c, &cdd->pending, node) {
> +                       push_desc_queue(c);
> +                       list_del(&c->node);
> +               };
> +       }
>  
>         pm_runtime_mark_last_busy(cdd->ddev.dev);
>         pm_runtime_put_autosuspend(cdd->ddev.dev);
> +       spin_lock_irqsave(&cdd->lock, flags);
>  }

So always add to the queue no matter, then always flush the queue
directly if active? Yeah that might work :)

Don't we need spinlock in the list_for_each_entry_safe() to prevent
it racing with runtime_resume() though?

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found] ` <20170112213016.19367-1-tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  2017-01-12 21:53   ` Grygorii Strashko
@ 2017-01-13  9:24   ` Sergei Shtylyov
       [not found]     ` <17c98bce-b8fd-33d9-f9f1-65f114306eed-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
  1 sibling, 1 reply; 13+ messages in thread
From: Sergei Shtylyov @ 2017-01-13  9:24 UTC (permalink / raw)
  To: Tony Lindgren, Dan Williams, Vinod Koul
  Cc: Bin Liu, Daniel Mack, Felipe Balbi, George Cherian, Johan Hovold,
	Peter Ujfalusi, Sekhar Nori, Sebastian Andrzej Siewior,
	dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko,
	Grygorii Strashko, Kevin Hilman, Patrick Titiano

Hello!

On 1/13/2017 12:30 AM, Tony Lindgren wrote:

> Commit fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
> together with recent MUSB changes allowed USB and DMA on BeagleBone to idle
> when no cable is connected. But looks like few corner case issues still
> remain.
>
> Looks like just by repluging USB cable about ten or so times on BeagleBone

    Re-plugging.

> when configured in USB peripheral mode we can get warnings and eventually
> trigger an oops in cppi41 DMA:
>
> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+
> x28/0x38 [cppi41]
> ...
>
> WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452
> push_desc_queue+0x94/0x9c [cppi41]
> ...
>
> Unable to handle kernel NULL pointer dereference at virtual
> address 00000104
> pgd = c0004000
> [00000104] *pgd=00000000
> Internal error: Oops: 805 [#1] SMP ARM
> ...
> [<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>]
> (__rpm_callback+0xc0/0x214)
> [<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80)
> [<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c)
> [<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8)
> [<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808)
>
> This is because of a race with PM runtime and cppi41_dma_issue_pending()
> as reported by Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> in earlier
> set of patches. Based on mailing list discussions we however came to the
> conclusion that a different fix from Alexandre's fix is needed in order
> to guarantee that DMA is really active when we try to use it.
>
> To fix the issue, we need to add a driver specific flag as we otherwise
> can have -EINPROGRESS state set by PM runtime and can't rely on

     Runtime PM.

> pm_runtime_active() to tell us when we can use the DMA.
>
> For reference, this is also documented in Documentation/power/runtime_pm.txt
> in the example at the end of the file as pointed out by Grygorii Strashko
> <grygorii.strashko-l0cyMroinI0@public.gmane.org>.
>
> So let's fix the issue by introducing atomic_t active that gets toggled
> in runtime_suspend() and runtime_resume(). And then let's replace the
> pm_runtime_active() checks with atomic_read().
>
> Note that we may also get transactions queued while in -EINPROGRESS
> state. So we need to check for new elements in cppi41_runtime_resume()
> by replacing list_for_each_entry_safe() with the commonly used test for
> while (!list_empty(&cdd->pending)).
>
> Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support")
> Cc: Andy Shevchenko <andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Bin Liu <b-liu-l0cyMroinI0@public.gmane.org>
> Cc: Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org>
> Cc: Kevin Hilman <khilman-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> Cc: Patrick Titiano <ptitiano-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> Cc: Sergei Shtylyov <sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
> Reported-by: Alexandre Bailon <abailon-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
> ---
>  drivers/dma/cppi41.c | 23 ++++++++++++++++++-----
>  1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
> --- a/drivers/dma/cppi41.c
> +++ b/drivers/dma/cppi41.c
[...]
> @@ -320,7 +323,8 @@ static irqreturn_t cppi41_irq(int irq, void *data)
>  			int error;
>
>  			error = pm_runtime_get(cdd->ddev.dev);
> -			if (error < 0)
> +			if (((error != -EINPROGRESS) && error < 0) ||

    Hum, the innermost parens are inconsistent: around != but not around <.

[...]

MBR, Sergei

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                         ` <20170113000314.GU2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-13 16:17                           ` Tony Lindgren
       [not found]                             ` <20170113161731.GV2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Tony Lindgren @ 2017-01-13 16:17 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org> [170112 16:04]:
> * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170112 15:43]:
> > @@ -457,38 +449,36 @@ static void push_desc_queue(struct cppi41_channel *c)
> >         cppi_writel(reg, cdd->qmgr_mem + QMGR_QUEUE_D(c->q_num));
> >  }
> >  
> > -static void pending_desc(struct cppi41_channel *c)
> > +static void cppi41_dma_issue_pending(struct dma_chan *chan)
> >  {
> > +       struct cppi41_channel *c = to_cpp41_chan(chan);
> >         struct cppi41_dd *cdd = c->cdd;
> > +       int error;
> > +       struct cppi41_channel *_c;
> >         unsigned long flags;
> >  
> >         spin_lock_irqsave(&cdd->lock, flags);
> >         list_add_tail(&c->node, &cdd->pending);
> > -       spin_unlock_irqrestore(&cdd->lock, flags);
> > -}
> > -
> > -static void cppi41_dma_issue_pending(struct dma_chan *chan)
> > -{
> > -       struct cppi41_channel *c = to_cpp41_chan(chan);
> > -       struct cppi41_dd *cdd = c->cdd;
> > -       int error;
> >  
> >         error = pm_runtime_get(cdd->ddev.dev);
> > -       if ((error != -EINPROGRESS) && error < 0) {
> > +       if (error < 0) {
> >                 pm_runtime_put_noidle(cdd->ddev.dev);
> >                 dev_err(cdd->ddev.dev, "Failed to pm_runtime_get: %i\n",
> >                         error);
> > -
> > +               spin_unlock_irqrestore(&cdd->lock, flags);
> >                 return;
> >         }
> >  
> > -       if (likely(pm_runtime_active(cdd->ddev.dev)))
> > -               push_desc_queue(c);
> > -       else
> > -               pending_desc(c);
> > +       if (!cdd->is_suspended) {
> > +               list_for_each_entry_safe(c, _c, &cdd->pending, node) {
> > +                       push_desc_queue(c);
> > +                       list_del(&c->node);
> > +               };
> > +       }
> >  
> >         pm_runtime_mark_last_busy(cdd->ddev.dev);
> >         pm_runtime_put_autosuspend(cdd->ddev.dev);
> > +       spin_lock_irqsave(&cdd->lock, flags);
> >  }
> 
> So always add to the queue no matter, then always flush the queue
> directly if active? Yeah that might work :)
> 
> Don't we need spinlock in the list_for_each_entry_safe() to prevent
> it racing with runtime_resume() though?

I could not apply for me as looks like your mail server replaced tabs
with spaces it seems :(

But anyways here's your basic idea plugged into my recent patch and
it seems to work. I maybe have missed something though while reading
so please check.

The pm_runtime_get/mark_last_busy/put_autosuspend and WARN_ON() we
want to keep in cppi41_irq() at least for now :)

And I'm thinking we must also callcppi41_run_queue() with spinlock
held to prevent out of order triggering of the DMA transfers.

Does this look OK to you?

Regards,

Tony

8< -----------------------
diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
index d5ba43a87a68..6ee593eb2acb 100644
--- a/drivers/dma/cppi41.c
+++ b/drivers/dma/cppi41.c
@@ -153,6 +153,8 @@ struct cppi41_dd {
 
 	/* context for suspend/resume */
 	unsigned int dma_tdfdq;
+
+	bool is_suspended;
 };
 
 #define FIST_COMPLETION_QUEUE	93
@@ -316,11 +318,12 @@ static irqreturn_t cppi41_irq(int irq, void *data)
 			__iormb();
 
 		while (val) {
+			unsigned long flags;
 			u32 desc, len;
 			int error;
 
 			error = pm_runtime_get(cdd->ddev.dev);
-			if (error < 0)
+			if ((error != -EINPROGRESS) && error < 0)
 				dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
 					__func__, error);
 
@@ -340,6 +343,11 @@ static irqreturn_t cppi41_irq(int irq, void *data)
 			else
 				len = pd_trans_len(c->desc->pd0);
 
+			/* This warning should never trigger */
+			spin_lock_irqsave(&cdd->lock, flags);
+			WARN_ON(cdd->is_suspended);
+			spin_unlock_irqrestore(&cdd->lock, flags);
+
 			c->residue = pd_trans_len(c->desc->pd6) - len;
 			dma_cookie_complete(&c->txd);
 			dmaengine_desc_get_callback_invoke(&c->txd, NULL);
@@ -457,20 +465,26 @@ static void push_desc_queue(struct cppi41_channel *c)
 	cppi_writel(reg, cdd->qmgr_mem + QMGR_QUEUE_D(c->q_num));
 }
 
-static void pending_desc(struct cppi41_channel *c)
+/*
+ * Caller must hold cdd->lock to prevent push_desc_queue()
+ * getting called out of order. We have both cppi41_dma_issue_pending()
+ * and cppi41_runtime_resume() call this function.
+ */
+static void cppi41_run_queue(struct cppi41_dd *cdd)
 {
-	struct cppi41_dd *cdd = c->cdd;
-	unsigned long flags;
+	struct cppi41_channel *c, *_c;
 
-	spin_lock_irqsave(&cdd->lock, flags);
-	list_add_tail(&c->node, &cdd->pending);
-	spin_unlock_irqrestore(&cdd->lock, flags);
+	list_for_each_entry_safe(c, _c, &cdd->pending, node) {
+		push_desc_queue(c);
+		list_del(&c->node);
+	}
 }
 
 static void cppi41_dma_issue_pending(struct dma_chan *chan)
 {
 	struct cppi41_channel *c = to_cpp41_chan(chan);
 	struct cppi41_dd *cdd = c->cdd;
+	unsigned long flags;
 	int error;
 
 	error = pm_runtime_get(cdd->ddev.dev);
@@ -482,10 +496,11 @@ static void cppi41_dma_issue_pending(struct dma_chan *chan)
 		return;
 	}
 
-	if (likely(pm_runtime_active(cdd->ddev.dev)))
-		push_desc_queue(c);
-	else
-		pending_desc(c);
+	spin_lock_irqsave(&cdd->lock, flags);
+	list_add_tail(&c->node, &cdd->pending);
+	if (!cdd->is_suspended)
+		cppi41_run_queue(cdd);
+	spin_unlock_irqrestore(&cdd->lock, flags);
 
 	pm_runtime_mark_last_busy(cdd->ddev.dev);
 	pm_runtime_put_autosuspend(cdd->ddev.dev);
@@ -1150,6 +1165,11 @@ static int __maybe_unused cppi41_resume(struct device *dev)
 static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
 {
 	struct cppi41_dd *cdd = dev_get_drvdata(dev);
+	unsigned long flags;
+
+	spin_lock_irqsave(&cdd->lock, flags);
+	cdd->is_suspended = true;
+	spin_unlock_irqrestore(&cdd->lock, flags);
 
 	WARN_ON(!list_empty(&cdd->pending));
 
@@ -1159,14 +1179,11 @@ static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
 static int __maybe_unused cppi41_runtime_resume(struct device *dev)
 {
 	struct cppi41_dd *cdd = dev_get_drvdata(dev);
-	struct cppi41_channel *c, *_c;
 	unsigned long flags;
 
 	spin_lock_irqsave(&cdd->lock, flags);
-	list_for_each_entry_safe(c, _c, &cdd->pending, node) {
-		push_desc_queue(c);
-		list_del(&c->node);
-	}
+	cdd->is_suspended = false;
+	cppi41_run_queue(cdd);
 	spin_unlock_irqrestore(&cdd->lock, flags);
 
 	return 0;
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]     ` <17c98bce-b8fd-33d9-f9f1-65f114306eed-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
@ 2017-01-13 16:19       ` Tony Lindgren
  0 siblings, 0 replies; 13+ messages in thread
From: Tony Lindgren @ 2017-01-13 16:19 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko,
	Grygorii Strashko, Kevin Hilman, Patrick Titiano

* Sergei Shtylyov <sergei.shtylyov-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org> [170113 01:25]:
> > @@ -320,7 +323,8 @@ static irqreturn_t cppi41_irq(int irq, void *data)
> >  			int error;
> > 
> >  			error = pm_runtime_get(cdd->ddev.dev);
> > -			if (error < 0)
> > +			if (((error != -EINPROGRESS) && error < 0) ||
> 
>    Hum, the innermost parens are inconsistent: around != but not around <.

Oops just noticed I missed this one in the latest version, will
update for the next revision. The typos I already fixed.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                             ` <20170113161731.GV2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-13 16:44                               ` Tony Lindgren
  2017-01-13 17:37                               ` Grygorii Strashko
  1 sibling, 0 replies; 13+ messages in thread
From: Tony Lindgren @ 2017-01-13 16:44 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org> [170113 08:27]:
> @@ -316,11 +318,12 @@ static irqreturn_t cppi41_irq(int irq, void *data)
>  			__iormb();
>  
>  		while (val) {
> +			unsigned long flags;
>  			u32 desc, len;
>  			int error;
>  
>  			error = pm_runtime_get(cdd->ddev.dev);
> -			if (error < 0)
> +			if ((error != -EINPROGRESS) && error < 0)
>  				dev_err(cdd->ddev.dev, "%s pm runtime get: %i\n",
>  					__func__, error);
>  
> @@ -340,6 +343,11 @@ static irqreturn_t cppi41_irq(int irq, void *data)
>  			else
>  				len = pd_trans_len(c->desc->pd0);
>  
> +			/* This warning should never trigger */
> +			spin_lock_irqsave(&cdd->lock, flags);
> +			WARN_ON(cdd->is_suspended);
> +			spin_unlock_irqrestore(&cdd->lock, flags);
> +
>  			c->residue = pd_trans_len(c->desc->pd6) - len;
>  			dma_cookie_complete(&c->txd);
>  			dmaengine_desc_get_callback_invoke(&c->txd, NULL);

Hmm this check needs to be before cppi41_pop_desc() already as that does
cppi_readl(). And the spinlocks here don't help anything, we just want
to see a warning if we hit a bug somewhere else.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                             ` <20170113161731.GV2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
  2017-01-13 16:44                               ` Tony Lindgren
@ 2017-01-13 17:37                               ` Grygorii Strashko
       [not found]                                 ` <ba31f412-1b92-c4db-a1f4-82448e5ecf18-l0cyMroinI0@public.gmane.org>
  1 sibling, 1 reply; 13+ messages in thread
From: Grygorii Strashko @ 2017-01-13 17:37 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov



On 01/13/2017 10:17 AM, Tony Lindgren wrote:
> * Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org> [170112 16:04]:
>> * Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170112 15:43]:
>>> @@ -457,38 +449,36 @@ static void push_desc_queue(struct cppi41_channel *c)
>>>         cppi_writel(reg, cdd->qmgr_mem + QMGR_QUEUE_D(c->q_num));
>>>  }
>>>  
>>> -static void pending_desc(struct cppi41_channel *c)
>>> +static void cppi41_dma_issue_pending(struct dma_chan *chan)
>>>  {
>>> +       struct cppi41_channel *c = to_cpp41_chan(chan);
>>>         struct cppi41_dd *cdd = c->cdd;
>>> +       int error;
>>> +       struct cppi41_channel *_c;
>>>         unsigned long flags;
>>>  
>>>         spin_lock_irqsave(&cdd->lock, flags);
>>>         list_add_tail(&c->node, &cdd->pending);
>>> -       spin_unlock_irqrestore(&cdd->lock, flags);
>>> -}
>>> -
>>> -static void cppi41_dma_issue_pending(struct dma_chan *chan)
>>> -{
>>> -       struct cppi41_channel *c = to_cpp41_chan(chan);
>>> -       struct cppi41_dd *cdd = c->cdd;
>>> -       int error;
>>>  
>>>         error = pm_runtime_get(cdd->ddev.dev);
>>> -       if ((error != -EINPROGRESS) && error < 0) {
>>> +       if (error < 0) {
>>>                 pm_runtime_put_noidle(cdd->ddev.dev);
>>>                 dev_err(cdd->ddev.dev, "Failed to pm_runtime_get: %i\n",
>>>                         error);
>>> -
>>> +               spin_unlock_irqrestore(&cdd->lock, flags);
>>>                 return;
>>>         }
>>>  
>>> -       if (likely(pm_runtime_active(cdd->ddev.dev)))
>>> -               push_desc_queue(c);
>>> -       else
>>> -               pending_desc(c);
>>> +       if (!cdd->is_suspended) {
>>> +               list_for_each_entry_safe(c, _c, &cdd->pending, node) {
>>> +                       push_desc_queue(c);
>>> +                       list_del(&c->node);
>>> +               };
>>> +       }
>>>  
>>>         pm_runtime_mark_last_busy(cdd->ddev.dev);
>>>         pm_runtime_put_autosuspend(cdd->ddev.dev);
>>> +       spin_lock_irqsave(&cdd->lock, flags);
>>>  }
>>
>> So always add to the queue no matter, then always flush the queue
>> directly if active? Yeah that might work :)
>>
>> Don't we need spinlock in the list_for_each_entry_safe() to prevent
>> it racing with runtime_resume() though?
> 
> I could not apply for me as looks like your mail server replaced tabs
> with spaces it seems :(

Sorry.

> 
> But anyways here's your basic idea plugged into my recent patch and
> it seems to work. I maybe have missed something though while reading
> so please check.
> 
> The pm_runtime_get/mark_last_busy/put_autosuspend and WARN_ON() we
> want to keep in cppi41_irq() at least for now :)

As per my understanding and testing it looks like might be enough to
have just pm_runtime_mark_last_busy(cdd->ddev.dev); in cppi41_irq().
But it can be left as is and yes - over all idea is that irq should
not be triggered if device is Idle.

> 
> And I'm thinking we must also callcppi41_run_queue() with spinlock
> held to prevent out of order triggering of the DMA transfers.
> 
> Does this look OK to you?
> 

I think yes. My current version is mostly similar to yours.

> 
> 8< -----------------------
> diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
> index d5ba43a87a68..6ee593eb2acb 100644
> --- a/drivers/dma/cppi41.c
> +++ b/drivers/dma/cppi41.c
> @@ -153,6 +153,8 @@ struct cppi41_dd {
>  
>  	/* context for suspend/resume */
>  	unsigned int dma_tdfdq;
> +
> +	bool is_suspended;
>  };
>  
>  #define FIST_COMPLETION_QUEUE	93
> @@ -316,11 +318,12 @@ static irqreturn_t cppi41_irq(int irq, void *data)

[..]

>  
>  	pm_runtime_mark_last_busy(cdd->ddev.dev);
>  	pm_runtime_put_autosuspend(cdd->ddev.dev);
> @@ -1150,6 +1165,11 @@ static int __maybe_unused cppi41_resume(struct device *dev)
>  static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
>  {
>  	struct cppi41_dd *cdd = dev_get_drvdata(dev);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&cdd->lock, flags);
> +	cdd->is_suspended = true;
> +	spin_unlock_irqrestore(&cdd->lock, flags);
>  
>  	WARN_ON(!list_empty(&cdd->pending));

Shouldn't we check list_empty() under spin lock?

>  
> @@ -1159,14 +1179,11 @@ static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
>  static int __maybe_unused cppi41_runtime_resume(struct device *dev)
>  {
>  	struct cppi41_dd *cdd = dev_get_drvdata(dev);
> -	struct cppi41_channel *c, *_c;
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&cdd->lock, flags);
> -	list_for_each_entry_safe(c, _c, &cdd->pending, node) {
> -		push_desc_queue(c);
> -		list_del(&c->node);
> -	}
> +	cdd->is_suspended = false;
> +	cppi41_run_queue(cdd);
>  	spin_unlock_irqrestore(&cdd->lock, flags);
>  
>  	return 0;
> 

-- 
regards,
-grygorii
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume
       [not found]                                 ` <ba31f412-1b92-c4db-a1f4-82448e5ecf18-l0cyMroinI0@public.gmane.org>
@ 2017-01-13 17:50                                   ` Tony Lindgren
  0 siblings, 0 replies; 13+ messages in thread
From: Tony Lindgren @ 2017-01-13 17:50 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Dan Williams, Vinod Koul, Bin Liu, Daniel Mack, Felipe Balbi,
	George Cherian, Johan Hovold, Peter Ujfalusi, Sekhar Nori,
	Sebastian Andrzej Siewior, dmaengine-u79uwXL29TY76Z2rM5mHXA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-omap-u79uwXL29TY76Z2rM5mHXA, Andy Shevchenko, Kevin Hilman,
	Patrick Titiano, Sergei Shtylyov

* Grygorii Strashko <grygorii.strashko-l0cyMroinI0@public.gmane.org> [170113 09:37]:
> On 01/13/2017 10:17 AM, Tony Lindgren wrote:
> > But anyways here's your basic idea plugged into my recent patch and
> > it seems to work. I maybe have missed something though while reading
> > so please check.
> > 
> > The pm_runtime_get/mark_last_busy/put_autosuspend and WARN_ON() we
> > want to keep in cppi41_irq() at least for now :)
> 
> As per my understanding and testing it looks like might be enough to
> have just pm_runtime_mark_last_busy(cdd->ddev.dev); in cppi41_irq().
> But it can be left as is and yes - over all idea is that irq should
> not be triggered if device is Idle.

OK yeah kicking the autoidle timeout is needed here.

> > And I'm thinking we must also callcppi41_run_queue() with spinlock
> > held to prevent out of order triggering of the DMA transfers.
> > 
> > Does this look OK to you?
> 
> I think yes. My current version is mostly similar to yours.

OK will update description and repost shortly.

> > @@ -1150,6 +1165,11 @@ static int __maybe_unused cppi41_resume(struct device *dev)
> >  static int __maybe_unused cppi41_runtime_suspend(struct device *dev)
> >  {
> >  	struct cppi41_dd *cdd = dev_get_drvdata(dev);
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&cdd->lock, flags);
> > +	cdd->is_suspended = true;
> > +	spin_unlock_irqrestore(&cdd->lock, flags);
> >  
> >  	WARN_ON(!list_empty(&cdd->pending));
> 
> Shouldn't we check list_empty() under spin lock?

Yeah let's do that.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-01-13 17:50 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-12 21:30 [PATCH] dmaengine: cppi41: Fix oops in cppi41_runtime_resume Tony Lindgren
     [not found] ` <20170112213016.19367-1-tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-12 21:53   ` Grygorii Strashko
     [not found]     ` <927792da-2e90-b2ae-1206-8fcb504d7551-l0cyMroinI0@public.gmane.org>
2017-01-12 22:19       ` Tony Lindgren
     [not found]         ` <20170112221933.GM2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-12 22:52           ` Grygorii Strashko
     [not found]             ` <1c8967b7-d59b-e53d-feeb-80c71464fb94-l0cyMroinI0@public.gmane.org>
2017-01-12 23:04               ` Tony Lindgren
     [not found]                 ` <20170112230456.GS2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-12 23:42                   ` Grygorii Strashko
     [not found]                     ` <aafe7afb-6d8a-2fef-acdd-bd331151d16a-l0cyMroinI0@public.gmane.org>
2017-01-13  0:03                       ` Tony Lindgren
     [not found]                         ` <20170113000314.GU2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-13 16:17                           ` Tony Lindgren
     [not found]                             ` <20170113161731.GV2630-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2017-01-13 16:44                               ` Tony Lindgren
2017-01-13 17:37                               ` Grygorii Strashko
     [not found]                                 ` <ba31f412-1b92-c4db-a1f4-82448e5ecf18-l0cyMroinI0@public.gmane.org>
2017-01-13 17:50                                   ` Tony Lindgren
2017-01-13  9:24   ` Sergei Shtylyov
     [not found]     ` <17c98bce-b8fd-33d9-f9f1-65f114306eed-M4DtvfQ/ZS1MRgGoP+s0PdBPR1lH4CV8@public.gmane.org>
2017-01-13 16:19       ` Tony Lindgren

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.