From: Vinod Koul <vinod.koul@intel.com>
To: Will Deacon <will.deacon@arm.com>
Cc: "djbw@fb.com" <djbw@fb.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"andriy.shevchenko@linux.intel.com" 
	<andriy.shevchenko@linux.jf.intel.com>,
	"viresh.kumar@linaro.org" <viresh.kumar@linaro.org>
Subject: Re: dmatest regression in 3.10-rc1
Date: Fri, 17 May 2013 18:04:23 +0530
Message-ID: <20130517123423.GR14863@intel.com>
In-Reply-To: <20130516153553.GI11706@mudshark.cambridge.arm.com>

On Thu, May 16, 2013 at 04:35:53PM +0100, Will Deacon wrote:
> On Wed, May 15, 2013 at 04:28:03PM +0100, Will Deacon wrote:
> > I've been observing a regression in the dmatest module with 3.10-rc1. It
> > manifests as either:
> > 
> >  - a spurious timeout on one or more of the channel threads
> >  - a complete kernel lockup (loss of console)
> >  - a panic (see below, noting that the callback [dmatest_callback] is
> >    dereferencing a NULL pointer)
> > 
> > If I revert 77101ce578bb ("dmatest: cancel thread immediately when asked
> > for") then things are rosy again, but I'm not sure if this is hiding another
> > problem.
> 
> Right, so I think I understand what's causing this, but I'll leave it to
> Andriy to suggest a fix. The problem comes about because the dmatest
> module is now driven from debugfs, making it possible to unload the module
> whilst a test run is in progress. In this case:
> 
> 	- The DMA threads will return from wait_event_freezable_timeout(...)
> 	  due to kthread_should_stop() returning true, and subsequently
> 	  report failure because done.done is false.
> 
> 	- The DMA engines may not be idle, so the asynchronous callback can
> 	  be invoked after we've started cleaning up, explaining the NULL
> 	  dereference I'm seeing.
> 
> The solutions are either fixing the module exit code to cope with concurrent
> DMA transfers or to revert 77101ce578bb and not allow the channel threads to
> return mid-transfer.
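
We need to properly abort the channels on removal. The code already does this,
but kthread_stop() is called before the transfers are terminated, so a
completion callback can still fire while the threads are being torn down. It
should be the other way round: terminate the transfers first, then stop the
threads. Can you try the patch below?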

---

diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
index d8ce4ec..4e8d581 100644
--- a/drivers/dma/dmatest.c
+++ b/drivers/dma/dmatest.c
@@ -822,6 +822,9 @@ static void dmatest_cleanup_channel(struct dmatest_chan *dtc)
 	struct dmatest_thread	*_thread;
 	int			ret;
 
+	/* terminate all transfers on specified channels */
+	dmaengine_terminate_all(dtc->chan);
+
 	list_for_each_entry_safe(thread, _thread, &dtc->threads, node) {
 		ret = kthread_stop(thread->task);
 		pr_debug("dmatest: thread %s exited with status %d\n",
@@ -830,9 +833,6 @@ static void dmatest_cleanup_channel(struct dmatest_chan *dtc)
 		kfree(thread);
 	}
 
-	/* terminate all transfers on specified channels */
-	dmaengine_terminate_all(dtc->chan);
-
 	kfree(dtc);
 }
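
To illustrate the ordering argument only, here is a small userspace model. It
is a sketch under assumptions, not dmatest code: every name in it (fake_chan,
fake_thread, fake_terminate_all, fake_stop_and_free, fake_hw_complete,
cleanup_channel) is hypothetical, and the "engine" is assumed to complete its
one in-flight transfer between the two cleanup steps. It does not model a
callback that is already running when the terminate is issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread state that the completion callback touches. */
struct fake_thread {
	bool alive;	/* false once the cleanup path has "freed" it */
	bool done;	/* set by the callback, like done.done in dmatest */
};

/* Hypothetical channel with at most one transfer in flight. */
struct fake_chan {
	struct fake_thread *pending;	/* NULL when the channel is idle */
	int late_callbacks;		/* callbacks that hit released state */
};

/* Stand-in for dmaengine_terminate_all(): drop the in-flight transfer. */
static void fake_terminate_all(struct fake_chan *chan)
{
	chan->pending = NULL;
}

/* Stand-in for kthread_stop() followed by kfree(thread). */
static void fake_stop_and_free(struct fake_thread *thread)
{
	thread->alive = false;
}

/* Stand-in for the engine finishing a transfer and invoking the callback. */
static void fake_hw_complete(struct fake_chan *chan)
{
	struct fake_thread *thread = chan->pending;

	if (!thread)
		return;		/* nothing in flight, so no callback */
	chan->pending = NULL;
	if (!thread->alive) {
		chan->late_callbacks++;	/* would be a use-after-free */
		return;
	}
	thread->done = true;
}

/*
 * Run the cleanup in one of the two orders, with the engine completing
 * its transfer between the two steps. Returns the number of callbacks
 * that ran against already-released thread state.
 */
static int cleanup_channel(bool terminate_first)
{
	struct fake_thread thread = { .alive = true, .done = false };
	struct fake_chan chan = { .pending = &thread, .late_callbacks = 0 };

	if (terminate_first) {
		fake_terminate_all(&chan);
		fake_hw_complete(&chan);
		fake_stop_and_free(&thread);
	} else {
		fake_stop_and_free(&thread);
		fake_hw_complete(&chan);
		fake_terminate_all(&chan);
	}
	assert(chan.pending == NULL);	/* both orders end with an idle channel */
	return chan.late_callbacks;
}
```

Calling cleanup_channel(false) models the current order (stop the threads, then
terminate) and counts one late callback; cleanup_channel(true) models the
patched order and counts none.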

--
~Vinod 
