linux-kernel.vger.kernel.org archive mirror
* 4.6-rc1 regression in SPI core -- deadlock
@ 2016-04-04  1:20 Rich Felker
  2016-04-04  3:53 ` Vignesh R
  0 siblings, 1 reply; 3+ messages in thread
From: Rich Felker @ 2016-04-04  1:20 UTC (permalink / raw)
  To: linux-spi; +Cc: linux-kernel, Jon Hunter, Vignesh R, Mark Brown

I've spent several days debugging a deadlock with our local (not yet
ready for upstream) driver for the J-Core SPI device. It appears to be
a new deadlock in the SPI core, introduced by commit 556351f14e74 and
unrelated to the particular driver. Commit 49023d2e4ead tried to solve
a related deadlock, but there still seems to be a lock-ordering issue,
and it affects SPI use even without the spi_flash_read optimization.
In the deadlock I'm observing, a kworker thread is stuck in
wait_for_completion, called from spi_sync_locked (ultimately from
mmc_rescan). The completion never fires because this kworker thread
holds the bus lock, while the spi master task has already started
processing the queue but can't proceed because the bus lock is taken.
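
To make the dependency described above concrete, here is a minimal
single-threaded toy model in C. It is not the SPI core code: toy_lock,
toy_bus, toy_pump and toy_sync are hypothetical names, the lock is a
plain flag, and a try-lock stands in for the blocking lock so that the
sketch reports the stall instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not the kernel's. */
struct toy_lock {
	bool held;
};

struct toy_bus {
	struct toy_lock bus_lock; /* stands in for the SPI bus lock */
	bool queued;              /* a message is waiting in the queue */
	bool done;                /* stands in for the completion */
};

static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Queue pump: must take the bus lock before it may finish a message. */
static bool toy_pump(struct toy_bus *b)
{
	if (!toy_trylock(&b->bus_lock))
		return false; /* a blocking lock would sleep here */
	if (b->queued) {
		b->queued = false;
		b->done = true;
	}
	toy_unlock(&b->bus_lock);
	return true;
}

/*
 * Caller side: optionally take the bus lock, queue one message, then
 * give the pump one chance to run. Returns whether the completion
 * fired. A real caller would sleep waiting for the completion instead
 * of returning false.
 */
static bool toy_sync(struct toy_bus *b, bool caller_holds_bus_lock)
{
	bool ok;

	if (caller_holds_bus_lock && !toy_trylock(&b->bus_lock))
		return false;
	b->queued = true;
	b->done = false;
	toy_pump(b);
	ok = b->done;
	if (caller_holds_bus_lock)
		toy_unlock(&b->bus_lock);
	return ok;
}
```

With the caller holding the bus lock, toy_sync() returns false and the
message stays queued; without it, the pump completes the message. In
the situation described above neither side returns: the waiter sleeps
on the completion and the pump sleeps on the lock.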

Anyone else seen this? Ideas for a proper fix? I've got it working for
me by disabling the bus locking in __spi_pump_messages entirely (like
before commit 556351f14e74) but I'm pretty sure this breaks the new
feature that was added.

Rich


* Re: 4.6-rc1 regression in SPI core -- deadlock
  2016-04-04  1:20 4.6-rc1 regression in SPI core -- deadlock Rich Felker
@ 2016-04-04  3:53 ` Vignesh R
  2016-04-04  7:04   ` Rich Felker
  0 siblings, 1 reply; 3+ messages in thread
From: Vignesh R @ 2016-04-04  3:53 UTC (permalink / raw)
  To: Rich Felker, linux-spi; +Cc: linux-kernel, Jon Hunter, Mark Brown



On 04/04/2016 06:50 AM, Rich Felker wrote:
> I've spent several days trying to debug a deadlock using our local
> (not yet ready for upstream) driver for the J-Core SPI device and it
> seems to be a new deadlock in the SPI core caused by commit
> 556351f14e74 and unrelated to the particular driver. Commit
> 49023d2e4ead tried to solve a related deadlock problem, but there
> still seems to be a lock order issue and it's affecting SPI use even
> without the spi_flash_read optimization. The deadlock I'm observing
> has a kworker thread stuck in wait_for_completion called from
> spi_sync_locked (ultimately from mmc_rescan) and the completion is
> never finishing because this kworker thread has the bus locked while
> the spi master task has already started processing the queue but can't
> proceed because the bus lock is taken.
> 
> Anyone else seen this? Ideas for a proper fix?

Could you try commit 24c8cd1b081286 ("spi: fix possible deadlock between
internal bus locks and bus_lock_flag") from linux-next?

-- 
Regards
Vignesh


* Re: 4.6-rc1 regression in SPI core -- deadlock
  2016-04-04  3:53 ` Vignesh R
@ 2016-04-04  7:04   ` Rich Felker
  0 siblings, 0 replies; 3+ messages in thread
From: Rich Felker @ 2016-04-04  7:04 UTC (permalink / raw)
  To: Vignesh R; +Cc: linux-spi, linux-kernel, Jon Hunter, Mark Brown

On Mon, Apr 04, 2016 at 09:23:31AM +0530, Vignesh R wrote:
> 
> 
> On 04/04/2016 06:50 AM, Rich Felker wrote:
> > I've spent several days trying to debug a deadlock using our local
> > (not yet ready for upstream) driver for the J-Core SPI device and it
> > seems to be a new deadlock in the SPI core caused by commit
> > 556351f14e74 and unrelated to the particular driver. Commit
> > 49023d2e4ead tried to solve a related deadlock problem, but there
> > still seems to be a lock order issue and it's affecting SPI use even
> > without the spi_flash_read optimization. The deadlock I'm observing
> > has a kworker thread stuck in wait_for_completion called from
> > spi_sync_locked (ultimately from mmc_rescan) and the completion is
> > never finishing because this kworker thread has the bus locked while
> > the spi master task has already started processing the queue but can't
> > proceed because the bus lock is taken.
> > 
> > Anyone else seen this? Ideas for a proper fix?
> 
> Could you try 24c8cd1b081286("spi: fix possible deadlock between
> internal bus locks and bus_lock_flag") from linux-next?

This seems to fix the problem. I'm still hitting an issue where
mmc_rescan deadlocks (at best) or crashes/corrupts the card state, but
I think that's a separate bug since it happened with my workaround
hack too. The patch in linux-next makes sense to me from what I've
read of the SPI core code so far. Thanks!

Rich

