Bug in disk event polling

From: Alan Stern @ 2012-02-10 20:31 UTC
  To: Tejun Heo, Jens Axboe; +Cc: Linux-pm mailing list, Kernel development list

Tejun:

Don't ask me why this hasn't shown up earlier...  There's a big fat bug 
in the implementation of disk event polling.

The polling is done using the system_nrt_wq work queue, which isn't
freezable.  As a result, polling continues while the system is
preparing for suspend or hibernation.

Obviously, I/O to suspended devices doesn't work well.  Somewhat less
obviously, error recovery for the failed I/O attempts can interfere
with normal system resume.

You can see this for yourself easily enough by suspending or
hibernating while a USB flash drive is plugged in.  You don't even need
to go through the full suspend procedure; the first two stages are
enough (echo devices >/sys/power/pm_test).  Check the system log
afterward; most likely you'll find the flash drive got errors and had
to be unregistered and re-enumerated.
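
For anyone who wants to script the test, here is a rough sketch.  It
assumes a kernel built with CONFIG_PM_DEBUG (otherwise pm_test does not
exist) and has to be run as root.  PM_DIR and pm_test_devices are names
made up for this illustration; only the files under /sys/power are
real.

```shell
#!/bin/sh
# Sketch only: run just the first two suspend stages (freeze tasks,
# suspend devices), then let the kernel resume on its own.
# PM_DIR is a hypothetical override used for illustration; the real
# control files live in /sys/power.
PM_DIR="${PM_DIR:-/sys/power}"

pm_test_devices() {
    # Stop the suspend sequence after the "devices" test level.
    echo devices > "$PM_DIR/pm_test" || return 1
    # Start the truncated suspend cycle.
    echo mem > "$PM_DIR/state" || return 1
    # Put pm_test back so later suspends run the full sequence.
    echo none > "$PM_DIR/pm_test"
}
```

Plug in the flash drive, call pm_test_devices, and read through the
dmesg output once the system is back.  Writing "disk" instead of "mem"
should exercise the hibernation path the same way.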

I have verified that changing all occurrences of system_nrt_wq in
block/genhd.c to system_freezable_wq fixes the bug.  However, this may
not be the way you want to solve it; you may prefer to have a freezable
non-reentrant work queue.
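
The substitution itself is mechanical.  A minimal sketch of the
experiment, not a proposed patch: switch_to_freezable_wq is a
hypothetical helper name, and the plain pattern assumes no longer
identifier in the file contains system_nrt_wq as a substring.

```shell
#!/bin/sh
# Sketch only: rewrite every use of system_nrt_wq in one source file
# to system_freezable_wq.  The helper name is made up for illustration.
switch_to_freezable_wq() {
    src="$1"                      # e.g. block/genhd.c in a kernel tree
    tmp="$src.tmp.$$"
    # Write to a temporary file first so a failed sed leaves the
    # original untouched.
    sed 's/system_nrt_wq/system_freezable_wq/g' "$src" > "$tmp" &&
        mv "$tmp" "$src"
}
```

Run it as "switch_to_freezable_wq block/genhd.c" from the top of the
tree and rebuild.  Note that this gives up the non-reentrancy of
system_nrt_wq, which is the reason a dedicated freezable non-reentrant
work queue may be the better fix.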

Alan Stern
