linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Bug in disk event polling
@ 2012-02-10 20:31 Alan Stern
  2012-02-10 20:46 ` Tejun Heo
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Stern @ 2012-02-10 20:31 UTC (permalink / raw)
  To: Tejun Heo, Jens Axboe; +Cc: Linux-pm mailing list, Kernel development list

Tejun:

Don't ask me why this hasn't shown up earlier...  There's a big fat bug 
in the implementation of disk event polling.

The polling is done using the system_nrt_wq work queue, which isn't
freezable.  As a result, polling continues while the system is
preparing for suspend or hibernation.

Obviously I/O to suspended devices doesn't work well.  Somewhat less 
obviously, error recovery for the failed I/O attempts can interfere 
with normal system resume.

You can see this for yourself easily enough by suspending or
hibernating while a USB flash drive is plugged in.  You don't even need
to go through the full suspend procedure; the first two stages are
enough (echo devices >/sys/power/pm_test).  Check the system log
afterward; most likely you'll find the flash drive got errors and had
to be unregistered and re-enumerated.

I have verified that changing all occurrences of system_nrt_wq in 
block/genhd.c to system_freezable_wq fixes the bug.  However this may 
not be the way you want to solve it; you may prefer to have a freezable 
non-reentrant work queue.

Alan Stern


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bug in disk event polling
  2012-02-10 20:31 Bug in disk event polling Alan Stern
@ 2012-02-10 20:46 ` Tejun Heo
  2012-02-10 21:03   ` Alan Stern
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2012-02-10 20:46 UTC (permalink / raw)
  To: Alan Stern
  Cc: Jens Axboe, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list

(cc'ing Rafael)

Hello, Alan.

On Fri, Feb 10, 2012 at 03:31:20PM -0500, Alan Stern wrote:
> Don't ask me why this hasn't shown up earlier...  There's a big fat bug 
> in the implementation of disk event polling.
> 
> The polling is done using the system_nrt_wq work queue, which isn't
> freezable.  As a result, polling continues while the system is
> preparing for suspend or hibernation.
> 
> Obviously I/O to suspended devices doesn't work well.  Somewhat less 
> obviously, error recovery for the failed I/O attempts can interfere 
> with normal system resume.

Hmmm.... I see.  Yeah, that can be a problem.

> You can see this for yourself easily enough by suspending or
> hibernating while a USB flash drive is plugged in.  You don't even need
> to go through the full suspend procedure; the first two stages are
> enough (echo devices >/sys/power/pm_test).  Check the system log
> afterward; most likely you'll find the flash drive got errors and had
> to be unregistered and re-enumerated.

Do you happen to have log of such failure?  Polilng failure itself
shouldn't lead to such failure mode.

> I have verified that changing all occurrences of system_nrt_wq in 
> block/genhd.c to system_freezable_wq fixes the bug.  However this may 
> not be the way you want to solve it; you may prefer to have a freezable 
> non-reentrant work queue.

Please feel free to send out a patch to fix the issue. :)

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bug in disk event polling
  2012-02-10 20:46 ` Tejun Heo
@ 2012-02-10 21:03   ` Alan Stern
  2012-02-10 21:12     ` Tejun Heo
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Stern @ 2012-02-10 21:03 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list

On Fri, 10 Feb 2012, Tejun Heo wrote:

> > You can see this for yourself easily enough by suspending or
> > hibernating while a USB flash drive is plugged in.  You don't even need
> > to go through the full suspend procedure; the first two stages are
> > enough (echo devices >/sys/power/pm_test).  Check the system log
> > afterward; most likely you'll find the flash drive got errors and had
> > to be unregistered and re-enumerated.
> 
> Do you happen to have log of such failure?  Polilng failure itself
> shouldn't lead to such failure mode.

It does, because the lower-level driver (such as usb-storage) tries to
recover from the error and runs into problems when it's unable to
communicate with the suspended device.  Here's an example:

[ 2854.836894] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[ 2854.836903] PM: Basic memory bitmaps created
[ 2854.836906] PM: Syncing filesystems ... done.
[ 2854.841396] Freezing user space processes ... (elapsed 0.01 seconds) done.
[ 2854.856914] PM: Preallocating image memory... done (allocated 33805 pages)
[ 2854.973369] PM: Allocated 135220 kbytes in 0.11 seconds (1229.27 MB/s)
[ 2854.973442] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[ 2854.991081] hub 2-0:1.0: hub_suspend
[ 2854.991109] usb usb2: bus suspend, wakeup 0
[ 2854.991116] ehci_hcd 0000:00:1d.7: suspend root hub
[ 2854.994727] i8042 aux 00:0a: wake-up capability disabled by ACPI
[ 2854.996329] i8042 kbd 00:09: wake-up capability enabled by ACPI
[ 2855.012852] serial 00:05: disabled
[ 2855.028375] pci 0000:00:1e.0: wake-up capability enabled by ACPI
[ 2855.044195] PM: freeze of devices complete after 54.686 msecs
[ 2855.045721] PM: late freeze of devices complete after 1.448 msecs
[ 2855.045910] ACPI: Preparing to enter system sleep state S4
[ 2855.066804] PM: Saving platform NVS memory
[ 2855.080991] Disabling non-boot CPUs ...
[ 2855.081189] PM: Creating hibernation image:
[ 2855.084032] PM: Need to copy 33649 pages
[ 2855.084032] PM: Normal pages needed: 33649 + 1024, available pages: 159745
[ 2855.084032] PM: Restoring platform NVS memory
[ 2855.084032] Force enabled HPET at resume
[ 2855.088362] ACPI: Waking up from system sleep state S4
[ 2855.146474] pci 0000:00:1f.0: power state changed by ACPI to D0
[ 2855.146608] pci 0000:00:1f.0: power state changed by ACPI to D0
[ 2855.147702] PM: early restore of devices complete after 1.959 msecs
[ 2855.256478] i915 0000:00:02.0: setting latency timer to 64
[ 2855.260350] ehci_hcd 0000:00:1d.7: setting latency timer to 64
[ 2855.261135] usb usb2: root hub lost power or was reset
[ 2855.261218] ehci_hcd 0000:00:1d.7: reset command 0080002 (park)=0 ithresh=8 period=1024 Reset HALT
[ 2855.264089] pci 0000:00:1e.0: setting latency timer to 64
[ 2855.264234] PIIX_IDE 0000:00:1f.1: setting latency timer to 64
[ 2855.265349] usb usb4: root hub lost power or was reset
[ 2855.265420] ehci_hcd 0000:01:01.2: reset command 0080b02  park=3 ithresh=8 period=1024 Reset HALT
[ 2855.265435] ehci_hcd 0000:01:01.2: MWI active
[ 2855.265443] ehci_hcd 0000:01:01.2: ...powerup ports...
[ 2855.265965] pci 0000:00:1e.0: wake-up capability disabled by ACPI
[ 2855.276423] ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported
[ 2855.276455] usb usb2: usb resume
[ 2855.276463] ehci_hcd 0000:00:1d.7: resume root hub after power loss
[ 2855.288176] usb usb4: usb resume
[ 2855.288186] ehci_hcd 0000:01:01.2: resume root hub after power loss
[ 2855.304037] hub 2-0:1.0: hub_reset_resume
[ 2855.304045] hub 2-0:1.0: trying to enable port power on non-switchable hub
[ 2855.316066] hub 4-0:1.0: hub_reset_resume
[ 2855.316074] hub 4-0:1.0: enabling power on all ports
[ 2855.324178] e100 0000:01:08.0: eth0: NIC Link is Up 100 Mbps Full Duplex
[ 2855.375159] serial 00:05: activated
[ 2855.376755] i8042 kbd 00:09: wake-up capability disabled by ACPI
[ 2855.379728] hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4
[ 2855.384054] hda: UDMA/100 mode selected
[ 2855.386940] hdb: host max PIO4 wanted PIO255(auto-tune) selected PIO4
[ 2855.409021] ehci_hcd 0000:00:1d.7: GetStatus port:8 status 001103 0  ACK POWER sig=se0 RESET CSC CONNECT
[ 2855.409087] hub 2-0:1.0: port 8: status 0511 change 0001
[ 2855.448429] ehci_hcd 0000:00:1d.7: port 8 full speed --> companion
[ 2855.448440] ehci_hcd 0000:00:1d.7: GetStatus port:8 status 003001 0  ACK POWER OWNER sig=se0 CONNECT
[ 2855.448514] hub 2-0:1.0: port 8 not reset yet, waiting 50ms
[ 2855.456055] hdb: UDMA/100 mode selected
[ 2855.458606] hdc: host max PIO4 wanted PIO255(auto-tune) selected PIO4
[ 2855.458774] hdc: UDMA/66 mode selected
[ 2855.504214] hub 2-0:1.0: port 8 not reset yet, waiting 200ms
[ 2855.708205] hub 2-0:1.0: port 8 not reset yet, waiting 200ms
[ 2855.912189] hub 2-0:1.0: port 8 not reset yet, waiting 200ms
[ 2855.912196] hub 2-0:1.0: port_wait_reset: err = -16
[ 2855.912201] hub 2-0:1.0: port 8 not enabled, trying reset again...
[ 2856.116416] ehci_hcd 0000:00:1d.7: port 8 high speed
[ 2856.116427] ehci_hcd 0000:00:1d.7: GetStatus port:8 status 001005 0  ACK POWER sig=se0 PE CONNECT
[ 2856.172187] usb 2-8: reset high-speed USB device number 2 using ehci_hcd
[ 2856.232393] ehci_hcd 0000:00:1d.7: port 8 high speed
[ 2856.232405] ehci_hcd 0000:00:1d.7: GetStatus port:8 status 001005 0  ACK POWER sig=se0 PE CONNECT
[ 2856.288157] usb 2-8: device descriptor read/64, error -113
[ 2856.448347] ehci_hcd 0000:00:1d.7: port 8 high speed
[ 2856.448357] ehci_hcd 0000:00:1d.7: GetStatus port:8 status 001005 0  ACK POWER sig=se0 PE CONNECT
[ 2856.504154] usb 2-8: device descriptor read/64, error -113
[ 2856.664429] ehci_hcd 0000:00:1d.7: port 8 high speed
[ 2856.664440] ehci_hcd 0000:00:1d.7: GetStatus port:8 status 001005 0  ACK POWER sig=se0 PE CONNECT

None of those resets above should have occurred.  They are the result
of trying to recover from the failure of a TEST UNIT READY command.

> > I have verified that changing all occurrences of system_nrt_wq in 
> > block/genhd.c to system_freezable_wq fixes the bug.  However this may 
> > not be the way you want to solve it; you may prefer to have a freezable 
> > non-reentrant work queue.
> 
> Please feel free to send out a patch to fix the issue. :)

Is there a real reason for using system_nrt_wq?  Are you okay with just
switching over to system_freezable_wq?

Alan Stern


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bug in disk event polling
  2012-02-10 21:03   ` Alan Stern
@ 2012-02-10 21:12     ` Tejun Heo
  2012-02-10 21:44       ` Alan Stern
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2012-02-10 21:12 UTC (permalink / raw)
  To: Alan Stern
  Cc: Jens Axboe, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list

Hello, Alan.

On Fri, Feb 10, 2012 at 04:03:51PM -0500, Alan Stern wrote:
> None of those resets above should have occurred.  They are the result
> of trying to recover from the failure of a TEST UNIT READY command.

Thanks for the log.  Yeah, I was just thinking about libata and
wondering why it would break that badly.

> > > I have verified that changing all occurrences of system_nrt_wq in 
> > > block/genhd.c to system_freezable_wq fixes the bug.  However this may 
> > > not be the way you want to solve it; you may prefer to have a freezable 
> > > non-reentrant work queue.
> > 
> > Please feel free to send out a patch to fix the issue. :)
> 
> Is there a real reason for using system_nrt_wq?  Are you okay with just
> switching over to system_freezable_wq?

I think it should be nrt.  It assumes that no one else is running it
concurrently; otherwise, multiple CPUs could jump into
disk->fops->check_events() concurrently which can be pretty ugly.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bug in disk event polling
  2012-02-10 21:12     ` Tejun Heo
@ 2012-02-10 21:44       ` Alan Stern
  2012-02-11  0:23         ` Tejun Heo
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Stern @ 2012-02-10 21:44 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list

On Fri, 10 Feb 2012, Tejun Heo wrote:

> Hello, Alan.
> 
> On Fri, Feb 10, 2012 at 04:03:51PM -0500, Alan Stern wrote:
> > None of those resets above should have occurred.  They are the result
> > of trying to recover from the failure of a TEST UNIT READY command.
> 
> Thanks for the log.  Yeah, I was just thinking about libata and
> wondering why it would break that badly.
> 
> > > > I have verified that changing all occurrences of system_nrt_wq in 
> > > > block/genhd.c to system_freezable_wq fixes the bug.  However this may 
> > > > not be the way you want to solve it; you may prefer to have a freezable 
> > > > non-reentrant work queue.
> > > 
> > > Please feel free to send out a patch to fix the issue. :)
> > 
> > Is there a real reason for using system_nrt_wq?  Are you okay with just
> > switching over to system_freezable_wq?
> 
> I think it should be nrt.  It assumes that no one else is running it
> concurrently; otherwise, multiple CPUs could jump into
> disk->fops->check_events() concurrently which can be pretty ugly.

Come to mention it, how can a single work item ever run on more than
one CPU concurrently?  Are you concerned about cases where some other 
thread requeues the work item while it is executing?


Here's a separate but related problem.  Several drivers do I/O in async 
threads.  Two examples: The SCSI core calls kthread_run() to scan for 
devices below a newly-added host, and the sd driver uses 
async_schedule() to probe a newly-added SCSI disk.  There probably are 
lots of other cases I'm not aware of.

The problem is that these async threads generally aren't freezable.  
They will continue to run and do I/O while a system goes through a
sleep transition.  How should this be handled?

kthread_run() can be adjusted on a case-by-case basis, by inserting
calls to set_freezable() and try_to_freeze() at the appropriate places.  
But what about async_schedule()?

Alan Stern


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bug in disk event polling
  2012-02-10 21:44       ` Alan Stern
@ 2012-02-11  0:23         ` Tejun Heo
  2012-02-12 17:48           ` Alan Stern
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2012-02-11  0:23 UTC (permalink / raw)
  To: Alan Stern
  Cc: Jens Axboe, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list

Hello,

On Fri, Feb 10, 2012 at 04:44:48PM -0500, Alan Stern wrote:
> > I think it should be nrt.  It assumes that no one else is running it
> > concurrently; otherwise, multiple CPUs could jump into
> > disk->fops->check_events() concurrently which can be pretty ugly.
> 
> Come to mention it, how can a single work item ever run on more than
> one CPU concurrently?  Are you concerned about cases where some other 
> thread requeues the work item while it is executing?

Yeah, there are multiple paths which may queue the work item.  For
polling work, it definitely was possible but maybe locking changes
afterwards removed that.  Even then, it would be better to use nrt wq
as bug caused that way would be very difficult to track down.

> The problem is that these async threads generally aren't freezable.  
> They will continue to run and do I/O while a system goes through a
> sleep transition.  How should this be handled?

I think it would be better to use wq for most kthreads.  A lot of them
aren't strictly correct in the way they deal with
kthread_should_stop() and freezing.  kthread in general simply seems
way too difficult to use correctly.

> kthread_run() can be adjusted on a case-by-case basis, by inserting
> calls to set_freezable() and try_to_freeze() at the appropriate places.  
> But what about async_schedule()?

Given the stuff async is used for, maybe just make all async execution
freezable?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bug in disk event polling
  2012-02-11  0:23         ` Tejun Heo
@ 2012-02-12 17:48           ` Alan Stern
  2012-02-12 21:26             ` Rafael J. Wysocki
  2012-02-13 17:30             ` Tejun Heo
  0 siblings, 2 replies; 10+ messages in thread
From: Alan Stern @ 2012-02-12 17:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list

On Fri, 10 Feb 2012, Tejun Heo wrote:

> Hello,
> 
> On Fri, Feb 10, 2012 at 04:44:48PM -0500, Alan Stern wrote:
> > > I think it should be nrt.  It assumes that no one else is running it
> > > concurrently; otherwise, multiple CPUs could jump into
> > > disk->fops->check_events() concurrently which can be pretty ugly.
> > 
> > Come to mention it, how can a single work item ever run on more than
> > one CPU concurrently?  Are you concerned about cases where some other 
> > thread requeues the work item while it is executing?
> 
> Yeah, there are multiple paths which may queue the work item.  For
> polling work, it definitely was possible but maybe locking changes
> afterwards removed that.  Even then, it would be better to use nrt wq
> as bug caused that way would be very difficult to track down.

Okay, I'll create a new workqueue for this purpose.


> > The problem is that these async threads generally aren't freezable.  
> > They will continue to run and do I/O while a system goes through a
> > sleep transition.  How should this be handled?
> 
> I think it would be better to use wq for most kthreads.  A lot of them
> aren't strictly correct in the way they deal with
> kthread_should_stop() and freezing.  kthread in general simply seems
> way too difficult to use correctly.

Maybe so, but getting rid of it at this point would be a big change.  
Also, kthreads were originally considered more suitable for tasks that
would need to run for a long time; is this no longer true?

> > kthread_run() can be adjusted on a case-by-case basis, by inserting
> > calls to set_freezable() and try_to_freeze() at the appropriate places.  
> > But what about async_schedule()?
> 
> Given the stuff async is used for, maybe just make all async execution
> freezable?

That probably won't work.  What if a driver relies on async thread
execution to carry out its I/O?

As another example, sd_probe() calls async_schedule(sd_probe_async,...)
to handle the long-running parts of probing a SCSI disk.  In turn,
sd_remove() calls async_synchronize_full() to insure that probing is
over before the device is unbound from sd.

What happens if a hot-unpluggable disk drive is unplugged while the
system is asleep?  It's entirely possible that the bus subsystem's
resume routine would see the device was gone and would try to 
unregister it.  Then sd_remove would wait for the async thread 
to finish, which would never happen because the thread would be frozen 
and wouldn't be thawed until all the resume routines had finished.

In this case, the proper solution is to have the SCSI prepare method 
call async_synchronize_full().  Other cases will require other 
solutions.

Alan Stern


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bug in disk event polling
  2012-02-12 17:48           ` Alan Stern
@ 2012-02-12 21:26             ` Rafael J. Wysocki
  2012-02-12 22:02               ` Alan Stern
  2012-02-13 17:30             ` Tejun Heo
  1 sibling, 1 reply; 10+ messages in thread
From: Rafael J. Wysocki @ 2012-02-12 21:26 UTC (permalink / raw)
  To: Alan Stern
  Cc: Tejun Heo, Jens Axboe, Linux-pm mailing list, Kernel development list

On Sunday, February 12, 2012, Alan Stern wrote:
> On Fri, 10 Feb 2012, Tejun Heo wrote:
> 
> > Hello,
> > 
> > On Fri, Feb 10, 2012 at 04:44:48PM -0500, Alan Stern wrote:
> > > > I think it should be nrt.  It assumes that no one else is running it
> > > > concurrently; otherwise, multiple CPUs could jump into
> > > > disk->fops->check_events() concurrently which can be pretty ugly.
> > > 
> > > Come to mention it, how can a single work item ever run on more than
> > > one CPU concurrently?  Are you concerned about cases where some other 
> > > thread requeues the work item while it is executing?
> > 
> > Yeah, there are multiple paths which may queue the work item.  For
> > polling work, it definitely was possible but maybe locking changes
> > afterwards removed that.  Even then, it would be better to use nrt wq
> > as bug caused that way would be very difficult to track down.
> 
> Okay, I'll create a new workqueue for this purpose.
> 
> 
> > > The problem is that these async threads generally aren't freezable.  
> > > They will continue to run and do I/O while a system goes through a
> > > sleep transition.  How should this be handled?
> > 
> > I think it would be better to use wq for most kthreads.  A lot of them
> > aren't strictly correct in the way they deal with
> > kthread_should_stop() and freezing.  kthread in general simply seems
> > way too difficult to use correctly.
> 
> Maybe so, but getting rid of it at this point would be a big change.  
> Also, kthreads were originally considered more suitable for tasks that
> would need to run for a long time; is this no longer true?
> 
> > > kthread_run() can be adjusted on a case-by-case basis, by inserting
> > > calls to set_freezable() and try_to_freeze() at the appropriate places.  
> > > But what about async_schedule()?
> > 
> > Given the stuff async is used for, maybe just make all async execution
> > freezable?
> 
> That probably won't work.  What if a driver relies on async thread
> execution to carry out its I/O?

Well, we use async in the suspend code itself. :-)

> As another example, sd_probe() calls async_schedule(sd_probe_async,...)
> to handle the long-running parts of probing a SCSI disk.  In turn,
> sd_remove() calls async_synchronize_full() to insure that probing is
> over before the device is unbound from sd.
> 
> What happens if a hot-unpluggable disk drive is unplugged while the
> system is asleep?  It's entirely possible that the bus subsystem's
> resume routine would see the device was gone and would try to 
> unregister it.  Then sd_remove would wait for the async thread 
> to finish, which would never happen because the thread would be frozen 
> and wouldn't be thawed until all the resume routines had finished.
> 
> In this case, the proper solution is to have the SCSI prepare method 
> call async_synchronize_full().  Other cases will require other 
> solutions.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bug in disk event polling
  2012-02-12 21:26             ` Rafael J. Wysocki
@ 2012-02-12 22:02               ` Alan Stern
  0 siblings, 0 replies; 10+ messages in thread
From: Alan Stern @ 2012-02-12 22:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Jens Axboe, Linux-pm mailing list, Kernel development list

On Sun, 12 Feb 2012, Rafael J. Wysocki wrote:

> > > Given the stuff async is used for, maybe just make all async execution
> > > freezable?
> > 
> > That probably won't work.  What if a driver relies on async thread
> > execution to carry out its I/O?
> 
> Well, we use async in the suspend code itself. :-)

Yeah, I remembered that after sending the email message.

Alan Stern


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bug in disk event polling
  2012-02-12 17:48           ` Alan Stern
  2012-02-12 21:26             ` Rafael J. Wysocki
@ 2012-02-13 17:30             ` Tejun Heo
  1 sibling, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2012-02-13 17:30 UTC (permalink / raw)
  To: Alan Stern
  Cc: Jens Axboe, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list

Hello, Alan.

On Sun, Feb 12, 2012 at 12:48:22PM -0500, Alan Stern wrote:
> > I think it would be better to use wq for most kthreads.  A lot of them
> > aren't strictly correct in the way they deal with
> > kthread_should_stop() and freezing.  kthread in general simply seems
> > way too difficult to use correctly.
> 
> Maybe so, but getting rid of it at this point would be a big change.  

Ooh, yeah, if we do it, it would be a gradual transition.

> Also, kthreads were originally considered more suitable for tasks that
> would need to run for a long time; is this no longer true?

wq no longer has problem w/ long running work items.  The only
limiting parameter is max_active, which is primarily there to protect
against berserk cases.

> > > kthread_run() can be adjusted on a case-by-case basis, by inserting
> > > calls to set_freezable() and try_to_freeze() at the appropriate places.  
> > > But what about async_schedule()?
> > 
> > Given the stuff async is used for, maybe just make all async execution
> > freezable?
> 
> That probably won't work.  What if a driver relies on async thread
> execution to carry out its I/O?
>
> As another example, sd_probe() calls async_schedule(sd_probe_async,...)
> to handle the long-running parts of probing a SCSI disk.  In turn,
> sd_remove() calls async_synchronize_full() to insure that probing is
> over before the device is unbound from sd.
> 
> What happens if a hot-unpluggable disk drive is unplugged while the
> system is asleep?  It's entirely possible that the bus subsystem's
> resume routine would see the device was gone and would try to 
> unregister it.  Then sd_remove would wait for the async thread 
> to finish, which would never happen because the thread would be frozen 
> and wouldn't be thawed until all the resume routines had finished.
> 
> In this case, the proper solution is to have the SCSI prepare method 
> call async_synchronize_full().  Other cases will require other 
> solutions.

Hmmm.... I don't know then. :)

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-02-13 17:30 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-10 20:31 Bug in disk event polling Alan Stern
2012-02-10 20:46 ` Tejun Heo
2012-02-10 21:03   ` Alan Stern
2012-02-10 21:12     ` Tejun Heo
2012-02-10 21:44       ` Alan Stern
2012-02-11  0:23         ` Tejun Heo
2012-02-12 17:48           ` Alan Stern
2012-02-12 21:26             ` Rafael J. Wysocki
2012-02-12 22:02               ` Alan Stern
2012-02-13 17:30             ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).