On 2022/02/24 6:33, Tejun Heo wrote:
> On Wed, Feb 23, 2022 at 09:57:27AM +0900, Tetsuo Handa wrote:
>> On 2022/02/23 2:29, Tejun Heo wrote:
>>> On Mon, Feb 21, 2022 at 07:38:09PM +0900, Tetsuo Handa wrote:
>>>> Since schedule_on_each_cpu() calls schedule_work_on() and flush_work(),
>>>> we should avoid using system_wq in order to avoid unexpected locking
>>>> dependency.
>>>
>>> I don't get it. schedule_on_each_cpu() is flushing each work item and thus
>>> shouldn't need its own flushing domain. What's this change for?
>>
>> A kernel test robot tested "[PATCH v2] workqueue: Warn flush attempt using
>> system-wide workqueues" on 5.16.0-06523-g29bd199e4e73 and hit a lockdep
>> warning ( https://lkml.kernel.org/r/20220221083358.GC835@xsang-OptiPlex-9020 ).
>>
>> Although the circular locking dependency itself needs to be handled by
>> lockless console printing support, we won't be able to apply
>> "[PATCH v2] workqueue: Warn flush attempt using system-wide workqueues"
>> if schedule_on_each_cpu() continues using system-wide workqueues.
>
> The patch seems pretty wrong. What's problematic is system workqueue flushes
> (which flushes the entire workqueue), not work item flushes.

Why? My understanding is that

  flushing a workqueue waits for completion of all work items in that
  workqueue,

  flushing a work item waits for completion of that work item on the
  workqueue that was specified at queue_work() time,

and

  if a work item in some workqueue is blocked by other work items in that
  workqueue (e.g. the max_active limit, or locks that other work items on
  that workqueue need), it has a risk of deadlock.

Then, how can flushing a work item using system-wide workqueues be free of
deadlock risk? Isn't it just "unlikely to deadlock" rather than "impossible
to deadlock"?
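
To make the concern above concrete, here is a toy userspace model in Python. It is only a sketch under stated assumptions, not kernel code and not how the real workqueue implementation schedules work: ToyWorkqueue, its step() method, the lock name "L" and the work item names "A" and "B" are all made up for illustration. The model assumes a FIFO queue that runs at most max_active items at once, and a caller that holds a lock while flushing a single work item.

```python
from collections import deque


class ToyWorkqueue:
    """Hypothetical model: FIFO queue with at most max_active items in flight."""

    def __init__(self, max_active):
        self.max_active = max_active
        self.pending = deque()  # queued, not yet started
        self.active = []        # started, not yet finished
        self.done = set()       # names of finished work items

    def queue_work(self, name, needs_lock=None):
        self.pending.append((name, needs_lock))

    def step(self, held_locks):
        """Advance one round; return True if anything made progress."""
        progressed = False
        # An active item finishes unless the lock it needs is held elsewhere.
        for item in list(self.active):
            name, lock = item
            if lock is None or lock not in held_locks:
                self.active.remove(item)
                self.done.add(name)
                progressed = True
        # Start pending items while we are below the max_active limit.
        while self.pending and len(self.active) < self.max_active:
            self.active.append(self.pending.popleft())
            progressed = True
        return progressed


def flush_work(wq, name, held_locks):
    """Wait for one work item; False means the model is stuck (deadlock)."""
    while name not in wq.done:
        if not wq.step(held_locks):
            return False
    return True


def scenario(max_active, share_queue):
    held = {"L"}  # lock held by the task that calls flush_work()
    shared = ToyWorkqueue(max_active)
    shared.queue_work("A", needs_lock="L")  # unrelated item that needs "L"
    target = shared if share_queue else ToyWorkqueue(max_active)
    target.queue_work("B")                  # the item we actually flush
    return flush_work(target, "B", held)


if __name__ == "__main__":
    print(scenario(max_active=1, share_queue=True))   # False: B is stuck behind A
    print(scenario(max_active=1, share_queue=False))  # True: B has its own queue
    print(scenario(max_active=2, share_queue=True))   # True: the limit is not hit
```

In this model, flushing a single work item on a shared queue gets stuck only when the max_active limit is reached by items that cannot finish, which is the "unlikely" rather than "impossible" case asked about above. Whether the same situation can arise on the real system-wide workqueues depends on their actual max_active value and on which locks the queued work items take, and the model says nothing about that.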