linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* An announcement for kernel-global workqueue users.
@ 2022-03-21  1:24 Tetsuo Handa
  2022-03-21 16:45 ` Tejun Heo
  2022-03-22 15:22 ` Takashi Iwai
  0 siblings, 2 replies; 3+ messages in thread
From: Tetsuo Handa @ 2022-03-21  1:24 UTC (permalink / raw)
  To: LKML; +Cc: Tejun Heo, Lai Jiangshan

Hello.

The Linux kernel provides kernel-global WQs (namely, system_wq, system_highpri_wq,
system_long_wq, system_unbound_wq, system_freezable_wq, system_power_efficient_wq
and system_freezable_power_efficient_wq). But since attempt to flush kernel-global
WQs has possibility of deadlock, Tejun Heo thinks that we should stop calling
flush_scheduled_work() and flush_workqueue(system_*). Such callers as of Linux 5.17
are listed below.

----------
$ git grep -nF 'flush_scheduled_work()'
drivers/acpi/osl.c:1182:         * invoke flush_scheduled_work()/acpi_os_wait_events_complete() to flush
drivers/acpi/osl.c:1575:        flush_scheduled_work();
drivers/block/aoe/aoedev.c:324: flush_scheduled_work();
drivers/block/aoe/aoedev.c:523: flush_scheduled_work();
drivers/crypto/atmel-ecc.c:401: flush_scheduled_work();
drivers/crypto/atmel-sha204a.c:162:     flush_scheduled_work();
drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c:2606:       flush_scheduled_work();
drivers/gpu/drm/bridge/lontium-lt9611uxc.c:985: flush_scheduled_work();
drivers/gpu/drm/i915/display/intel_display.c:10790:     flush_scheduled_work();
drivers/gpu/drm/i915/gt/selftest_execlists.c:87:        flush_scheduled_work();
drivers/iio/light/tsl2563.c:811:        flush_scheduled_work();
drivers/infiniband/hw/mlx4/cm.c:511:            flush_scheduled_work();
drivers/infiniband/hw/mlx4/cm.c:543:            flush_scheduled_work(); /* make sure all timers were flushed */
drivers/infiniband/ulp/isert/ib_isert.c:2639:   flush_scheduled_work();
drivers/input/mouse/psmouse-smbus.c:320:        flush_scheduled_work();
drivers/md/dm.c:229:    flush_scheduled_work();
drivers/message/fusion/mptscsih.c:1234: flush_scheduled_work();
drivers/net/phy/phy.c:1060:     /* Cannot call flush_scheduled_work() here as desired because
drivers/net/usb/lan78xx.c:3240:  * can't flush_scheduled_work() until we drop rtnl (later),
drivers/net/usb/usbnet.c:853:    * can't flush_scheduled_work() until we drop rtnl (later),
drivers/net/wireless/ath/ath6kl/usb.c:481:      flush_scheduled_work();
drivers/net/wwan/wwan_hwsim.c:537:      flush_scheduled_work();         /* Wait deletion works completion */
drivers/nvme/target/configfs.c:1557:    flush_scheduled_work();
drivers/nvme/target/rdma.c:1587:                flush_scheduled_work();
drivers/nvme/target/rdma.c:2056:        flush_scheduled_work();
drivers/nvme/target/tcp.c:1818:         flush_scheduled_work();
drivers/nvme/target/tcp.c:1879: flush_scheduled_work();
drivers/nvme/target/tcp.c:1884: flush_scheduled_work();
drivers/platform/surface/surface_acpi_notify.c:863:     flush_scheduled_work();
drivers/power/supply/ab8500_btemp.c:975:        flush_scheduled_work();
drivers/power/supply/ab8500_chargalg.c:1993:    flush_scheduled_work();
drivers/power/supply/ab8500_charger.c:3400:     flush_scheduled_work();
drivers/power/supply/ab8500_fg.c:3021:  flush_scheduled_work();
drivers/rapidio/devices/tsi721.c:2944:  flush_scheduled_work();
drivers/rtc/dev.c:99:                   flush_scheduled_work();
drivers/scsi/mpt3sas/mpt3sas_scsih.c:12409:     flush_scheduled_work();
drivers/scsi/qla2xxx/qla_target.c:1568:         flush_scheduled_work();
drivers/staging/olpc_dcon/olpc_dcon.c:386:      flush_scheduled_work();
sound/soc/intel/atom/sst/sst.c:363:     flush_scheduled_work();
$ git grep -nF 'flush_workqueue(system_'
drivers/block/rnbd/rnbd-clt.c:1776:     flush_workqueue(system_long_wq);
drivers/infiniband/core/device.c:2857:  flush_workqueue(system_unbound_wq);
include/linux/workqueue.h:592:  flush_workqueue(system_wq);
----------

I tried to send a patch that emits a warning when flushing kernel-global WQs is attempted
( https://lkml.kernel.org/r/2efd5461-fccd-f1d9-7138-0a6767cbf5fe@I-love.SAKURA.ne.jp ).
But Linus does not want such patch
( https://lkml.kernel.org/r/CAHk-=whWreGjEQ6yasspzBrNnS7EQiL+SknToWt=SzUh4XomyQ@mail.gmail.com ).

Steps for converting kernel-global WQs into module's local WQs are shown below.
But since an oversight in Step 4 results in breakage, I think that this conversion
should be carefully handled by maintainers/developers of each module who are
familiar with that module. (This is why I'm sending this mail than sending patches,
in order to ask for your cooperation.)

----------
Step 0: Consider if flushing kernel-global WQs is unavoidable.

    For example, commit 081bdc9fe05bb232 ("RDMA/ib_srp: Fix a deadlock")
    simply removed flush_workqueue(system_long_wq) call.

    For another example, schedule_on_each_cpu() does not need to call
    flush_scheduled_work() because schedule_on_each_cpu() knows the list
    of all "struct work_struct" instances which need to be flushed using
    flush_work() call.

    If flushing kernel-global WQs is still unavoidable, please proceed to
    the following steps.

Step 1: Declare a variable for your module.

    struct workqueue_struct *my_wq;

Step 2: Create a WQ for your module from __init function. The same flags
        used by corresponding kernel-global WQ can be used when creating
        the WQ for your module.

    my_wq = alloc_workqueue("my_wq_name", 0, 0);

Step 3: Destroy the WQ created in Step 2 from __exit function (and the error
        handling path of __init function if __init function may fail after
        creating the WQ).

    destroy_workqueue(my_wq);

Step 4: Replace e.g. schedule_work() call with corresponding queue_work() call
        throughout your module which should be handled by the WQ for your module.

Step 5: Replace flush_scheduled_work() and flush_workqueue(system_*) calls
        with flush_workqueue() of the WQ for your module.

    flush_workqueue(my_wq);
----------

Regards.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: An announcement for kernel-global workqueue users.
  2022-03-21  1:24 An announcement for kernel-global workqueue users Tetsuo Handa
@ 2022-03-21 16:45 ` Tejun Heo
  2022-03-22 15:22 ` Takashi Iwai
  1 sibling, 0 replies; 3+ messages in thread
From: Tejun Heo @ 2022-03-21 16:45 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: LKML, Lai Jiangshan

Hello, Tetsuo.

On Mon, Mar 21, 2022 at 10:24:23AM +0900, Tetsuo Handa wrote:
> Hello.
> 
> The Linux kernel provides kernel-global WQs (namely, system_wq, system_highpri_wq,
> system_long_wq, system_unbound_wq, system_freezable_wq, system_power_efficient_wq
> and system_freezable_power_efficient_wq). But since attempt to flush kernel-global
> WQs has possibility of deadlock, Tejun Heo thinks that we should stop calling
> flush_scheduled_work() and flush_workqueue(system_*). Such callers as of Linux 5.17
> are listed below.

Hey, so, I'm not too sure this approach would work. Someone would have to do
most of the legwork and ping the respective maintainers with proposed
patches explaining what's happening why and how the proposed patches are
safe.

> I tried to send a patch that emits a warning when flushing kernel-global WQs is attempted
> ( https://lkml.kernel.org/r/2efd5461-fccd-f1d9-7138-0a6767cbf5fe@I-love.SAKURA.ne.jp ).
> But Linus does not want such patch
> ( https://lkml.kernel.org/r/CAHk-=whWreGjEQ6yasspzBrNnS7EQiL+SknToWt=SzUh4XomyQ@mail.gmail.com ).

You can *float* these warning patches in -next to help with conversion if
necessary but you really have to convert most of the obvious existing users
beforehand.

> Step 2: Create a WQ for your module from __init function. The same flags
>         used by corresponding kernel-global WQ can be used when creating
>         the WQ for your module.
> 
>     my_wq = alloc_workqueue("my_wq_name", 0, 0);

And, the first preference would be converting to use flush_work() unless
defining a separate flush domain to flush multiple work items is absolutely
necessary.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: An announcement for kernel-global workqueue users.
  2022-03-21  1:24 An announcement for kernel-global workqueue users Tetsuo Handa
  2022-03-21 16:45 ` Tejun Heo
@ 2022-03-22 15:22 ` Takashi Iwai
  1 sibling, 0 replies; 3+ messages in thread
From: Takashi Iwai @ 2022-03-22 15:22 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: LKML, Tejun Heo, Lai Jiangshan

On Mon, 21 Mar 2022 02:24:23 +0100,
Tetsuo Handa wrote:
> 
> Hello.
> 
> The Linux kernel provides kernel-global WQs (namely, system_wq, system_highpri_wq,
> system_long_wq, system_unbound_wq, system_freezable_wq, system_power_efficient_wq
> and system_freezable_power_efficient_wq). But since attempt to flush kernel-global
> WQs has possibility of deadlock, Tejun Heo thinks that we should stop calling
> flush_scheduled_work() and flush_workqueue(system_*). Such callers as of Linux 5.17
> are listed below.
> 
> ----------
> $ git grep -nF 'flush_scheduled_work()'
(snip)
> sound/soc/intel/atom/sst/sst.c:363:     flush_scheduled_work();

At least this one looks superfluous, the call can be simply dropped.
I'm going to submit a fix patch.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-03-22 15:22 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-21  1:24 An announcement for kernel-global workqueue users Tetsuo Handa
2022-03-21 16:45 ` Tejun Heo
2022-03-22 15:22 ` Takashi Iwai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).