netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
       [not found]   ` <ZF6WsSVGX3O1d0pL@slm.duckdns.org>
@ 2023-07-11 13:55     ` Geert Uytterhoeven
  2023-07-11 14:06       ` Geert Uytterhoeven
  0 siblings, 1 reply; 14+ messages in thread
From: Geert Uytterhoeven @ 2023-07-11 13:55 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lai Jiangshan, torvalds, Peter Zijlstra,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

Hi Tejun,

On Fri, May 12, 2023 at 9:54 PM Tejun Heo <tj@kernel.org> wrote:
> Workqueue now automatically marks per-cpu work items that hog CPU for too
> long as CPU_INTENSIVE, which excludes them from concurrency management and
> prevents stalling other concurrency-managed work items. If a work function
> keeps running over the thershold, it likely needs to be switched to use an
> unbound workqueue.
>
> This patch adds a debug mechanism which tracks the work functions which
> trigger the automatic CPU_INTENSIVE mechanism and report them using
> pr_warn() with exponential backoff.
>
> v2: Drop bouncing through kthread_worker for printing messages. It was to
>     avoid introducing circular locking dependency but wasn't effective as it
>     still had pool lock -> wci_lock -> printk -> pool lock loop. Let's just
>     print directly using printk_deferred().
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>

Thanks for your patch, which is now commit 6363845005202148
("workqueue: Report work funcs that trigger automatic CPU_INTENSIVE
mechanism") in v6.5-rc1.

I guess you are interested to know where this triggers.
I enabled CONFIG_WQ_CPU_INTENSIVE_REPORT=y, and tested
the result on various machines...

SH/R-Mobile:

  workqueue: genpd_power_off_work_fn hogged CPU for >10000us 4 times,
consider switching to WQ_UNBOUND

Atmark Techno Armadillo800-EVA with shmob_drm:

  workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 16
times, consider switching to WQ_UNBOUND

R-Car Gen2:

  workqueue: rtc_timer_do_work hogged CPU for >10000us 4 times,
consider switching to WQ_UNBOUND

R-Car Gen2/Gen3:

  workqueue: pm_runtime_work hogged CPU for >10000us 4 times, consider
switching to WQ_UNBOUND

R-Car Gen3:

  workqueue: kfree_rcu_work hogged CPU for >10000us 4 times, consider
switching to WQ_UNBOUND

OrangeCrab/Linux-on-LiteX-VexRiscV with ht16k33 14-seg display and ssd130xdrmfb:

  workqueue: check_lifetime hogged CPU for >10000us 4 times, consider
switching to WQ_UNBOUND
  workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 1024
times, consider switching to WQ_UNBOUND
  workqueue: fb_flashcursor hogged CPU for >10000us 128 times,
consider switching to WQ_UNBOUND
  workqueue: ht16k33_seg14_update hogged CPU for >10000us 128 times,
consider switching to WQ_UNBOUND
  workqueue: mmc_rescan hogged CPU for >10000us 128 times, consider
switching to WQ_UNBOUND

Atari (ARAnyM):

  workqueue: ata_sff_pio_task hogged CPU for >10000us 64 times,
consider switching to WQ_UNBOUND

The OrangeCrab is a slow machine, so it's not that surprising to see these
messages...

Gr{oetje,eeting}s,

                        Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-11 13:55     ` Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism) Geert Uytterhoeven
@ 2023-07-11 14:06       ` Geert Uytterhoeven
  2023-07-11 21:39         ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Geert Uytterhoeven @ 2023-07-11 14:06 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lai Jiangshan, torvalds, Peter Zijlstra,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

On Tue, Jul 11, 2023 at 3:55 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Tejun,
>
> On Fri, May 12, 2023 at 9:54 PM Tejun Heo <tj@kernel.org> wrote:
> > Workqueue now automatically marks per-cpu work items that hog CPU for too
> > long as CPU_INTENSIVE, which excludes them from concurrency management and
> > prevents stalling other concurrency-managed work items. If a work function
> > keeps running over the thershold, it likely needs to be switched to use an
> > unbound workqueue.
> >
> > This patch adds a debug mechanism which tracks the work functions which
> > trigger the automatic CPU_INTENSIVE mechanism and report them using
> > pr_warn() with exponential backoff.
> >
> > v2: Drop bouncing through kthread_worker for printing messages. It was to
> >     avoid introducing circular locking dependency but wasn't effective as it
> >     still had pool lock -> wci_lock -> printk -> pool lock loop. Let's just
> >     print directly using printk_deferred().
> >
> > Signed-off-by: Tejun Heo <tj@kernel.org>
> > Suggested-by: Peter Zijlstra <peterz@infradead.org>
>
> Thanks for your patch, which is now commit 6363845005202148
> ("workqueue: Report work funcs that trigger automatic CPU_INTENSIVE
> mechanism") in v6.5-rc1.
>
> I guess you are interested to know where this triggers.
> I enabled CONFIG_WQ_CPU_INTENSIVE_REPORT=y, and tested
> the result on various machines...

> OrangeCrab/Linux-on-LiteX-VexRiscV with ht16k33 14-seg display and ssd130xdrmfb:
>
>   workqueue: check_lifetime hogged CPU for >10000us 4 times, consider
> switching to WQ_UNBOUND
>   workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 1024
> times, consider switching to WQ_UNBOUND
>   workqueue: fb_flashcursor hogged CPU for >10000us 128 times,
> consider switching to WQ_UNBOUND
>   workqueue: ht16k33_seg14_update hogged CPU for >10000us 128 times,
> consider switching to WQ_UNBOUND
>   workqueue: mmc_rescan hogged CPU for >10000us 128 times, consider
> switching to WQ_UNBOUND

Got one more after a while:

workqueue: neigh_managed_work hogged CPU for >10000us 4 times,
consider switching to WQ_UNBOUND

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-11 14:06       ` Geert Uytterhoeven
@ 2023-07-11 21:39         ` Tejun Heo
  2023-07-12  0:30           ` Tejun Heo
  2023-07-12  8:05           ` Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism) Peter Zijlstra
  0 siblings, 2 replies; 14+ messages in thread
From: Tejun Heo @ 2023-07-11 21:39 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Lai Jiangshan, torvalds, Peter Zijlstra,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

Hello,

On Tue, Jul 11, 2023 at 04:06:22PM +0200, Geert Uytterhoeven wrote:
> On Tue, Jul 11, 2023 at 3:55 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >
> > Hi Tejun,
> >
> > On Fri, May 12, 2023 at 9:54 PM Tejun Heo <tj@kernel.org> wrote:
> > > Workqueue now automatically marks per-cpu work items that hog CPU for too
> > > long as CPU_INTENSIVE, which excludes them from concurrency management and
> > > prevents stalling other concurrency-managed work items. If a work function
> > > keeps running over the thershold, it likely needs to be switched to use an
> > > unbound workqueue.
> > >
> > > This patch adds a debug mechanism which tracks the work functions which
> > > trigger the automatic CPU_INTENSIVE mechanism and report them using
> > > pr_warn() with exponential backoff.
> > >
> > > v2: Drop bouncing through kthread_worker for printing messages. It was to
> > >     avoid introducing circular locking dependency but wasn't effective as it
> > >     still had pool lock -> wci_lock -> printk -> pool lock loop. Let's just
> > >     print directly using printk_deferred().
> > >
> > > Signed-off-by: Tejun Heo <tj@kernel.org>
> > > Suggested-by: Peter Zijlstra <peterz@infradead.org>
> >
> > Thanks for your patch, which is now commit 6363845005202148
> > ("workqueue: Report work funcs that trigger automatic CPU_INTENSIVE
> > mechanism") in v6.5-rc1.
> >
> > I guess you are interested to know where this triggers.
> > I enabled CONFIG_WQ_CPU_INTENSIVE_REPORT=y, and tested
> > the result on various machines...
> 
> > OrangeCrab/Linux-on-LiteX-VexRiscV with ht16k33 14-seg display and ssd130xdrmfb:
> >
> >   workqueue: check_lifetime hogged CPU for >10000us 4 times, consider
> > switching to WQ_UNBOUND
> >   workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 1024
> > times, consider switching to WQ_UNBOUND
> >   workqueue: fb_flashcursor hogged CPU for >10000us 128 times,
> > consider switching to WQ_UNBOUND
> >   workqueue: ht16k33_seg14_update hogged CPU for >10000us 128 times,
> > consider switching to WQ_UNBOUND
> >   workqueue: mmc_rescan hogged CPU for >10000us 128 times, consider
> > switching to WQ_UNBOUND
> 
> Got one more after a while:
> 
> workqueue: neigh_managed_work hogged CPU for >10000us 4 times,
> consider switching to WQ_UNBOUND

I wonder whether the right thing to do here is somehow scaling the threshold
according to the relative processing power. It's difficult to come up with a
threshold which works well across the latest & fastest and really tiny CPUs.
I'll think about it some more but if you have some ideas, please feel free
to suggest.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-11 21:39         ` Tejun Heo
@ 2023-07-12  0:30           ` Tejun Heo
  2023-07-12  9:57             ` Geert Uytterhoeven
  2023-07-12  8:05           ` Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism) Peter Zijlstra
  1 sibling, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2023-07-12  0:30 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Lai Jiangshan, torvalds, Peter Zijlstra,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:
> On Tue, Jul 11, 2023 at 04:06:22PM +0200, Geert Uytterhoeven wrote:
> > On Tue, Jul 11, 2023 at 3:55 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
...
> > workqueue: neigh_managed_work hogged CPU for >10000us 4 times,
> > consider switching to WQ_UNBOUND
> 
> I wonder whether the right thing to do here is somehow scaling the threshold
> according to the relative processing power. It's difficult to come up with a
> threshold which works well across the latest & fastest and really tiny CPUs.
> I'll think about it some more but if you have some ideas, please feel free
> to suggest.

Geert, do you mind posting the full kernel logs for the affected machines?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-11 21:39         ` Tejun Heo
  2023-07-12  0:30           ` Tejun Heo
@ 2023-07-12  8:05           ` Peter Zijlstra
  2023-07-12  9:04             ` Geert Uytterhoeven
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2023-07-12  8:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Geert Uytterhoeven, Lai Jiangshan, torvalds,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:

> I wonder whether the right thing to do here is somehow scaling the threshold
> according to the relative processing power. It's difficult to come up with a
> threshold which works well across the latest & fastest and really tiny CPUs.
> I'll think about it some more but if you have some ideas, please feel free
> to suggest.

We could scale by BogoMIPS I suppose, it's a bogus measurement, as per
the name, but it does have some relation to how fast the machine is.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-12  8:05           ` Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism) Peter Zijlstra
@ 2023-07-12  9:04             ` Geert Uytterhoeven
  2023-07-12 12:27               ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Geert Uytterhoeven @ 2023-07-12  9:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tejun Heo, Lai Jiangshan, torvalds, Linux Kernel Mailing List,
	kernel-team, Linux PM list, DRI Development, linux-rtc,
	linux-riscv, netdev, Linux Fbdev development list,
	Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

Hoi Peter,

On Wed, Jul 12, 2023 at 10:05 AM Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:
> > I wonder whether the right thing to do here is somehow scaling the threshold
> > according to the relative processing power. It's difficult to come up with a
> > threshold which works well across the latest & fastest and really tiny CPUs.
> > I'll think about it some more but if you have some ideas, please feel free
> > to suggest.
>
> We could scale by BogoMIPS I suppose, it's a bogus measurement, as per
> the name, but it does have some relation to how fast the machine is.

That's gonna fail miserably on e.g. ARM and RISC-V, where BogoMIPS
depends on some timer frequency.

R-Car M2-W with 1.5 GHz Cortex-A15: 40.00 BogoMIPS
R-Car V4H with 1.8 GHz Cortex-A76: 33.33 BogoMIPS

while the real slow 48 MHz VexRiscV gets 128 BogoMIPS.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-12  0:30           ` Tejun Heo
@ 2023-07-12  9:57             ` Geert Uytterhoeven
  2023-07-17 23:03               ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Geert Uytterhoeven @ 2023-07-12  9:57 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lai Jiangshan, torvalds, Peter Zijlstra,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

Hi Tejun,

On Wed, Jul 12, 2023 at 2:30 AM Tejun Heo <tj@kernel.org> wrote:
> On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:
> > On Tue, Jul 11, 2023 at 04:06:22PM +0200, Geert Uytterhoeven wrote:
> > > On Tue, Jul 11, 2023 at 3:55 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> ...
> > > workqueue: neigh_managed_work hogged CPU for >10000us 4 times,
> > > consider switching to WQ_UNBOUND
> >
> > I wonder whether the right thing to do here is somehow scaling the threshold
> > according to the relative processing power. It's difficult to come up with a
> > threshold which works well across the latest & fastest and really tiny CPUs.
> > I'll think about it some more but if you have some ideas, please feel free
> > to suggest.
>
> Geert, do you mind posting the full kernel logs for the affected machines?

https://drive.google.com/file/d/1toDs7ugZJ2eXatpdvySY4yxSsNam9xAC
is an archive with boot and s2ram logs.  Note that my kernels do contain
local debug code, and may be noisy.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-12  9:04             ` Geert Uytterhoeven
@ 2023-07-12 12:27               ` Peter Zijlstra
  2023-07-13 18:53                 ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2023-07-12 12:27 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Tejun Heo, Lai Jiangshan, torvalds, Linux Kernel Mailing List,
	kernel-team, Linux PM list, DRI Development, linux-rtc,
	linux-riscv, netdev, Linux Fbdev development list,
	Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

On Wed, Jul 12, 2023 at 11:04:16AM +0200, Geert Uytterhoeven wrote:
> Hoi Peter,
> 
> On Wed, Jul 12, 2023 at 10:05 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:
> > > I wonder whether the right thing to do here is somehow scaling the threshold
> > > according to the relative processing power. It's difficult to come up with a
> > > threshold which works well across the latest & fastest and really tiny CPUs.
> > > I'll think about it some more but if you have some ideas, please feel free
> > > to suggest.
> >
> > We could scale by BogoMIPS I suppose, it's a bogus measurement, as per
> > the name, but it does have some relation to how fast the machine is.
> 
> That's gonna fail miserably on e.g. ARM and RISC-V, where BogoMIPS
> depends on some timer frequency.
> 
> R-Car M2-W with 1.5 GHz Cortex-A15: 40.00 BogoMIPS
> R-Car V4H with 1.8 GHz Cortex-A76: 33.33 BogoMIPS
> 
> while the real slow 48 MHz VexRiscV gets 128 BogoMIPS.

Hehe, OK, really bogus then. Lets file this idea in the bit-bucket then.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-12 12:27               ` Peter Zijlstra
@ 2023-07-13 18:53                 ` Tejun Heo
  0 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2023-07-13 18:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Geert Uytterhoeven, Lai Jiangshan, torvalds,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

On Wed, Jul 12, 2023 at 02:27:45PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 12, 2023 at 11:04:16AM +0200, Geert Uytterhoeven wrote:
> > Hoi Peter,
> > 
> > On Wed, Jul 12, 2023 at 10:05 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:
> > > > I wonder whether the right thing to do here is somehow scaling the threshold
> > > > according to the relative processing power. It's difficult to come up with a
> > > > threshold which works well across the latest & fastest and really tiny CPUs.
> > > > I'll think about it some more but if you have some ideas, please feel free
> > > > to suggest.
> > >
> > > We could scale by BogoMIPS I suppose, it's a bogus measurement, as per
> > > the name, but it does have some relation to how fast the machine is.
> > 
> > That's gonna fail miserably on e.g. ARM and RISC-V, where BogoMIPS
> > depends on some timer frequency.
> > 
> > R-Car M2-W with 1.5 GHz Cortex-A15: 40.00 BogoMIPS
> > R-Car V4H with 1.8 GHz Cortex-A76: 33.33 BogoMIPS
> > 
> > while the real slow 48 MHz VexRiscV gets 128 BogoMIPS.
> 
> Hehe, OK, really bogus then. Lets file this idea in the bit-bucket then.

I think it can still be useful. On ryzen 3975wx, it's 6989.92, so while it
may be off by some hundreds of percents, there are still orders of magnitude
signal range and that should be enough to suppress most spurious warnings.
I'll post something later today.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-12  9:57             ` Geert Uytterhoeven
@ 2023-07-17 23:03               ` Tejun Heo
  2023-07-18  9:54                 ` Geert Uytterhoeven
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2023-07-17 23:03 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Lai Jiangshan, torvalds, Peter Zijlstra,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

Hello, Geert.

Can you please the following patch and see how many reports you get? Looking
back at your reports, I think some of them probably should be converted to
UNBOUND but we should have a better idea with the adjusted threshold.

Thanks.

From 8555cbd4b22e5f85eb2bdcb84fd1d1f519a0a0d3 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Mon, 17 Jul 2023 12:50:02 -1000
Subject: [PATCH] workqueue: Scale up wq_cpu_intensive_thresh_us if BogoMIPS is
 below 1000

wq_cpu_intensive_thresh_us is used to detect CPU-hogging per-cpu work items.
Once detected, they're excluded from concurrency management to prevent them
from blocking other per-cpu work items. If CONFIG_WQ_CPU_INTENSIVE_REPORT is
enabled, repeat offenders are also reported so that the code can be updated.

The default threshold is 10ms which is long enough to do fair bit of work on
modern CPUs while short enough to be usually not noticeable. This
unfortunately leads to a lot of, arguable spurious, detections on very slow
CPUs. Using the same threshold across CPUs whose performance levels may be
apart by multiple levels of magnitude doesn't make whole lot of sense.

This patch scales up wq_cpu_intensive_thresh_us upto 1 second when BogoMIPS
is below 1000. This is obviously very inaccurate but it doesn't have to be
accurate to be useful. The mechanism is still useful when the threshold is
fully scaled up and the benefits of reports are usually shared with everyone
regardless of who's reporting, so as long as there are sufficient number of
fast machines reporting, we don't lose much.

Some (or is it all?) ARM CPUs systemtically report significantly lower
BogoMIPS. While this doesn't break anything, given how widespread ARM CPUs
are, it's at least a missed opportunity and it probably would be a good idea
to teach workqueue about it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
---
 kernel/workqueue.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 02a8f402eeb5..0d7a3d29762f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -52,6 +52,7 @@
 #include <linux/sched/debug.h>
 #include <linux/nmi.h>
 #include <linux/kvm_para.h>
+#include <linux/delay.h>
 
 #include "workqueue_internal.h"
 
@@ -338,8 +339,10 @@ static cpumask_var_t *wq_numa_possible_cpumask;
  * Per-cpu work items which run for longer than the following threshold are
  * automatically considered CPU intensive and excluded from concurrency
  * management to prevent them from noticeably delaying other per-cpu work items.
+ * ULONG_MAX indicates that the user hasn't overridden it with a boot parameter.
+ * The actual value is initialized in wq_cpu_intensive_thresh_init().
  */
-static unsigned long wq_cpu_intensive_thresh_us = 10000;
+static unsigned long wq_cpu_intensive_thresh_us = ULONG_MAX;
 module_param_named(cpu_intensive_thresh_us, wq_cpu_intensive_thresh_us, ulong, 0644);
 
 static bool wq_disable_numa;
@@ -6513,6 +6516,42 @@ void __init workqueue_init_early(void)
 	       !system_freezable_power_efficient_wq);
 }
 
+static void __init wq_cpu_intensive_thresh_init(void)
+{
+	unsigned long thresh;
+	unsigned long mips;
+
+	/* if the user set it to a specific value, keep it */
+	if (wq_cpu_intensive_thresh_us != ULONG_MAX)
+		return;
+
+	/*
+	 * The default of 10ms is derived from the fact that most modern (as of
+	 * 2023) processors can do a lot in 10ms and that it's just below what
+	 * most consider human-perceivable. However, the kernel also runs on a
+	 * lot slower CPUs including microcontrollers where the threshold is way
+	 * too low.
+	 *
+	 * Let's scale up the threshold upto 1 second if BogoMips is below 1000.
+	 * This is by no means accurate but it doesn't have to be. The mechanism
+	 * is still useful even when the threshold is fully scaled up. Also, as
+	 * the reports would usually be applicable to everyone, some machines
+	 * operating on longer thresholds won't significantly diminish their
+	 * usefulness.
+	 */
+	thresh = 10 * USEC_PER_MSEC;
+
+	/* see init/calibrate.c for lpj -> BogoMIPS calculation */
+	mips = max_t(unsigned long, loops_per_jiffy / 500000 * HZ, 1);
+	if (mips < 1000)
+		thresh = min_t(unsigned long, thresh * 1000 / mips, USEC_PER_SEC);
+
+	pr_debug("wq_cpu_intensive_thresh: lpj=%lu mips=%lu thresh_us=%lu\n",
+		 loops_per_jiffy, mips, thresh);
+
+	wq_cpu_intensive_thresh_us = thresh;
+}
+
 /**
  * workqueue_init - bring workqueue subsystem fully online
  *
@@ -6528,6 +6567,8 @@ void __init workqueue_init(void)
 	struct worker_pool *pool;
 	int cpu, bkt;
 
+	wq_cpu_intensive_thresh_init();
+
 	/*
 	 * It'd be simpler to initialize NUMA in workqueue_init_early() but
 	 * CPU to node mapping may not be available that early on some
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-17 23:03               ` Tejun Heo
@ 2023-07-18  9:54                 ` Geert Uytterhoeven
  2023-07-18 22:01                   ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Geert Uytterhoeven @ 2023-07-18  9:54 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lai Jiangshan, torvalds, Peter Zijlstra,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

Hi Tejun,

On Tue, Jul 18, 2023 at 1:03 AM Tejun Heo <tj@kernel.org> wrote:
> Can you please the following patch and see how many reports you get? Looking
> back at your reports, I think some of them probably should be converted to
> UNBOUND but we should have a better idea with the adjusted threshold.
>
> Thanks.
>
> From 8555cbd4b22e5f85eb2bdcb84fd1d1f519a0a0d3 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj@kernel.org>
> Date: Mon, 17 Jul 2023 12:50:02 -1000
> Subject: [PATCH] workqueue: Scale up wq_cpu_intensive_thresh_us if BogoMIPS is
>  below 1000
>
> wq_cpu_intensive_thresh_us is used to detect CPU-hogging per-cpu work items.
> Once detected, they're excluded from concurrency management to prevent them
> from blocking other per-cpu work items. If CONFIG_WQ_CPU_INTENSIVE_REPORT is
> enabled, repeat offenders are also reported so that the code can be updated.
>
> The default threshold is 10ms which is long enough to do fair bit of work on
> modern CPUs while short enough to be usually not noticeable. This
> unfortunately leads to a lot of, arguable spurious, detections on very slow
> CPUs. Using the same threshold across CPUs whose performance levels may be
> apart by multiple levels of magnitude doesn't make whole lot of sense.
>
> This patch scales up wq_cpu_intensive_thresh_us upto 1 second when BogoMIPS
> is below 1000. This is obviously very inaccurate but it doesn't have to be
> accurate to be useful. The mechanism is still useful when the threshold is
> fully scaled up and the benefits of reports are usually shared with everyone
> regardless of who's reporting, so as long as there are sufficient number of
> fast machines reporting, we don't lose much.
>
> Some (or is it all?) ARM CPUs systemtically report significantly lower
> BogoMIPS. While this doesn't break anything, given how widespread ARM CPUs
> are, it's at least a missed opportunity and it probably would be a good idea
> to teach workqueue about it.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>

Thanks!

I gave it a try on a system with an 800 MHz Cortex A9, only to discover
it makes no difference, as that machine has 1600 BogoMIPS:

workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 4 times,
consider switching to WQ_UNBOUND
workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 8 times,
consider switching to WQ_UNBOUND
workqueue: genpd_power_off_work_fn hogged CPU for >10000us 4 times,
consider switching to WQ_UNBOUND
workqueue: blk_mq_run_work_fn hogged CPU for >10000us 4 times,
consider switching to WQ_UNBOUND
workqueue: pm_runtime_work hogged CPU for >10000us 4 times, consider
switching to WQ_UNBOUND
workqueue: phy_state_machine hogged CPU for >10000us 4 times, consider
switching to WQ_UNBOUND
workqueue: drm_mode_rmfb_work_fn hogged CPU for >10000us 4 times,
consider switching to WQ_UNBOUND
workqueue: sync_hw_clock hogged CPU for >10000us 4 times, consider
switching to WQ_UNBOUND
workqueue: rtc_timer_do_work hogged CPU for >10000us 4 times, consider
switching to WQ_UNBOUND

Artificially low BogoMIPS numbers only happen on systems that have
the related timers (Cortex A7/A15 and later, Cortex A9 MPCore,
and arm64).

I will test on more systems, but that will probably not happen until
next week...

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-18  9:54                 ` Geert Uytterhoeven
@ 2023-07-18 22:01                   ` Tejun Heo
  2023-07-25 14:46                     ` Geert Uytterhoeven
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2023-07-18 22:01 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Lai Jiangshan, torvalds, Peter Zijlstra,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

Hello,

On Tue, Jul 18, 2023 at 11:54:58AM +0200, Geert Uytterhoeven wrote:
> I gave it a try on a system with an 800 MHz Cortex A9, only to discover
> it makes no difference, as that machine has 1600 BogoMIPS:

Oops.

> workqueue: blk_mq_run_work_fn hogged CPU for >10000us 4 times,
> consider switching to WQ_UNBOUND

It could be that we actually want to switch to UNBOUND for some reports but
the above triggering most likely indicates that the threshold is too
aggressive.

> Artificially low BogoMIPS numbers only happen on systems that have
> the related timers (Cortex A7/A15 and later, Cortex A9 MPCore,
> and arm64).

Ah, I see. Thanks for the explanation.

> I will test on more systems, but that will probably not happen until
> next week...

Thanks, really appreciate it. Can you try the following instead when you
have time? I just pushed up the lower boundary to 4000 MIPS. The scaling is
still capped at 1s.

From 8555cbd4b22e5f85eb2bdcb84fd1d1f519a0a0d3 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Mon, 17 Jul 2023 12:50:02 -1000
Subject: [PATCH] workqueue: Scale up wq_cpu_intensive_thresh_us if BogoMIPS is
 below 4000

wq_cpu_intensive_thresh_us is used to detect CPU-hogging per-cpu work items.
Once detected, they're excluded from concurrency management to prevent them
from blocking other per-cpu work items. If CONFIG_WQ_CPU_INTENSIVE_REPORT is
enabled, repeat offenders are also reported so that the code can be updated.

The default threshold is 10ms which is long enough to do fair bit of work on
modern CPUs while short enough to be usually not noticeable. This
unfortunately leads to a lot of, arguable spurious, detections on very slow
CPUs. Using the same threshold across CPUs whose performance levels may be
apart by multiple levels of magnitude doesn't make whole lot of sense.

This patch scales up wq_cpu_intensive_thresh_us upto 1 second when BogoMIPS
is below 4000. This is obviously very inaccurate but it doesn't have to be
accurate to be useful. The mechanism is still useful when the threshold is
fully scaled up and the benefits of reports are usually shared with everyone
regardless of who's reporting, so as long as there are sufficient number of
fast machines reporting, we don't lose much.

Some (or is it all?) ARM CPUs systemtically report significantly lower
BogoMIPS. While this doesn't break anything, given how widespread ARM CPUs
are, it's at least a missed opportunity and it probably would be a good idea
to teach workqueue about it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
---
 kernel/workqueue.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 02a8f402eeb5..0d7a3d29762f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -52,6 +52,7 @@
 #include <linux/sched/debug.h>
 #include <linux/nmi.h>
 #include <linux/kvm_para.h>
+#include <linux/delay.h>
 
 #include "workqueue_internal.h"
 
@@ -338,8 +339,10 @@ static cpumask_var_t *wq_numa_possible_cpumask;
  * Per-cpu work items which run for longer than the following threshold are
  * automatically considered CPU intensive and excluded from concurrency
  * management to prevent them from noticeably delaying other per-cpu work items.
+ * ULONG_MAX indicates that the user hasn't overridden it with a boot parameter.
+ * The actual value is initialized in wq_cpu_intensive_thresh_init().
  */
-static unsigned long wq_cpu_intensive_thresh_us = 10000;
+static unsigned long wq_cpu_intensive_thresh_us = ULONG_MAX;
 module_param_named(cpu_intensive_thresh_us, wq_cpu_intensive_thresh_us, ulong, 0644);
 
 static bool wq_disable_numa;
@@ -6513,6 +6516,42 @@ void __init workqueue_init_early(void)
 	       !system_freezable_power_efficient_wq);
 }
 
+static void __init wq_cpu_intensive_thresh_init(void)
+{
+	unsigned long thresh;
+	unsigned long mips;
+
+	/* if the user set it to a specific value, keep it */
+	if (wq_cpu_intensive_thresh_us != ULONG_MAX)
+		return;
+
+	/*
+	 * The default of 10ms is derived from the fact that most modern (as of
+	 * 2023) processors can do a lot in 10ms and that it's just below what
+	 * most consider human-perceivable. However, the kernel also runs on a
+	 * lot slower CPUs including microcontrollers where the threshold is way
+	 * too low.
+	 *
+	 * Let's scale up the threshold upto 1 second if BogoMips is below 4000.
+	 * This is by no means accurate but it doesn't have to be. The mechanism
+	 * is still useful even when the threshold is fully scaled up. Also, as
+	 * the reports would usually be applicable to everyone, some machines
+	 * operating on longer thresholds won't significantly diminish their
+	 * usefulness.
+	 */
+	thresh = 10 * USEC_PER_MSEC;
+
+	/* see init/calibrate.c for lpj -> BogoMIPS calculation */
+	mips = max_t(unsigned long, loops_per_jiffy / 500000 * HZ, 1);
+	if (mips < 4000)
+		thresh = min_t(unsigned long, thresh * 4000 / mips, USEC_PER_SEC);
+
+	pr_debug("wq_cpu_intensive_thresh: lpj=%lu mips=%lu thresh_us=%lu\n",
+		 loops_per_jiffy, mips, thresh);
+
+	wq_cpu_intensive_thresh_us = thresh;
+}
+
 /**
  * workqueue_init - bring workqueue subsystem fully online
  *
@@ -6528,6 +6567,8 @@ void __init workqueue_init(void)
 	struct worker_pool *pool;
 	int cpu, bkt;
 
+	wq_cpu_intensive_thresh_init();
+
 	/*
 	 * It'd be simpler to initialize NUMA in workqueue_init_early() but
 	 * CPU to node mapping may not be available that early on some
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)
  2023-07-18 22:01                   ` Tejun Heo
@ 2023-07-25 14:46                     ` Geert Uytterhoeven
  2023-07-25 21:52                       ` [PATCH wq/for-6.5-fixes] workqueue: Drop the special locking rule for worker->flags and worker_pool->flags Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Geert Uytterhoeven @ 2023-07-25 14:46 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lai Jiangshan, torvalds, Peter Zijlstra,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

Hi Tejun,

On Wed, Jul 19, 2023 at 12:01 AM Tejun Heo <tj@kernel.org> wrote:
> On Tue, Jul 18, 2023 at 11:54:58AM +0200, Geert Uytterhoeven wrote:
> > I gave it a try on a system with an 800 MHz Cortex A9, only to discover
> > it makes no difference, as that machine has 1600 BogoMIPS:
>
> Oops.
>
> > workqueue: blk_mq_run_work_fn hogged CPU for >10000us 4 times,
> > consider switching to WQ_UNBOUND
>
> It could be that we actually want to switch to UNBOUND for some reports but
> the above triggering most likely indicates that the threshold is too
> aggressive.
>
> > Artificially low BogoMIPS numbers only happen on systems that have
> > the related timers (Cortex A7/A15 and later, Cortex A9 MPCore,
> > and arm64).
>
> Ah, I see. Thanks for the explanation.
>
> > I will test on more systems, but that will probably not happen until
> > next week...
>
> Thanks, really appreciate it. Can you try the following instead when you
> have time? I just pushed up the lower boundary to 4000 MIPS. The scaling is
> still capped at 1s.

Thanks, with the below, I see no more WQ_UNBOUND messages.

> From 8555cbd4b22e5f85eb2bdcb84fd1d1f519a0a0d3 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj@kernel.org>
> Date: Mon, 17 Jul 2023 12:50:02 -1000
> Subject: [PATCH] workqueue: Scale up wq_cpu_intensive_thresh_us if BogoMIPS is
>  below 4000

> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c

> @@ -6513,6 +6516,42 @@ void __init workqueue_init_early(void)
>                !system_freezable_power_efficient_wq);
>  }
>
> +static void __init wq_cpu_intensive_thresh_init(void)
> +{
> +       unsigned long thresh;
> +       unsigned long mips;

This fails to build on mips.
Apparently mips is a predefined preprocessor macro:

$ echo | mipsel-linux-gnu-gcc -dM -E - | grep -w mips
#define mips 1

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH wq/for-6.5-fixes] workqueue: Drop the special locking rule for worker->flags and worker_pool->flags
  2023-07-25 14:46                     ` Geert Uytterhoeven
@ 2023-07-25 21:52                       ` Tejun Heo
  0 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2023-07-25 21:52 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Lai Jiangshan, torvalds, Peter Zijlstra,
	Linux Kernel Mailing List, kernel-team, Linux PM list,
	DRI Development, linux-rtc, linux-riscv, netdev,
	Linux Fbdev development list, Linux MMC List,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Linux-Renesas

From aa6fde93f3a49e42c0fe0490d7f3711bac0d162e Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Mon, 17 Jul 2023 12:50:02 -1000
Subject: [PATCH] workqueue: Scale up wq_cpu_intensive_thresh_us if BogoMIPS is
 below 4000

wq_cpu_intensive_thresh_us is used to detect CPU-hogging per-cpu work items.
Once detected, they're excluded from concurrency management to prevent them
from blocking other per-cpu work items. If CONFIG_WQ_CPU_INTENSIVE_REPORT is
enabled, repeat offenders are also reported so that the code can be updated.

The default threshold is 10ms which is long enough to do fair bit of work on
modern CPUs while short enough to be usually not noticeable. This
unfortunately leads to a lot of, arguable spurious, detections on very slow
CPUs. Using the same threshold across CPUs whose performance levels may be
apart by multiple levels of magnitude doesn't make whole lot of sense.

This patch scales up wq_cpu_intensive_thresh_us upto 1 second when BogoMIPS
is below 4000. This is obviously very inaccurate but it doesn't have to be
accurate to be useful. The mechanism is still useful when the threshold is
fully scaled up and the benefits of reports are usually shared with everyone
regardless of who's reporting, so as long as there are sufficient number of
fast machines reporting, we don't lose much.

Some (or is it all?) ARM CPUs systemtically report significantly lower
BogoMIPS. While this doesn't break anything, given how widespread ARM CPUs
are, it's at least a missed opportunity and it probably would be a good idea
to teach workqueue about it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
Hello,

Geert, applied this to wq/for-6.5-fixes. The local variable `mips` is
renamed to `bogo` to avoid collision.

Thanks for testing!

 kernel/workqueue.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 02a8f402eeb5..800b4208dba9 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -52,6 +52,7 @@
 #include <linux/sched/debug.h>
 #include <linux/nmi.h>
 #include <linux/kvm_para.h>
+#include <linux/delay.h>
 
 #include "workqueue_internal.h"
 
@@ -338,8 +339,10 @@ static cpumask_var_t *wq_numa_possible_cpumask;
  * Per-cpu work items which run for longer than the following threshold are
  * automatically considered CPU intensive and excluded from concurrency
  * management to prevent them from noticeably delaying other per-cpu work items.
+ * ULONG_MAX indicates that the user hasn't overridden it with a boot parameter.
+ * The actual value is initialized in wq_cpu_intensive_thresh_init().
  */
-static unsigned long wq_cpu_intensive_thresh_us = 10000;
+static unsigned long wq_cpu_intensive_thresh_us = ULONG_MAX;
 module_param_named(cpu_intensive_thresh_us, wq_cpu_intensive_thresh_us, ulong, 0644);
 
 static bool wq_disable_numa;
@@ -6513,6 +6516,42 @@ void __init workqueue_init_early(void)
 	       !system_freezable_power_efficient_wq);
 }
 
+static void __init wq_cpu_intensive_thresh_init(void)
+{
+	unsigned long thresh;
+	unsigned long bogo;
+
+	/* if the user set it to a specific value, keep it */
+	if (wq_cpu_intensive_thresh_us != ULONG_MAX)
+		return;
+
+	/*
+	 * The default of 10ms is derived from the fact that most modern (as of
+	 * 2023) processors can do a lot in 10ms and that it's just below what
+	 * most consider human-perceivable. However, the kernel also runs on a
+	 * lot slower CPUs including microcontrollers where the threshold is way
+	 * too low.
+	 *
+	 * Let's scale up the threshold upto 1 second if BogoMips is below 4000.
+	 * This is by no means accurate but it doesn't have to be. The mechanism
+	 * is still useful even when the threshold is fully scaled up. Also, as
+	 * the reports would usually be applicable to everyone, some machines
+	 * operating on longer thresholds won't significantly diminish their
+	 * usefulness.
+	 */
+	thresh = 10 * USEC_PER_MSEC;
+
+	/* see init/calibrate.c for lpj -> BogoMIPS calculation */
+	bogo = max_t(unsigned long, loops_per_jiffy / 500000 * HZ, 1);
+	if (bogo < 4000)
+		thresh = min_t(unsigned long, thresh * 4000 / bogo, USEC_PER_SEC);
+
+	pr_debug("wq_cpu_intensive_thresh: lpj=%lu BogoMIPS=%lu thresh_us=%lu\n",
+		 loops_per_jiffy, bogo, thresh);
+
+	wq_cpu_intensive_thresh_us = thresh;
+}
+
 /**
  * workqueue_init - bring workqueue subsystem fully online
  *
@@ -6528,6 +6567,8 @@ void __init workqueue_init(void)
 	struct worker_pool *pool;
 	int cpu, bkt;
 
+	wq_cpu_intensive_thresh_init();
+
 	/*
 	 * It'd be simpler to initialize NUMA in workqueue_init_early() but
 	 * CPU to node mapping may not be available that early on some
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-07-25 21:52 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20230511181931.869812-1-tj@kernel.org>
     [not found] ` <20230511181931.869812-7-tj@kernel.org>
     [not found]   ` <ZF6WsSVGX3O1d0pL@slm.duckdns.org>
2023-07-11 13:55     ` Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism) Geert Uytterhoeven
2023-07-11 14:06       ` Geert Uytterhoeven
2023-07-11 21:39         ` Tejun Heo
2023-07-12  0:30           ` Tejun Heo
2023-07-12  9:57             ` Geert Uytterhoeven
2023-07-17 23:03               ` Tejun Heo
2023-07-18  9:54                 ` Geert Uytterhoeven
2023-07-18 22:01                   ` Tejun Heo
2023-07-25 14:46                     ` Geert Uytterhoeven
2023-07-25 21:52                       ` [PATCH wq/for-6.5-fixes] workqueue: Drop the special locking rule for worker->flags and worker_pool->flags Tejun Heo
2023-07-12  8:05           ` Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism) Peter Zijlstra
2023-07-12  9:04             ` Geert Uytterhoeven
2023-07-12 12:27               ` Peter Zijlstra
2023-07-13 18:53                 ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).