* [PATCH v2] workqueue: Control intensive warning threshold through cmdline
@ 2024-02-22  7:28 Xuewen Yan
From: Xuewen Yan @ 2024-02-22  7:28 UTC
  To: corbet, tj
  Cc: jiangshanlai, paulmck, rdunlap, peterz, yanjiewtw, ke.wang,
	di.shen, xuewen.yan94, linux-doc, linux-kernel

When CONFIG_WQ_CPU_INTENSIVE_REPORT is set, the kernel reports work
functions that repeatedly violate intensive_threshold_us. Currently,
the warning is printed only when the violation count is at least 4
and is a power of 2.

However, even a single long-running work item can delay other work
for a long time, which may itself cause problems.

To let the user control when warnings start, add a boot argument that
sets the warning threshold. Keep the exponential backoff to prevent
excessive reporting.

By default, the warning threshold is 4.

Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
changes of v2:
 - Update descriptions and fix some syntax errors in the documentation.
 - Use the threshold to limit the warning and keep the exponential backoff.
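
For illustration only, the reporting condition used in this version can be
sketched as a small userspace C fragment. The names is_pow2() and
should_report() are hypothetical stand-ins (the kernel code uses
is_power_of_2() and tests the condition inline in wq_cpu_intensive_report());
this is a sketch of the arithmetic, not the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's is_power_of_2() helper (assumption). */
static bool is_pow2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper mirroring the condition in the patch.
 * cnt:    violations seen so far for one work function (already incremented)
 * thresh: value of cpu_intensive_warning_thresh; 0 disables the warning
 *
 * The short-circuit on cnt >= thresh guarantees that cnt + 1 - thresh
 * cannot underflow.
 */
static bool should_report(uint64_t cnt, unsigned int thresh)
{
	return thresh && cnt >= thresh && is_pow2(cnt + 1 - thresh);
}
```

Under these assumptions, a threshold of 4 reports at counts 4, 5, 7, 11,
19, and so on, while a threshold of 1 reports at 1, 2, 4, 8, and so on.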
---
 Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++++
 kernel/workqueue.c                              | 14 +++++++++++---
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 31b3a25680d0..cde809b22eba 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -7225,6 +7225,17 @@
 			threshold repeatedly. They are likely good
 			candidates for using WQ_UNBOUND workqueues instead.
 
+	workqueue.cpu_intensive_warning_thresh=<uint>
+			If CONFIG_WQ_CPU_INTENSIVE_REPORT is set, the kernel
+			will report the work functions which violate the
+			intensive_threshold_us repeatedly. In order to prevent
+			the kernel log from being printed too frequently,
+			control the frequency and the threshold.
+
+			By Default, the threshold is 4 times, and the warning
+			is limited by powers of 2. On the other hand, 0 will
+			disable the warning.
+
 	workqueue.power_efficient
 			Per-cpu workqueues are generally preferred because
 			they show better performance thanks to cache
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7b482a26d741..606ba8bf5271 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -359,6 +359,10 @@ static const char *wq_affn_names[WQ_AFFN_NR_TYPES] = {
  */
 static unsigned long wq_cpu_intensive_thresh_us = ULONG_MAX;
 module_param_named(cpu_intensive_thresh_us, wq_cpu_intensive_thresh_us, ulong, 0644);
+#ifdef CONFIG_WQ_CPU_INTENSIVE_REPORT
+static unsigned int wq_cpu_intensive_warning_thresh = 4;
+module_param_named(cpu_intensive_warning_thresh, wq_cpu_intensive_warning_thresh, uint, 0644);
+#endif
 
 /* see the comment above the definition of WQ_POWER_EFFICIENT */
 static bool wq_power_efficient = IS_ENABLED(CONFIG_WQ_POWER_EFFICIENT_DEFAULT);
@@ -1198,11 +1202,13 @@ static void wq_cpu_intensive_report(work_func_t func)
 		u64 cnt;
 
 		/*
-		 * Start reporting from the fourth time and back off
+		 * Start reporting from the warning_thresh and back off
 		 * exponentially.
 		 */
 		cnt = atomic64_inc_return_relaxed(&ent->cnt);
-		if (cnt >= 4 && is_power_of_2(cnt))
+		if (wq_cpu_intensive_warning_thresh &&
+		    cnt >= wq_cpu_intensive_warning_thresh &&
+		    is_power_of_2(cnt + 1 - wq_cpu_intensive_warning_thresh))
 			printk_deferred(KERN_WARNING "workqueue: %ps hogged CPU for >%luus %llu times, consider switching to WQ_UNBOUND\n",
 					ent->func, wq_cpu_intensive_thresh_us,
 					atomic64_read(&ent->cnt));
@@ -1231,10 +1237,12 @@ static void wq_cpu_intensive_report(work_func_t func)
 
 	ent = &wci_ents[wci_nr_ents++];
 	ent->func = func;
-	atomic64_set(&ent->cnt, 1);
+	atomic64_set(&ent->cnt, 0);
 	hash_add_rcu(wci_hash, &ent->hash_node, (unsigned long)func);
 
 	raw_spin_unlock(&wci_lock);
+
+	goto restart;
 }
 
 #else	/* CONFIG_WQ_CPU_INTENSIVE_REPORT */
-- 
2.25.1



* Re: [PATCH v2] workqueue: Control intensive warning threshold through cmdline
From: Tejun Heo @ 2024-02-22 17:52 UTC
  To: Xuewen Yan
  Cc: corbet, jiangshanlai, paulmck, rdunlap, peterz, yanjiewtw,
	ke.wang, di.shen, xuewen.yan94, linux-doc, linux-kernel

On Thu, Feb 22, 2024 at 03:28:08PM +0800, Xuewen Yan wrote:
> When CONFIG_WQ_CPU_INTENSIVE_REPORT is set, the kernel reports work
> functions that repeatedly violate intensive_threshold_us. Currently,
> the warning is printed only when the violation count is at least 4
> and is a power of 2.
> 
> However, even a single long-running work item can delay other work
> for a long time, which may itself cause problems.
> 
> To let the user control when warnings start, add a boot argument that
> sets the warning threshold. Keep the exponential backoff to prevent
> excessive reporting.
> 
> By default, the warning threshold is 4.
> 
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>

Applied to wq/for-6.9 with the following edit:

> +	workqueue.cpu_intensive_warning_thresh=<uint>
> +			If CONFIG_WQ_CPU_INTENSIVE_REPORT is set, the kernel
> +			will report the work functions which violate the
> +			intensive_threshold_us repeatedly. In order to prevent
> +			the kernel log from being printed too frequently,
> +			control the frequency and the threshold.
> +
> +			By Default, the threshold is 4 times, and the warning
> +			is limited by powers of 2. On the other hand, 0 will
> +			disable the warning.

I changed this to:

			If CONFIG_WQ_CPU_INTENSIVE_REPORT is set, the kernel
			will report the work functions which violate the
			intensive_threshold_us repeatedly. In order to prevent
			spurious warnings, start printing only after a work
			function has violated this threshold number of times.

			The default is 4 times. 0 disables the warning.

Thanks.

-- 
tejun
