From: Lai Jiangshan <jiangshanlai@gmail.com>
To: Valentin Schneider <vschneid@redhat.com>
Cc: LKML <linux-kernel@vger.kernel.org>, Tejun Heo <tj@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Frederic Weisbecker <frederic@kernel.org>,
Juri Lelli <juri.lelli@redhat.com>, Phil Auld <pauld@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: [RFC PATCH v2 1/2] workqueue: Unbind workers before sending them to exit()
Date: Thu, 28 Jul 2022 01:13:37 +0800 [thread overview]
Message-ID: <CAJhGHyCeraX1jcea9kt_FBC561zBgECuw5qx8TAdCG0EHnT6kA@mail.gmail.com> (raw)
In-Reply-To: <20220727115327.2273547-2-vschneid@redhat.com>
Quick review before going to sleep.
On Wed, Jul 27, 2022 at 7:54 PM Valentin Schneider <vschneid@redhat.com> wrote:
>
> It has been reported that isolated CPUs can suffer from interference due to
> per-CPU kworkers waking up just to die.
>
> A surge of workqueue activity during initial setup of a latency-sensitive
> application (refresh_vm_stats() being one of the culprits) can cause extra
> per-CPU kworkers to be spawned. Then, said latency-sensitive task can be
> running merrily on an isolated CPU only to be interrupted sometime later by
> a kworker marked for death (cf. IDLE_WORKER_TIMEOUT, 5 minutes after last
> kworker activity).
>
> Prevent this by affining kworkers to the wq_unbound_cpumask (which doesn't
> contain isolated CPUs, cf. HK_TYPE_WQ) before waking them up after marking
> them with WORKER_DIE.
>
> Changing the affinity does require a sleepable context, so get rid of the
> pool->idle_timer and use a delayed_work instead.
>
> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> ---
> kernel/workqueue.c | 109 +++++++++++++++++++++++++++++++++------------
> 1 file changed, 81 insertions(+), 28 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 1ea50f6be843..27642166dcc5 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -167,9 +167,9 @@ struct worker_pool {
> int nr_workers; /* L: total number of workers */
> int nr_idle; /* L: currently idle workers */
>
> - struct list_head idle_list; /* L: list of idle workers */
> - struct timer_list idle_timer; /* L: worker idle timeout */
> - struct timer_list mayday_timer; /* L: SOS timer for workers */
> + struct list_head idle_list; /* L: list of idle workers */
> + struct delayed_work idle_reaper_work; /* L: worker idle timeout */
> + struct timer_list mayday_timer; /* L: SOS timer for workers */
>
> /* a workers is either on busy_hash or idle_list, or the manager */
> DECLARE_HASHTABLE(busy_hash, BUSY_WORKER_HASH_ORDER);
> @@ -1806,8 +1806,10 @@ static void worker_enter_idle(struct worker *worker)
> /* idle_list is LIFO */
> list_add(&worker->entry, &pool->idle_list);
>
> - if (too_many_workers(pool) && !timer_pending(&pool->idle_timer))
> - mod_timer(&pool->idle_timer, jiffies + IDLE_WORKER_TIMEOUT);
> + if (too_many_workers(pool) && !delayed_work_pending(&pool->idle_reaper_work))
> + mod_delayed_work(system_unbound_wq,
> + &pool->idle_reaper_work,
> + IDLE_WORKER_TIMEOUT);
system_unbound_wq doesn't have a rescuer.
A new workqueue with a rescuer needs to be created and used for
this purpose.
>
> /* Sanity check nr_running. */
> WARN_ON_ONCE(pool->nr_workers == pool->nr_idle && pool->nr_running);
> @@ -1972,9 +1974,29 @@ static struct worker *create_worker(struct worker_pool *pool)
> return NULL;
> }
>
> +static void unbind_worker(struct worker *worker)
> +{
> + kthread_set_per_cpu(worker->task, -1);
> + WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, wq_unbound_cpumask) < 0);
> +}
> +
> +static void rebind_worker(struct worker *worker, struct worker_pool *pool)
> +{
> + kthread_set_per_cpu(worker->task, pool->cpu);
> + WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask) < 0);
> +}
> +
> +static void reap_worker(struct worker *worker)
> +{
> + list_del_init(&worker->entry);
> + unbind_worker(worker);
> + wake_up_process(worker->task);
Since WORKER_DIE is set, the worker can be possible freed now
if there is another source to wake it up.
I think reverting a part of the commit 60f5a4bcf852("workqueue:
async worker destruction") to make use of kthread_stop()
in destroy_worker() should be a good idea.
next prev parent reply other threads:[~2022-07-27 18:13 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-27 11:53 [RFC PATCH v2 0/2] workqueue: destroy_worker() vs isolated CPUs Valentin Schneider
2022-07-27 11:53 ` [RFC PATCH v2 1/2] workqueue: Unbind workers before sending them to exit() Valentin Schneider
2022-07-27 17:13 ` Lai Jiangshan [this message]
2022-07-28 10:54 ` Valentin Schneider
2022-07-28 16:35 ` Tejun Heo
2022-07-28 17:24 ` Valentin Schneider
2022-07-28 17:31 ` Tejun Heo
2022-07-29 10:12 ` Valentin Schneider
2022-07-27 11:53 ` [RFC PATCH v2 2/2] DEBUG: workqueue: kworker spawner Valentin Schneider
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAJhGHyCeraX1jcea9kt_FBC561zBgECuw5qx8TAdCG0EHnT6kA@mail.gmail.com \
--to=jiangshanlai@gmail.com \
--cc=frederic@kernel.org \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mtosatti@redhat.com \
--cc=pauld@redhat.com \
--cc=peterz@infradead.org \
--cc=tj@kernel.org \
--cc=vschneid@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.