All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lai Jiangshan <jiangshanlai@gmail.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Tejun Heo <tj@kernel.org>, Michal Hocko <mhocko@suse.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] workqueue: Make create_worker() safe against spurious wakeups
Date: Thu, 4 Aug 2022 16:57:01 +0800	[thread overview]
Message-ID: <CAJhGHyDHVFto=UNrVQ5bjc11yGCnNmLKAkPPimtwF-Qa+1g==Q@mail.gmail.com> (raw)
In-Reply-To: <20220622140853.31383-1-pmladek@suse.com>

On Wed, Jun 22, 2022 at 10:10 PM Petr Mladek <pmladek@suse.com> wrote:
>
> A system crashed with the following BUG() report:
>
>   [115147.050484] BUG: kernel NULL pointer dereference, address: 0000000000000000
>   [115147.050488] #PF: supervisor write access in kernel mode
>   [115147.050489] #PF: error_code(0x0002) - not-present page
>   [115147.050490] PGD 0 P4D 0
>   [115147.050494] Oops: 0002 [#1] PREEMPT_RT SMP NOPTI
>   [115147.050498] CPU: 1 PID: 16213 Comm: kthreadd Kdump: loaded Tainted: G           O   X    5.3.18-2-rt #1 SLE15-SP2 (unreleased)
>   [115147.050510] RIP: 0010:_raw_spin_lock_irq+0x14/0x30
>   [115147.050513] Code: 89 c6 e8 5f 7a 9b ff 66 90 c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 fa 65 ff 05 fb 53 6c 55 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 01 c3 89 c6 e8 2e 7a 9b ff 66 90 c3 90 90 90 90 90
>   [115147.050514] RSP: 0018:ffffb0f68822fed8 EFLAGS: 00010046
>   [115147.050515] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
>   [115147.050516] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000000
>   [115147.050517] RBP: ffff9ca73af40a40 R08: 0000000000000001 R09: 0000000000027340
>   [115147.050519] R10: ffffb0f68822fe70 R11: 00000000000000a9 R12: ffffb0f688067dc0
>   [115147.050520] R13: ffff9ca77e9a8000 R14: ffff9ca7634ca780 R15: ffff9ca7634ca780
>   [115147.050521] FS:  0000000000000000(0000) GS:ffff9ca77fb00000(0000) knlGS:0000000000000000
>   [115147.050523] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   [115147.050524] CR2: 00000000000000b8 CR3: 000000004472e000 CR4: 00000000003406e0
>   [115147.050524] Call Trace:
>   [115147.050533]  worker_thread+0xb4/0x3c0
>   [115147.050538]  ? process_one_work+0x4a0/0x4a0
>   [115147.050540]  kthread+0x152/0x170
>   [115147.050542]  ? kthread_park+0xa0/0xa0
>   [115147.050544]  ret_from_fork+0x35/0x40
>
> Further debugging shown that the worker thread was woken
> before worker_attach_to_pool() finished in create_worker().
>
> Any kthread is supposed to stay in TASK_UNINTERRUPTIBLE sleep
> until it is explicitly woken. But a spurious wakeup might
> break this expectation.
>
> As a result, worker_thread() might read worker->pool before
> it was set in worker create_worker() by worker_attach_to_pool().
> Also manage_workers() might want to create yet another worker
> before worker->pool->nr_workers is updated. It is a kind off
> a chicken & egg problem.
>
> Synchronize these operations using a completion API.
>
> Note that worker->pool might be then read without wq_pool_attach_mutex.
> Normal worker always belongs to the same pool.
>
> Also note that rescuer_thread() does not need this because all
> needed values are set before the kthreads is created. It is
> connected with a particular workqueue. It is attached to different
> pools when needed.
>
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> ---
>  kernel/workqueue.c          | 13 ++++++++++++-
>  kernel/workqueue_internal.h |  1 +
>  2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 1ea50f6be843..9586b0797145 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1939,6 +1939,7 @@ static struct worker *create_worker(struct worker_pool *pool)
>                 goto fail;
>
>         worker->id = id;
> +       init_completion(&worker->ready_to_start);
>
>         if (pool->cpu >= 0)
>                 snprintf(id_buf, sizeof(id_buf), "%d:%d%s", pool->cpu, id,
> @@ -1961,6 +1962,7 @@ static struct worker *create_worker(struct worker_pool *pool)
>         raw_spin_lock_irq(&pool->lock);
>         worker->pool->nr_workers++;
>         worker_enter_idle(worker);
> +       complete(&worker->ready_to_start);
>         wake_up_process(worker->task);
>         raw_spin_unlock_irq(&pool->lock);
>
> @@ -2378,10 +2380,19 @@ static void set_pf_worker(bool val)
>  static int worker_thread(void *__worker)
>  {
>         struct worker *worker = __worker;
> -       struct worker_pool *pool = worker->pool;
> +       struct worker_pool *pool;
>
>         /* tell the scheduler that this is a workqueue worker */
>         set_pf_worker(true);
> +
> +       /*
> +        * Wait until the worker is ready to get started. It must be attached
> +        * to a pool first. This is needed because of spurious wakeups.
> +        */
> +       while (wait_for_completion_interruptible(&worker->ready_to_start))
> +               ;

There might be some wakeups from wake_up_worker() since it is in
the idle list.  These wakeups will be "spurious wakeups" in the view
of the completion subsystem.

> +       pool = worker->pool;
> +
>  woke_up:
>         raw_spin_lock_irq(&pool->lock);
>
> diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h
> index e00b1204a8e9..ffca2783c8bf 100644
> --- a/kernel/workqueue_internal.h
> +++ b/kernel/workqueue_internal.h
> @@ -41,6 +41,7 @@ struct worker {
>                                                 /* L: for rescuers */
>         struct list_head        node;           /* A: anchored at pool->workers */
>                                                 /* A: runs through worker->node */
> +       struct completion       ready_to_start; /* None: handle spurious wakeup */
>
>         unsigned long           last_active;    /* L: last active timestamp */
>         unsigned int            flags;          /* X: flags */
> --
> 2.35.3
>

  parent reply	other threads:[~2022-08-04  8:57 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-22 14:08 [PATCH] workqueue: Make create_worker() safe against spurious wakeups Petr Mladek
2022-06-23  7:00 ` Petr Mladek
2022-06-23  7:14   ` Michal Hocko
2022-06-26  4:19   ` Hillf Danton
2022-06-25  5:00 ` re. Spurious wakeup on a newly created kthread Tejun Heo
2022-06-25 17:01   ` Linus Torvalds
2022-06-25 17:36     ` Eric W. Biederman
2022-06-25 18:25       ` Linus Torvalds
2022-06-25 18:43         ` Linus Torvalds
2022-06-25 23:28           ` Eric W. Biederman
2022-06-25 23:41             ` Eric W. Biederman
2022-06-25 23:43             ` Linus Torvalds
2022-06-25 23:48               ` Linus Torvalds
2022-06-26  0:19                 ` Eric W. Biederman
2022-06-27  0:01                   ` Wedson Almeida Filho
2022-06-27  7:11                     ` Peter Zijlstra
2022-06-27 18:23                       ` Wedson Almeida Filho
2022-06-27 18:45                         ` Linus Torvalds
2022-06-26 19:14                 ` [PATCH 0/3] kthread: Stop using TASK_UNINTERRUPTIBLE Eric W. Biederman
2022-06-26 19:15                   ` [PATCH 1/3] kthread: Remove the flags argument from kernel_thread Eric W. Biederman
2022-06-26 21:20                     ` Linus Torvalds
2022-06-26 19:16                   ` [PATCH 2/3] kthread: Replace kernel_thread with new_kthread Eric W. Biederman
2022-06-26 19:16                   ` [PATCH 3/3] kthread: Stop abusing TASK_UNINTERRUPTIBLE (INCOMPLETE) Eric W. Biederman
2022-06-26 19:59                     ` Linus Torvalds
2022-06-26 20:23                       ` Tejun Heo
2022-06-26 20:55                         ` Linus Torvalds
2022-06-27  7:22                         ` Peter Zijlstra
2022-06-27  8:11                           ` Tejun Heo
2022-06-27 18:04                             ` Wedson Almeida Filho
2022-06-27 22:06                               ` Peter Zijlstra
2022-06-27 22:34                                 ` Linus Torvalds
2022-06-27 22:45                                 ` Wedson Almeida Filho
2022-06-28  0:32                                 ` Wedson Almeida Filho
2022-06-28  7:58                                   ` Peter Zijlstra
2022-06-30  0:57                                     ` Wedson Almeida Filho
2022-06-26 22:14                     ` kernel test robot
2022-06-26 22:34                     ` kernel test robot
2022-06-26  0:21               ` re. Spurious wakeup on a newly created kthread Eric W. Biederman
2022-06-28 14:16           ` Christian Brauner
2022-06-26  0:26         ` Eric W. Biederman
2022-06-26  1:58     ` Tejun Heo
2022-06-26  2:53       ` Linus Torvalds
2022-06-26  6:09         ` Tejun Heo
2022-06-27 12:04         ` Michal Hocko
2022-06-28  9:51     ` Petr Mladek
2022-06-28 10:07       ` Tejun Heo
2022-06-27  8:07   ` Michal Hocko
2022-06-27  8:21     ` Tejun Heo
2022-06-27 10:18       ` Michal Hocko
2022-06-28 15:08     ` Petr Mladek
2022-08-04  8:57 ` Lai Jiangshan [this message]
2022-08-04 10:19   ` [PATCH] workqueue: Make create_worker() safe against spurious wakeups Lai Jiangshan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJhGHyDHVFto=UNrVQ5bjc11yGCnNmLKAkPPimtwF-Qa+1g==Q@mail.gmail.com' \
    --to=jiangshanlai@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=pmladek@suse.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.