From: Peter Zijlstra <peterz@infradead.org>
To: mingo@kernel.org, oleg@redhat.com, gkohli@codeaurora.org
Cc: tglx@linutronix.de, mpe@ellerman.id.au, bigeasy@linutronix.de,
linux-kernel@vger.kernel.org, will.deacon@arm.com,
peterz@infradead.org
Subject: [PATCH v2 1/4] kthread, sched: Fix kthread_parkme() (again...)
Date: Tue, 12 Jun 2018 18:12:26 +0200 [thread overview]
Message-ID: <20180612161832.373040414@infradead.org> (raw)
In-Reply-To: 20180612161225.693133662@infradead.org
[-- Attachment #1: peterz-kthread-0.patch --]
[-- Type: text/plain, Size: 6135 bytes --]
Gaurav reports that commit:
85f1abe0019f ("kthread, sched/wait: Fix kthread_parkme() completion issue")
isn't working for him. Because of the following race:
> controller Thread CPUHP Thread
> takedown_cpu
> kthread_park
> kthread_parkme
> Set KTHREAD_SHOULD_PARK
> smpboot_thread_fn
> set Task interruptible
>
>
> wake_up_process
> if (!(p->state & state))
> goto out;
>
> Kthread_parkme
> SET TASK_PARKED
> schedule
> raw_spin_lock(&rq->lock)
> ttwu_remote
> waiting for __task_rq_lock
> context_switch
>
> finish_lock_switch
>
>
>
> Case TASK_PARKED
> kthread_park_complete
>
>
> SET Running
Furthermore, Oleg noticed that the whole scheduler TASK_PARKED
handling is buggered because the TASK_DEAD thing is done with
preemption disabled, the current code can still complete early on
preemption :/
So basically revert that earlier fix and go with a variant of the
alternative mentioned in the commit. Promote TASK_PARKED to special
state to avoid the store-store issue on task->state leading to the
WARN in kthread_unpark() -> __kthread_bind().
But in addition, add wait_task_inactive() to kthread_park() to ensure
the task really is PARKED when we return from kthread_park(). This
avoids the whole kthread still gets migrated nonsense -- although it
would be really good to get this done differently.
Cc: Oleg Nesterov <oleg@redhat.com>
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Fixes: 85f1abe0019f ("kthread, sched/wait: Fix kthread_parkme() completion issue")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/kthread.h | 1 -
include/linux/sched.h | 2 +-
kernel/kthread.c | 30 ++++++++++++++++++++++++------
kernel/sched/core.c | 31 +++++++++++--------------------
4 files changed, 36 insertions(+), 28 deletions(-)
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -62,7 +62,6 @@ void *kthread_probe_data(struct task_str
int kthread_park(struct task_struct *k);
void kthread_unpark(struct task_struct *k);
void kthread_parkme(void);
-void kthread_park_complete(struct task_struct *k);
int kthreadd(void *unused);
extern struct task_struct *kthreadd_task;
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -117,7 +117,7 @@ struct task_group;
* the comment with set_special_state().
*/
#define is_special_task_state(state) \
- ((state) & (__TASK_STOPPED | __TASK_TRACED | TASK_DEAD))
+ ((state) & (__TASK_STOPPED | __TASK_TRACED | TASK_PARKED | TASK_DEAD))
#define __set_current_state(state_value) \
do { \
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -177,9 +177,20 @@ void *kthread_probe_data(struct task_str
static void __kthread_parkme(struct kthread *self)
{
for (;;) {
- set_current_state(TASK_PARKED);
+ /*
+ * TASK_PARKED is a special state; we must serialize against
+ * possible pending wakeups to avoid store-store collisions on
+ * task->state.
+ *
+ * Such a collision might possibly result in the task state
+ * changin from TASK_PARKED and us failing the
+ * wait_task_inactive() in kthread_park().
+ */
+ set_special_state(TASK_PARKED);
if (!test_bit(KTHREAD_SHOULD_PARK, &self->flags))
break;
+
+ complete_all(&self->parked);
schedule();
}
__set_current_state(TASK_RUNNING);
@@ -191,11 +202,6 @@ void kthread_parkme(void)
}
EXPORT_SYMBOL_GPL(kthread_parkme);
-void kthread_park_complete(struct task_struct *k)
-{
- complete_all(&to_kthread(k)->parked);
-}
-
static int kthread(void *_create)
{
/* Copy data: it's on kthread's stack */
@@ -461,6 +467,9 @@ void kthread_unpark(struct task_struct *
reinit_completion(&kthread->parked);
clear_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
+ /*
+ * __kthread_parkme() will either see !SHOULD_PARK or get the wakeup.
+ */
wake_up_state(k, TASK_PARKED);
}
EXPORT_SYMBOL_GPL(kthread_unpark);
@@ -487,7 +496,16 @@ int kthread_park(struct task_struct *k)
set_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
if (k != current) {
wake_up_process(k);
+ /*
+ * Wait for __kthread_parkme() to complete(), this means we
+ * _will_ have TASK_PARKED and are about to call schedule().
+ */
wait_for_completion(&kthread->parked);
+ /*
+ * Now wait for that schedule() to complete and the task to
+ * get scheduled out.
+ */
+ WARN_ON_ONCE(!wait_task_inactive(k, TASK_PARKED));
}
return 0;
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7,7 +7,6 @@
*/
#include "sched.h"
-#include <linux/kthread.h>
#include <linux/nospec.h>
#include <asm/switch_to.h>
@@ -2701,28 +2700,20 @@ static struct rq *finish_task_switch(str
membarrier_mm_sync_core_before_usermode(mm);
mmdrop(mm);
}
- if (unlikely(prev_state & (TASK_DEAD|TASK_PARKED))) {
- switch (prev_state) {
- case TASK_DEAD:
- if (prev->sched_class->task_dead)
- prev->sched_class->task_dead(prev);
-
- /*
- * Remove function-return probe instances associated with this
- * task and put them back on the free list.
- */
- kprobe_flush_task(prev);
+ if (unlikely(prev_state == TASK_DEAD)) {
+ if (prev->sched_class->task_dead)
+ prev->sched_class->task_dead(prev);
- /* Task is done with its stack. */
- put_task_stack(prev);
+ /*
+ * Remove function-return probe instances associated with this
+ * task and put them back on the free list.
+ */
+ kprobe_flush_task(prev);
- put_task_struct(prev);
- break;
+ /* Task is done with its stack. */
+ put_task_stack(prev);
- case TASK_PARKED:
- kthread_park_complete(prev);
- break;
- }
+ put_task_struct(prev);
}
tick_nohz_task_switch();
next prev parent reply other threads:[~2018-06-12 16:30 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-12 16:12 [PATCH v2 0/4] kthread/smpboot: More fixes Peter Zijlstra
2018-06-12 16:12 ` Peter Zijlstra [this message]
2018-06-12 16:12 ` [PATCH v2 2/4] watchdog/softlockup: Replace "watchdog/%u" threads with cpu_stop_work Peter Zijlstra
2018-06-13 5:08 ` [lkp-robot] [watchdog/softlockup] 4808e7a5dc: BUG:KASAN:null-ptr-deref_in_h kernel test robot
2018-06-13 8:27 ` Peter Zijlstra
2018-06-12 16:12 ` [PATCH v2 3/4] smpboot: Remove cpumask from the API Peter Zijlstra
2018-06-12 16:12 ` [PATCH v2 4/4] kthread: Simplify kthread_park() completion Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180612161832.373040414@infradead.org \
--to=peterz@infradead.org \
--cc=bigeasy@linutronix.de \
--cc=gkohli@codeaurora.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=mpe@ellerman.id.au \
--cc=oleg@redhat.com \
--cc=tglx@linutronix.de \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).