From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave@sr71.net>, Borislav Petkov <bp@alien8.de>,
LKML <linux-kernel@vger.kernel.org>,
Dave Jones <davej@redhat.com>,
dhillf@gmail.com, Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu
Date: Thu, 11 Apr 2013 16:18:48 +0530 [thread overview]
Message-ID: <51669510.2040200@linux.vnet.ibm.com> (raw)
In-Reply-To: <alpine.LFD.2.02.1304111020520.21884@ionos>
On 04/11/2013 03:49 PM, Thomas Gleixner wrote:
> Dave,
>
> On Wed, 10 Apr 2013, Dave Hansen wrote:
>
>> I think I got a full trace this time:
>>
>> http://sr71.net/~dave/linux/bigbox-trace.1365621899.txt.gz
>>
>> The last timestamp is pretty close to the timestamp on the console:
>>
>> [ 2071.033434] smpboot_thread_fn():
>> [ 2071.033455] smpboot_thread_fn() cpu: 22 159
>> [ 2071.033470] td->cpu: 22
>> [ 2071.033475] smp_processor_id(): 21
>> [ 2071.033486] comm: migration/%u
>
> Yes, that's helpful. Though it makes my mind boggle:
>
> migration/22-4335 [021] d.h. 2071.020530: sched_wakeup: comm=migration/21 pid=4323 prio=0 success=1 target_cpu=021^M
> migration/22-4335 [021] d... 2071.020541: sched_switch: prev_comm=migration/22 prev_pid=4335 prev_prio=0 prev_state=0x200 ==> next_comm=migration/21 next_pid=4323 next_prio=0^M
> migration/21-4323 [021] d... 2071.026110: sched_switch: prev_comm=migration/21 prev_pid=4323 prev_prio=0 prev_state=S ==> next_comm=migration/22 next_pid=4335 next_prio=0^M
> migration/22-4335 [021] .... 2071.033422: smpboot_thread_fn <-kthread^M
>
> So the migration thread schedules out with state TASK_PARKED and gets
> scheduled back in right away without a wakeup. Srivatsa was about
> right, that this might be related to the sched_set_stop_task() call,
> but the changelog led completely down the wrong road.
>
> So the issue is:
>
> CPU14 CPU21
> create_thread(for CPU22)
> park_thread()
> wait_for_completion() park_me()
> complete()
> sched_set_stop_task()
> schedule(TASK_PARKED)
>
> The sched_set_stop_task() call is issued while the task is on the
> runqueue of CPU21 and that confuses the hell out of the stop_task
> class on that cpu. So as we have to synchronize the state for the
> bind call (the issue observed by Borislav) we need to do the same
> before issuing sched_set_stop_task(). Delta patch below.
>
In that case, why not just apply this 2 line patch on mainline?
The patch I sent yesterday was more elaborate because I wrongly assumed
that kthread_bind() can cause a wakeup. But now, I feel the patch shown
below should work just fine too. Yeah, it binds the task during creation
as well as during unpark, but that should be ok (see below).
Somehow, I believe we can handle this issue without introducing that
whole TASK_PARKED thing..
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 691dc2e..74af4a8 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -308,6 +308,9 @@ struct task_struct *kthread_create_on_cpu(int (*threadfn)(void *data),
to_kthread(p)->cpu = cpu;
/* Park the thread to get it out of TASK_UNINTERRUPTIBLE state */
kthread_park(p);
+
+ wait_task_inactive(p, TASK_INTERRUPTIBLE);
+ __kthread_bind(p, cpu);
return p;
}
The reason why we can't get rid of the bind in the unpark code is because,
the threads are parked during CPU offline *after* calling CPU_DOWN_PREPARE.
And during CPU_DOWN_PREPARE, the scheduler removes the CPU from the cpu_active_mask.
So on any subsequent wakeup of these threads before they are parked, the scheduler
will force migrate them to some other CPU (but alas it wont print this event
because of the p->mm != NULL check in select_fallback_rq()). So during unpark
during the next online operation we need to bind it again. But that's fine, IMHO.
Regards,
Srivatsa S. Bhat
next prev parent reply other threads:[~2013-04-11 10:51 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-05 21:43 kernel BUG at kernel/smpboot.c:134! Dave Hansen
2013-04-06 7:12 ` Srivatsa S. Bhat
2013-04-06 8:31 ` Thomas Gleixner
2013-04-07 9:20 ` Thomas Gleixner
2013-04-07 9:50 ` Borislav Petkov
2013-04-08 9:24 ` Thomas Gleixner
2013-04-08 11:55 ` Borislav Petkov
2013-04-08 12:17 ` Thomas Gleixner
2013-04-09 14:38 ` [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu Thomas Gleixner
2013-04-09 15:55 ` Dave Hansen
2013-04-09 18:43 ` Thomas Gleixner
2013-04-09 19:30 ` Thomas Gleixner
2013-04-09 20:38 ` Dave Hansen
2013-04-09 20:54 ` Dave Hansen
2013-04-10 8:29 ` Thomas Gleixner
2013-04-10 10:51 ` Thomas Gleixner
2013-04-10 19:41 ` Dave Hansen
2013-04-11 10:19 ` Thomas Gleixner
2013-04-11 10:48 ` Srivatsa S. Bhat [this message]
2013-04-11 11:43 ` Srivatsa S. Bhat
2013-04-11 11:59 ` Srivatsa S. Bhat
2013-04-11 12:51 ` Thomas Gleixner
2013-04-11 12:54 ` Thomas Gleixner
2013-04-11 13:46 ` Thomas Gleixner
2013-04-11 18:07 ` Dave Hansen
2013-04-11 19:48 ` Thomas Gleixner
2013-04-10 14:03 ` [PATCH] CPU hotplug, smpboot: Fix crash in smpboot_thread_fn() Srivatsa S. Bhat
2013-04-11 8:10 ` Thomas Gleixner
2013-04-11 10:19 ` Srivatsa S. Bhat
2013-04-11 19:16 ` [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu Srivatsa S. Bhat
2013-04-11 20:47 ` Thomas Gleixner
2013-04-11 21:19 ` Srivatsa S. Bhat
2013-04-12 10:59 ` Thomas Gleixner
2013-04-12 11:26 ` Srivatsa S. Bhat
2013-04-15 19:49 ` Dave Hansen
2013-04-12 10:41 ` Peter Zijlstra
2013-04-12 12:32 ` [tip:core/urgent] " tip-bot for Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51669510.2040200@linux.vnet.ibm.com \
--to=srivatsa.bhat@linux.vnet.ibm.com \
--cc=bp@alien8.de \
--cc=dave@sr71.net \
--cc=davej@redhat.com \
--cc=dhillf@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).