From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757018Ab2AXRZf (ORCPT ); Tue, 24 Jan 2012 12:25:35 -0500 Received: from fgwmail5.fujitsu.co.jp ([192.51.44.35]:59106 "EHLO fgwmail5.fujitsu.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756999Ab2AXRZd (ORCPT ); Tue, 24 Jan 2012 12:25:33 -0500 Message-ID: <4F1EE992.5000901@jp.fujitsu.com> Date: Tue, 24 Jan 2012 12:25:38 -0500 From: KOSAKI Motohiro User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:9.0) Gecko/20111222 Thunderbird/9.0.1 MIME-Version: 1.0 To: peterz@infradead.org CC: oleg@redhat.com, mingo@elte.hu, y-goto@jp.fujitsu.com, tglx@linutronix.de, kamezawa.hiroyu@jp.fujitsu.com, linux-kernel@vger.kernel.org Subject: Re: [BUG] TASK_DEAD task is able to be woken up in special condition References: <20120116205140.6120.E1E9C6FF@jp.fujitsu.com> <1326721082.2442.234.camel@twins> <20120117174031.3118.E1E9C6FF@jp.fujitsu.com> <20120117090605.GD7612@elte.hu> <20120117151242.GA13290@redhat.com> <20120118094219.GE5842@elte.hu> <20120118142005.GB10105@redhat.com> <1327400349.2614.10.camel@laptop> <1327402527.2614.17.camel@laptop> In-Reply-To: <1327402527.2614.17.camel@laptop> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 1/24/2012 5:55 AM, Peter Zijlstra wrote: > On Tue, 2012-01-24 at 11:19 +0100, Peter Zijlstra wrote: >> On Wed, 2012-01-18 at 15:20 +0100, Oleg Nesterov wrote: >>> do_exit() is different because it can not handle the spurious wakeup. >>> Well, may be we can? we can simply do >>> >>> for (;;) { >>> tsk->state = TASK_DEAD; >>> schedule(); >>> } >>> >>> __schedule() can't race with ttwu() once it takes rq->lock. If the >>> exiting task is deactivated, finish_task_switch() will see EXIT_DEAD. >> >> TASK_DEAD, right? >> >>> Unless I missed something, the only problem is preempt_disable(), >>> but schedule_debug() checks ->exit_state. >>> >>> OTOH, if we fix this race then probably schedule_debug() should >>> check state == EXIT_DEAD instead. >> >> Hmm, interesting. On the up side that removes the need for that inf loop >> after BUG, down side is of course that we loose the BUG itself too. Now >> I'm not too sure we actually care about that, a task spinning at 100% in >> x state should be fairly obvious borkage and its not like we hit this >> thing very often. > > Something like so, right? schedule_debug() already tests > prev->exit_state so it should DTRT afaict. > > Also, while going over this again, I think Yasunori-San's patch isn't > sufficient, note how the p->state = TASK_RUNNING in ttwu_do_wakeup() can > happen outside of p->pi_lock when the task gets queued on a remote cpu. > > --- > kernel/exit.c | 17 +++++++++++------ > 1 files changed, 11 insertions(+), 6 deletions(-) > > diff --git a/kernel/exit.c b/kernel/exit.c > index 294b170..ccd4f84 100644 > --- a/kernel/exit.c > +++ b/kernel/exit.c > @@ -1039,13 +1039,18 @@ void do_exit(long code) > __this_cpu_add(dirty_throttle_leaks, tsk->nr_dirtied); > exit_rcu(); > /* causes final put_task_struct in finish_task_switch(). */ > - tsk->state = TASK_DEAD; > tsk->flags |= PF_NOFREEZE; /* tell freezer to ignore us */ > - schedule(); > - BUG(); > - /* Avoid "noreturn function does return". */ > - for (;;) > - cpu_relax(); /* For when BUG is null */ > + for (;;) { > + /* > + * A spurious wakeup, eg. generated by rwsem when down()'s call > + * to schedule() doesn't happen but the wakeup from the > + * previous owner's up() did, can stomp on our ->state. > + * > + * This loop also avoids "noreturn functions does return" > + */ > + tsk->state = TASK_DEAD; > + schedule(); > + } > } This looks ok to me.