From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Kautuk Consul <consul.kautuk@gmail.com>,
Ingo Molnar <mingo@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.cz>,
David Rientjes <rientjes@google.com>,
Ionut Alexa <ionut.m.alexa@gmail.com>,
Guillaume Morin <guillaume@morinfr.org>,
linux-kernel@vger.kernel.org, Kirill Tkhai <tkhai@yandex.ru>
Subject: Re: [PATCH 1/1] do_exit(): Solve possibility of BUG() due to race with try_to_wake_up()
Date: Mon, 1 Sep 2014 19:58:51 +0200
Message-ID: <20140901175851.GA15210@redhat.com>
In-Reply-To: <20140901153935.GQ27892@worktop.ger.corp.intel.com>
On 09/01, Peter Zijlstra wrote:
>
> On Mon, Aug 25, 2014 at 05:57:38PM +0200, Oleg Nesterov wrote:
> > Peter, do you remember another problem with TASK_DEAD we discussed recently?
> > (prev_state == TASK_DEAD detection in finish_task_switch() still looks racy).
>
> Uhm, right. That was somewhere on the todo list :-)
>
> > I am starting to think that perhaps we need something like below, what do
> > you all think?
>
> I'm thinking you lost the hunk that adds rq::dead :-), more comments
> below.
And "goto deactivate" should be moved down, after "switch_count"
initialization.
> > + if (unlikely(rq->dead))
> > + goto deactivate;
> > +
>
> Yeah, it would be best to not have to do that; ideally we would be able
> to maybe do both; set rq->dead and current->state == TASK_DEAD.
The point was to avoid spin_unlock_wait() in do_exit(). But on second
thought this can't work, please see below.
> > --- x/kernel/exit.c
> > +++ x/kernel/exit.c
> > @@ -815,25 +815,8 @@ void do_exit(long code)
> > __this_cpu_add(dirty_throttle_leaks, tsk->nr_dirtied);
> > exit_rcu();
> >
> > - /*
> > - * The setting of TASK_RUNNING by try_to_wake_up() may be delayed
> > - * when the following two conditions become true.
> > - * - There is race condition of mmap_sem (It is acquired by
> > - * exit_mm()), and
> > - * - SMI occurs before setting TASK_RUNINNG.
> > - * (or hypervisor of virtual machine switches to other guest)
> > - * As a result, we may become TASK_RUNNING after becoming TASK_DEAD
> > - *
> > - * To avoid it, we have to wait for releasing tsk->pi_lock which
> > - * is held by try_to_wake_up()
> > - */
> > - smp_mb();
> > - raw_spin_unlock_wait(&tsk->pi_lock);
> > -
> > - /* causes final put_task_struct in finish_task_switch(). */
> > - tsk->state = TASK_DEAD;
> > tsk->flags |= PF_NOFREEZE; /* tell freezer to ignore us */
> > - schedule();
> > + exit_schedule();
> > BUG();
> > /* Avoid "noreturn function does return". */
> > for (;;)
>
> Yes, something like this might work fine..
Not really :/ Yes, rq->dead (or just "bool prev_dead") should obviously
solve the problem with ttwu() after the last schedule(). But only in the
sense that the dying task won't be activated.
However, the very fact that another CPU can look at this task_struct
means that we still need spin_unlock_wait(). If nothing else, to ensure
that try_to_wake_up()->spin_unlock(pi_lock) won't write into the memory
we are going to free.
So I think the comment in do_exit() should be updated too, and smp_mb()
should be moved under raw_spin_unlock_wait() but ...
But. If I am right, doesn't this mean that we have even more problems
with postmortem wakeups??? Why can't ttwu() _start_ after
spin_unlock_wait()?
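To make the question concrete, here is a toy, single-threaded replay of
the orderings in question. It is only a sketch under stated assumptions:
struct toy_task, replay() and the three-step split of each side are made
up for illustration and are not kernel code, and the model assumes the
task_struct is freed right after the last schedule(), which is the worry
above rather than something verified here.

```c
#include <stdbool.h>

/* Toy model only: names mirror the thread, but this is not kernel code. */
struct toy_task {
	bool pi_locked;	/* stands in for tsk->pi_lock being held */
	bool dead;	/* stands in for tsk->state == TASK_DEAD */
	bool freed;	/* set when the model "frees" the task_struct */
};

/*
 * Replay one interleaving. Each 'W' advances the waker (a toy ttwu) by
 * one step: lock pi_lock, check the state, unlock pi_lock. Each 'E'
 * advances the exiting task by one step: spin_unlock_wait(), set
 * TASK_DEAD, last schedule() plus free (an assumption of this model).
 * Returns how many waker accesses hit the task after it was freed.
 */
static int replay(const char *order)
{
	struct toy_task t = { false, false, false };
	int w = 0, e = 0, bad = 0;

	for (const char *p = order; *p; p++) {
		if (*p == 'W' && w < 3) {
			if (t.freed)
				bad++;			/* access to freed memory */
			if (w == 0)
				t.pi_locked = true;	/* lock pi_lock */
			else if (w == 2)
				t.pi_locked = false;	/* unlock pi_lock */
			/* w == 1: state check, task is not sleeping, no wakeup */
			w++;
		} else if (*p == 'E' && e < 3) {
			if (e == 0 && t.pi_locked)
				continue;		/* lock still held, keep spinning */
			if (e == 1)
				t.dead = true;		/* tsk->state = TASK_DEAD */
			else if (e == 2)
				t.freed = true;		/* last schedule(), then free */
			e++;
		}
	}
	return bad;
}
```

In this model replay("WEEWWEEE") returns 0: the waker already holds
pi_lock, so the wait covers it. replay("EWEEWW") returns 2: the waker
starts only after the wait has passed, and both its state check and its
unlock land on freed memory.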
Oleg.