linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Kirill Tkhai <ktkhai@parallels.com>,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Vladimir Davydov <vdavydov@parallels.com>,
	Kirill Tkhai <tkhai@yandex.ru>, Christoph Lameter <cl@linux.com>
Subject: Re: [PATCH 3/3] introduce task_rcu_dereference()
Date: Wed, 18 May 2016 20:23:18 +0200	[thread overview]
Message-ID: <20160518182318.GA15661@redhat.com> (raw)
In-Reply-To: <20160518170218.GY3192@twins.programming.kicks-ass.net>

On 05/18, Peter Zijlstra wrote:
>
> So I keep coming back to this, and every time I look at it my brain
> explodes.

Heh ;) I forgot about this completely.

> On Mon, Oct 27, 2014 at 08:54:46PM +0100, Oleg Nesterov wrote:
> > +struct task_struct *task_rcu_dereference(struct task_struct **ptask)
> > +{
> > +	struct task_struct *task;
> > +	struct sighand_struct *sighand;
>
> I think that needs: ' = NULL', because if
>
> > +
> > +	/*
> > +	 * We need to verify that release_task() was not called and thus
> > +	 * delayed_put_task_struct() can't run and drop the last reference
> > +	 * before rcu_read_unlock(). We check task->sighand != NULL, but
> > +	 * we can read the already freed and reused memory.
> > +	 */
> > + retry:
> > +	task = rcu_dereference(*ptask);
> > +	if (!task)
> > +		return NULL;
> > +
> > +	probe_slab_address(&task->sighand, sighand);
>
> this fails because of DEBUG_PAGE_ALLOC, we might not write to sighand at
> all, and
>
> > +	/*
> > +	 * Pairs with atomic_dec_and_test(usage) in put_task_struct(task).
> > +	 * If this task was already freed we can not miss the preceding
> > +	 * update of this pointer.
> > +	 */
> > +	smp_rmb();
> > +	if (unlikely(task != ACCESS_ONCE(*ptask)))
> > +		goto retry;
> > +
> > +	/*
> > +	 * Either this is the same task and we can trust sighand != NULL, or
> > +	 * its memory was re-instantiated as another instance of task_struct.
> > +	 * In the latter case the new task can not go away until another rcu
> > +	 * gp pass, so the only problem is that sighand == NULL can be false
> > +	 * positive but we can pretend we got this NULL before it was freed.
> > +	 */
> > +	if (sighand)
>
> this will be inspecting random values on the stack.

Yes, but thi_s doesn't differ from the case when we inspect the random value
if this task_struct was freed, then reallocated as another thing, then freed
and reallocated as task_struct again.

Please note the comment about the false positive above, and another one below:

> > +	 * We could even eliminate the false positive mentioned above:
> > +	 *
> > +	 *	probe_slab_address(&task->sighand, sighand);
> > +	 *	if (sighand)
> > +	 *		goto retry;

this one.

IOW. We can never know if we have a garbage in "sighand" or the real value,
this task_struct can be freed/reallocated when we do probe_slab_address().

And this is fine. We re-check that "task == *ptask" after that. Now we have
two different cases:

	1. This is actually the same task/task_struct. In this case
           sighand != NULL tells us it is still alive.

        2. This is another task which got the same memory for task_struct.
           We can't know this of course, and we can not trust sighand != NULL.

	   In this case we actually return a random value, but this is correct.

	   If we return NULL - we can pretend that we actually noticed that
	   *ptask was updated when the previous task has exited. Or pretend
	   that probe_slab_address(&sighand) reads NULL.

	   If we return the new task (because sighand is not NULL for any
	   reason) - this is fine too. This (new) task can't go away before
	   another gp pass.

	   And please note again the "We could even eliminate the false positive"
	   comment above (hmm, it should probably say false negative). We could
	   re-read task->sighand once again to avoid the falsely NULL.

	   But this case is very unlikely so I think we do not really care.

And perhaps this idea can find more users...

> This thing does more than rcu_dereference; because it also guarantees
> that task->usage > 0 for the entire RCU section you do this under.
> Because delayed_put_task_struct() will be delayed until at least the
> matching rcu_read_unlock().

Yes, yes, exactly.

> Should we instead create a primitive like try_get_task_struct() which
> returns true if it got a reference? Because that is typically what
> people end up doing..

Well, I won't really argue... but personally I'd prefer to leave it to
the caller. Note that task_rcu_dereference() doesn't even do rcu_read_lock().
It really acts as "rcu dereference".

I agree, try_get_task_struct(ptask) makes sense too, but it should be
implemented as a trivial wrapper on top of task_rcu_dereference().

Oleg.

  reply	other threads:[~2016-05-18 18:23 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-22  7:17 [PATCH v4] sched/numa: fix unsafe get_task_struct() in task_numa_assign() Kirill Tkhai
2014-10-22 21:30 ` introduce task_rcu_dereference? Oleg Nesterov
2014-10-22 22:23   ` Oleg Nesterov
2014-10-23 18:15     ` Oleg Nesterov
2014-10-23  8:10   ` Kirill Tkhai
2014-10-23 18:18     ` Oleg Nesterov
2014-10-24  7:51       ` Kirill Tkhai
2014-10-27 19:53 ` [PATCH 0/3] introduce task_rcu_dereference() Oleg Nesterov
2014-10-27 19:54   ` [PATCH 1/3] probe_kernel_address() can use __probe_kernel_read() Oleg Nesterov
2014-10-27 19:54   ` [PATCH 2/3] introduce probe_slab_address() Oleg Nesterov
2014-10-27 19:21     ` Christoph Lameter
2014-10-28  5:44     ` Kirill Tkhai
2014-10-28  5:48       ` Kirill Tkhai
2014-10-28 15:01       ` Peter Zijlstra
2014-10-28 17:56         ` Kirill Tkhai
2014-10-28 18:00           ` Kirill Tkhai
2014-10-28 19:55           ` Oleg Nesterov
2014-10-28 20:12             ` Oleg Nesterov
2014-10-29  5:10               ` Kirill Tkhai
2014-10-27 19:54   ` [PATCH 3/3] introduce task_rcu_dereference() Oleg Nesterov
2014-10-28  6:22     ` Kirill Tkhai
2016-05-18 17:02     ` Peter Zijlstra
2016-05-18 18:23       ` Oleg Nesterov [this message]
2016-05-18 19:10         ` Peter Zijlstra
2016-05-18 19:57           ` Oleg Nesterov
2016-05-26 11:34             ` Peter Zijlstra
2016-06-03 10:49             ` [tip:sched/core] sched/fair: Use task_rcu_dereference() tip-bot for Oleg Nesterov
2016-06-03 10:48       ` [tip:sched/core] sched/api: Introduce task_rcu_dereference() and try_get_task_struct() tip-bot for Oleg Nesterov
2014-10-28 11:02 ` [tip:sched/core] sched/numa: Fix unsafe get_task_struct() in task_numa_assign() tip-bot for Kirill Tkhai
2014-11-08  3:48 ` [PATCH v4] sched/numa: fix " Sasha Levin
2014-11-09 14:07   ` Kirill Tkhai
2014-11-10 10:03     ` Peter Zijlstra
2014-11-10 15:48       ` Sasha Levin
2014-11-10 16:01         ` Peter Zijlstra
2014-11-16  9:50       ` [tip:sched/urgent] sched/numa: Avoid selecting oneself as swap target tip-bot for Peter Zijlstra
2014-11-10 16:03   ` [PATCH v4] sched/numa: fix unsafe get_task_struct() in task_numa_assign() Peter Zijlstra
2014-11-10 16:09     ` Sasha Levin
2014-11-10 16:16       ` Peter Zijlstra
2014-11-10 16:10     ` Kirill Tkhai
2014-11-10 16:36       ` Kirill Tkhai
2014-11-10 16:44         ` Sasha Levin
2014-11-10 20:01           ` Kirill Tkhai
2014-11-12  9:49             ` Kirill Tkhai
2014-11-15  2:38     ` Sasha Levin
2014-11-18 17:30       ` Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160518182318.GA15661@redhat.com \
    --to=oleg@redhat.com \
    --cc=cl@linux.com \
    --cc=ktkhai@parallels.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tkhai@yandex.ru \
    --cc=vdavydov@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).