linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: mingo@redhat.com, hpa@zytor.com, linux-kernel@vger.kernel.org,
	a.p.zijlstra@chello.nl, y-goto@jp.fujitsu.com,
	akpm@linux-foundation.org, tglx@linutronix.de, mingo@elte.hu,
	linux-tip-commits@vger.kernel.org
Subject: Re: [tip:sched/core] sched: Fix ancient race in do_exit()
Date: Mon, 30 Jan 2012 08:27:11 -0800	[thread overview]
Message-ID: <CA+55aFzavWN3gGSRtd3Lp6mzY4ndg1af4DhXy1QOCy4HQ8CA9w@mail.gmail.com> (raw)
In-Reply-To: <20120129185949.GA27192@redhat.com>

On Sun, Jan 29, 2012 at 10:59 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>
> If we remove set_task_state() from the main waiting loop we can never race
> with __rwsem_do_wake()->try_to_wake_up() seeing us in UNINTERRUPTIBLE state.
> rwsem_down_failed_common() simply can't return until UNINTERRUPTIBLE->RUNNING
> transition is finished (__rwsem_do_wake does wakeup first).
>
> And since we do not play with current->state after spin_unlock(), it is
> fine to "race" with waiter->task clearing, just we can do the unnecessary
> but harmless schedule() in TASK_RUNNING.

So the more I think about this code, the more nervous I get.

I did a version that got the sem->wait_lock if it hit the race case
("task" still non-NULL after the schedule), and I could convince
myself that it was race-free and safe. But I couldn't actually really
convince myself that it was better than the old code, because I worry
that we'd actually hit that case much too often (ie another CPU woke
us up, but hasn't cleared "task" yet, so spin on the spinlock etc).

So I do hate the bug in rwsem, but I'm now willing to entertain the
possibility that the bug actually means that unlocking of an rwsem is
"nicely decoupled" from the process that got the semaphore, and may be
a performance advantage.

To be honest, I don't *really* believe that, but changing the rwsem.c
code at this point just scares me enough that without lots and lots of
people looking at it and actually doing some performance analysis, I'm
willing to just say "screw it, I'll take Yasunori Goto's solution".

I'm definitely still not a huge fan of that patch, but I'm not a huge
fan of the alternative either. So Ack on the patch, and I'll just try
to think about this all some more.

                         Linus

  reply	other threads:[~2012-01-30 16:27 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-22  0:42 [BUG] TASK_DEAD task is able to be woken up in special condition Yasunori Goto
2011-12-22  2:14 ` KOSAKI Motohiro
2011-12-22  8:22   ` Yasunori Goto
2011-12-22 20:02     ` KOSAKI Motohiro
2011-12-23  9:49 ` Peter Zijlstra
2011-12-23 15:41   ` Oleg Nesterov
2011-12-26  8:23     ` Yasunori Goto
2011-12-26 17:11       ` Oleg Nesterov
2011-12-27  6:48         ` Yasunori Goto
2012-01-06 10:22           ` Yasunori Goto
2012-01-06 11:01             ` Peter Zijlstra
2012-01-06 12:01               ` Yasunori Goto
2012-01-06 12:43                 ` Peter Zijlstra
2012-01-06 14:12                   ` Oleg Nesterov
2012-01-06 14:19                     ` Oleg Nesterov
2012-01-07  1:31                     ` Yasunori Goto
2012-01-16 11:51                       ` Yasunori Goto
2012-01-16 13:38                         ` Peter Zijlstra
2012-01-17  8:40                           ` Yasunori Goto
2012-01-17  9:06                             ` Ingo Molnar
2012-01-17 15:12                               ` Oleg Nesterov
2012-01-18  9:42                                 ` Ingo Molnar
2012-01-18 14:20                                   ` Oleg Nesterov
2012-01-24 10:19                                     ` Peter Zijlstra
2012-01-24 10:55                                       ` Peter Zijlstra
2012-01-24 17:25                                         ` KOSAKI Motohiro
2012-01-25 15:45                                         ` Oleg Nesterov
2012-01-25 16:51                                           ` Peter Zijlstra
2012-01-25 17:43                                             ` Oleg Nesterov
2012-01-26 15:32                                               ` Peter Zijlstra
2012-01-26 16:26                                                 ` Oleg Nesterov
2012-01-27  8:59                                                   ` Peter Zijlstra
2012-01-24 10:11                                   ` Peter Zijlstra
2012-01-26  9:39                                     ` Ingo Molnar
2012-01-28 12:03                             ` [tip:sched/core] sched: Fix ancient race in do_exit() tip-bot for Yasunori Goto
2012-01-28 21:12                               ` Linus Torvalds
2012-01-29 16:07                                 ` Oleg Nesterov
2012-01-29 17:44                                   ` Linus Torvalds
2012-01-29 18:28                                     ` Linus Torvalds
2012-01-29 18:59                                     ` Oleg Nesterov
2012-01-30 16:27                                       ` Linus Torvalds [this message]
2012-01-06 13:48             ` [BUG] TASK_DEAD task is able to be woken up in special condition Oleg Nesterov
2011-12-28 21:07         ` KOSAKI Motohiro
2012-01-24 10:23           ` Peter Zijlstra
2012-01-24 18:01             ` KOSAKI Motohiro
2012-01-25  6:15               ` Mike Galbraith
2012-01-26 21:24                 ` KOSAKI Motohiro
2012-01-25 10:10           ` Peter Zijlstra
2012-01-26 20:25             ` [tip:sched/urgent] sched: Fix rq->nr_uninterruptible update race tip-bot for Peter Zijlstra
2012-01-27  5:20               ` Rakib Mullick
2012-01-27  8:19                 ` Peter Zijlstra
2012-01-27 14:11                   ` Rakib Mullick
2012-01-26 21:21             ` [BUG] TASK_DEAD task is able to be woken up in special condition KOSAKI Motohiro
2012-01-27  8:21               ` Peter Zijlstra
2011-12-26  6:52   ` Yasunori Goto

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+55aFzavWN3gGSRtd3Lp6mzY4ndg1af4DhXy1QOCy4HQ8CA9w@mail.gmail.com \
    --to=torvalds@linux-foundation.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=mingo@redhat.com \
    --cc=oleg@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=y-goto@jp.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).