From: Paul Turner <pjt@google.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: NeilBrown <nfbrown@novell.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	Mike Galbraith <efault@gmx.de>, Ingo Molnar <mingo@kernel.org>,
	Peter Anvin <hpa@zytor.com>,
	vladimir.murzin@arm.com, linux-tip-commits@vger.kernel.org,
	jstancek@redhat.com, Oleg Nesterov <oleg@redhat.com>
Subject: Re: [tip:locking/core] sched/wait: Fix signal handling in bit wait helpers
Date: Fri, 11 Dec 2015 03:30:33 -0800
Message-ID: <CAPM31R+ohAB3w+wTWj08LfM9ePP8tfyW-Vie5Uef-RwCu-b4sw@mail.gmail.com>
In-Reply-To: <20151210130948.GW6356@twins.programming.kicks-ass.net>

On Thu, Dec 10, 2015 at 5:09 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Dec 10, 2015 at 08:30:01AM +1100, NeilBrown wrote:
>> On Wed, Dec 09 2015, Peter Zijlstra wrote:
>>
>> > On Wed, Dec 09, 2015 at 12:06:33PM +1100, NeilBrown wrote:
>> >> On Tue, Dec 08 2015, Peter Zijlstra wrote:
>> >>
>> >> >>
>> >> >
>> >> > *sigh*, so that patch was broken.. the below might fix it, but please
>> >> > someone look at it, I seem to have a less than stellar track record
>> >> > here...
>> >>
>> >> This new change seems to be more intrusive than should be needed.
>> >> Can't we just do:
>> >>
>> >>
>> >>  __sched int bit_wait(struct wait_bit_key *word)
>> >>  {
>> >> +  long state = current->state;
>> >
>> > No, current->state can already be changed by this time.
>>
>> Does that matter?
>> It can only have changed to TASK_RUNNING - right?
>> In that case signal_pending_state() will return 0 and the bit_wait() acts
>> as though the thread was woken up normally (which it was) rather than by
>> a signal (which maybe it was too, but maybe that happened just a tiny
>> bit later).
>>
>> As long as signal delivery doesn't change ->state, we should be safe.
>> We should even be safe testing ->state *after* the call to schedule().
>
> Blergh, all I've managed so far is to confuse myself further. Even
> something like the original (+- the EINTR) should work when we consider
> the looping, even when mixed with an occasional spurious wakeup.
>
>
> int bit_wait()
> {
>         if (signal_pending_state(current->state, current))
>                 return -EINTR;
>         schedule();
> }
>
>
> This can go wrong against raising a signal thusly:
>
>         prepare_to_wait()
> 1:      if (signal_pending_state(current->state, current))
>                 // false, nothing pending
>         schedule();
>                                 set_tsk_thread_flag(t, TIF_SIGPENDING);
>
>                 <spurious wakeup>
>
>         prepare_to_wait()
>                                 wake_up_state(t, ...);
> 2:      if (signal_pending_state(current->state, current))
>                 // false, TASK_RUNNING
>
>         schedule(); // doesn't block because pending

Note that a quick inspection does not turn up _any_ TASK_INTERRUPTIBLE
callers.  When this previously occurred, it could likely only be with
a fatal signal, which would have hidden these sins.

>
>         prepare_to_wait()
> 3:      if (signal_pending_state(current->state, current))
>                 // true, pending
>

Hugh asked me about this after seeing a crash.  Here's another exciting
way in which the current code breaks -- and this one is actually quite
serious:

Consider __lock_page:

void __lock_page(struct page *page)
{
        DEFINE_WAIT_BIT(wait, &page->flags, PG_locked);
        __wait_on_bit_lock(page_waitqueue(page), &wait, bit_wait_io,
                           TASK_UNINTERRUPTIBLE);
}

With the current state of the world, bit_wait_io() has become:

 __sched int bit_wait_io(struct wait_bit_key *word)
 {
-       if (signal_pending_state(current->state, current))
-               return 1;
        io_schedule();
+       if (signal_pending(current))
+               return -EINTR;
        return 0;
 }

Called from __wait_on_bit_lock.

Previously, signal_pending_state() was checked under
TASK_UNINTERRUPTIBLE (via prepare_to_wait_exclusive).  Now, we simply
check for the presence of any signal -- after we have returned to
running state, e.g. post io_schedule() when somebody has kicked the
wait-queue.
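
To make the difference between the two checks concrete, here is a small
user-space model (not kernel code).  The MODEL_* constants, struct
model_task and the model_* function names are stand-ins invented for this
sketch; the predicate is meant to mirror the logic of
signal_pending_state(), under the assumption that a wait mode without
the interruptible or wake-kill bit never reports a signal.

```c
#include <assert.h>
#include <stdbool.h>

/* Model-only stand-ins for the kernel's task state bits. */
enum {
	MODEL_TASK_RUNNING         = 0x00,
	MODEL_TASK_INTERRUPTIBLE   = 0x01,
	MODEL_TASK_UNINTERRUPTIBLE = 0x02,
	MODEL_TASK_WAKEKILL        = 0x80,
	MODEL_TASK_KILLABLE        = MODEL_TASK_WAKEKILL |
				     MODEL_TASK_UNINTERRUPTIBLE,
};

/* Hypothetical, stripped-down task: only what the two checks look at. */
struct model_task {
	bool sigpending;	/* models TIF_SIGPENDING */
	bool fatal;		/* models a pending fatal signal */
};

/* The new check: true for ANY pending signal, whatever the wait mode. */
static int model_signal_pending(const struct model_task *t)
{
	return t->sigpending;
}

/* The old check: a signal only counts if the wait mode allows it. */
static int model_signal_pending_state(long state, const struct model_task *t)
{
	if (!(state & (MODEL_TASK_INTERRUPTIBLE | MODEL_TASK_WAKEKILL)))
		return 0;
	if (!model_signal_pending(t))
		return 0;
	return (state & MODEL_TASK_INTERRUPTIBLE) || t->fatal;
}

/* Exercises both checks for an uninterruptible waiter. */
void model_signal_demo(void)
{
	struct model_task t = { .sigpending = true, .fatal = false };

	/* Old check, uninterruptible sleep: the signal is ignored. */
	assert(model_signal_pending_state(MODEL_TASK_UNINTERRUPTIBLE, &t) == 0);
	/* New check: the very same signal is now reported. */
	assert(model_signal_pending(&t) == 1);
}
```

In this model an uninterruptible waiter with a non-fatal signal pending
gets 0 from the state-aware check but 1 from the bare check, which is
the behaviour change described above.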

However, this now means that __wait_on_bit_lock() can return -EINTR up
to __lock_page(), which does not check the return code and blindly
returns with the page still unlocked.  This looks to have been a
previously existing bug, but it was at least masked by the fact that it
previously required a fatal signal (and that the page we return
unlocked is likely going to be freed from the dying process anyway).
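
The failure mode can be sketched in user space as follows.  This is a
toy model, not the kernel implementation: struct model_page, the owner
field and all model_*/action_* names are invented for illustration, and
the loop is only assumed to have the same shape as __wait_on_bit_lock()
(run the action while the bit is held, bail out on a non-zero return,
otherwise retry the test-and-set).

```c
#include <assert.h>
#include <errno.h>

enum { NOBODY = 0, OTHER = 1, ME = 2 };

/* Hypothetical page: 'owner' stands in for the PG_locked bit. */
struct model_page {
	int owner;
};

typedef int (*model_action_t)(struct model_page *);

/* Normal wakeup: the holder released the page while we slept. */
static int action_woken(struct model_page *page)
{
	page->owner = NOBODY;
	return 0;
}

/* Woken with a signal pending: the holder still owns the page. */
static int action_signalled(struct model_page *page)
{
	(void)page;
	return -EINTR;
}

/* Returns non-zero if the page was already locked, else takes it. */
static int model_test_and_set(struct model_page *page, int who)
{
	int was_locked = page->owner != NOBODY;

	if (!was_locked)
		page->owner = who;
	return was_locked;
}

static int model_wait_on_bit_lock(struct model_page *page,
				  model_action_t action, int who)
{
	do {
		int ret;

		if (page->owner == NOBODY)
			continue;	/* bit clear: go try to take it */
		ret = action(page);
		if (ret)
			return ret;	/* bail out WITHOUT the lock */
	} while (model_test_and_set(page, who));
	return 0;
}

/* Like __lock_page(): the return value is silently dropped. */
static void model_lock_page(struct model_page *page, model_action_t action)
{
	model_wait_on_bit_lock(page, action, ME);
}

void model_lock_page_demo(void)
{
	struct model_page page = { .owner = OTHER };

	model_lock_page(&page, action_signalled);
	/* The caller now believes it holds the lock, but it does not. */
	assert(page.owner == OTHER);
}
```

The point of the model is only that a void lock function has no way to
tell its caller that the wait was aborted, so any non-zero return from
the action ends with the caller running on a page it never locked.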

Peter's proposed follow-up above looks strictly more correct.  We need
to evaluate the potential existence of a signal, *after* we return
from schedule, but in the context of the state which we previously
_entered_ schedule() on.
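
A rough user-space sketch of that requirement is below.  Peter's actual
patch is not reproduced in this message, so passing the wait mode as an
explicit parameter is an assumption made for the sketch, and every M_*
and model_*/bit_wait_* name here is hypothetical.  model_schedule()
simply resets the state to running, modelling the fact that ->state no
longer holds the sleep mode once schedule() has returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Model-only stand-ins for the kernel's task state bits. */
enum {
	M_TASK_RUNNING         = 0x00,
	M_TASK_INTERRUPTIBLE   = 0x01,
	M_TASK_UNINTERRUPTIBLE = 0x02,
	M_TASK_WAKEKILL        = 0x80,
};

struct model_task {
	long state;		/* models current->state */
	bool sigpending;	/* models TIF_SIGPENDING */
	bool fatal;		/* models a pending fatal signal */
};

static int model_signal_pending_state(long state, const struct model_task *t)
{
	if (!(state & (M_TASK_INTERRUPTIBLE | M_TASK_WAKEKILL)))
		return 0;
	if (!t->sigpending)
		return 0;
	return (state & M_TASK_INTERRUPTIBLE) || t->fatal;
}

/* Model of schedule(): on return the task is running again. */
static void model_schedule(struct model_task *t)
{
	t->state = M_TASK_RUNNING;
}

/* Current code: any signal counts, regardless of how we slept. */
static int bit_wait_any_signal(struct model_task *t)
{
	model_schedule(t);
	return t->sigpending ? -EINTR : 0;
}

/* Reading ->state after schedule(): the sleep mode is already gone. */
static int bit_wait_state_after(struct model_task *t)
{
	model_schedule(t);
	return model_signal_pending_state(t->state, t) ? -EINTR : 0;
}

/* Sketch of the fix: check after schedule(), against the entry mode. */
static int bit_wait_mode(struct model_task *t, long mode)
{
	model_schedule(t);
	return model_signal_pending_state(mode, t) ? -EINTR : 0;
}

void bit_wait_demo(void)
{
	struct model_task t = { M_TASK_UNINTERRUPTIBLE, true, false };

	/* Uninterruptible sleeper, non-fatal signal pending. */
	assert(bit_wait_any_signal(&t) == -EINTR);	/* leaks -EINTR */
	t.state = M_TASK_UNINTERRUPTIBLE;
	assert(bit_wait_mode(&t, M_TASK_UNINTERRUPTIBLE) == 0);

	/* Interruptible sleeper, same signal. */
	t.state = M_TASK_INTERRUPTIBLE;
	assert(bit_wait_state_after(&t) == 0);		/* signal missed */
	t.state = M_TASK_INTERRUPTIBLE;
	assert(bit_wait_mode(&t, M_TASK_INTERRUPTIBLE) == -EINTR);
}
```

Only the variant that remembers the entry mode gets both cases right in
this model: it stays quiet for the uninterruptible waiter and still
reports -EINTR for the interruptible one.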

Reviewed-by: Paul Turner <pjt@google.com>

