kernel-hardening.lists.openwall.com archive mirror
From: Jann Horn <jannh@google.com>
To: Greg KH <greg@kroah.com>, Will Deacon <will@kernel.org>,
	 Peter Zijlstra <peterz@infradead.org>
Cc: kernel list <linux-kernel@vger.kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	 Kees Cook <keescook@chromium.org>,
	Maddie Stone <maddiestone@google.com>,
	 Marco Elver <elver@google.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	 Thomas Gleixner <tglx@linutronix.de>,
	kernel-team <kernel-team@android.com>,
	 Kernel Hardening <kernel-hardening@lists.openwall.com>,
	Ingo Molnar <mingo@redhat.com>
Subject: Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
Date: Tue, 24 Mar 2020 19:22:50 +0100
Message-ID: <CAG48ez1677ihowvvgLO6i-oEu=d_woxiQj52sx0k7-nWXrBpBg@mail.gmail.com>
In-Reply-To: <20200324165938.GA2521386@kroah.com>

On Tue, Mar 24, 2020 at 5:59 PM Greg KH <greg@kroah.com> wrote:
> On Tue, Mar 24, 2020 at 05:38:30PM +0100, Jann Horn wrote:
> > On Tue, Mar 24, 2020 at 5:26 PM Greg KH <greg@kroah.com> wrote:
> > > On Tue, Mar 24, 2020 at 05:20:45PM +0100, Jann Horn wrote:
> > > > On Tue, Mar 24, 2020 at 4:37 PM Will Deacon <will@kernel.org> wrote:
> > > > > Some list predicates can be used locklessly even with the non-RCU list
> > > > > implementations, since they effectively boil down to a test against
> > > > > NULL. For example, checking whether or not a list is empty is safe even
> > > > > in the presence of a concurrent, tearing write to the list head pointer.
> > > > > Similarly, checking whether or not an hlist node has been hashed is safe
> > > > > as well.
> > > > >
> > > > > Annotate these lockless list predicates with data_race() and READ_ONCE()
> > > > > so that KCSAN and the compiler are aware of what's going on. The writer
> > > > > side can then avoid having to use WRITE_ONCE() in the non-RCU
> > > > > implementation.
> > > > [...]
> > > > >  static inline int list_empty(const struct list_head *head)
> > > > >  {
> > > > > -       return READ_ONCE(head->next) == head;
> > > > > +       return data_race(READ_ONCE(head->next) == head);
> > > > >  }
> > > > [...]
> > > > >  static inline int hlist_unhashed(const struct hlist_node *h)
> > > > >  {
> > > > > -       return !READ_ONCE(h->pprev);
> > > > > +       return data_race(!READ_ONCE(h->pprev));
> > > > >  }
> > > >
> > > > This is probably valid in practice for hlist_unhashed(), which
> > > > compares with NULL, as long as the most significant byte of all kernel
> > > > pointers is non-zero; but I think list_empty() could realistically
> > > > return false positives in the presence of a concurrent tearing store?
> > > > This could break the following code pattern:
> > > >
> > > > /* optimistic lockless check */
> > > > if (!list_empty(&some_list)) {
> > > >   /* slowpath */
> > > >   mutex_lock(&some_mutex);
> > > >   list_for_each(tmp, &some_list) {
> > > >     ...
> > > >   }
> > > >   mutex_unlock(&some_mutex);
> > > > }
> > > >
> > > > (I'm not sure whether patterns like this appear commonly though.)
> > >
> > >
> > > I would hope not as the list could go "empty" before the lock is
> > > grabbed.  That pattern would be wrong.
> >
> > If the list becomes empty in between, the loop just iterates over
> > nothing, and the effect is no different from what you'd get if you had
> > bailed out before. But sure, you have to be aware that that can
> > happen.
>
> Doh, yeah, so it is safe, crazy, but safe :)

Here's what I think is an example of that pattern (and which would be
technically incorrect if what peterz said is accurate?):

/**
 * waitqueue_active -- locklessly test for waiters on the queue
 * @wq_head: the waitqueue to test for waiters
 *
 * returns true if the wait list is not empty
 *
 * NOTE: this function is lockless and requires care, incorrect usage _will_
 * lead to sporadic and non-obvious failure.
 *
 * Use either while holding wait_queue_head::lock or when used for wakeups
 * with an extra smp_mb() like::
 *
 *      CPU0 - waker                    CPU1 - waiter
 *
 *                                      for (;;) {
 *      @cond = true;                     prepare_to_wait(&wq_head, &wait, state);
 *      smp_mb();                         // smp_mb() from set_current_state()
 *      if (waitqueue_active(wq_head))         if (@cond)
 *        wake_up(wq_head);                      break;
 *                                        schedule();
 *                                      }
 *                                      finish_wait(&wq_head, &wait);
 *
 * Because without the explicit smp_mb() it's possible for the
 * waitqueue_active() load to get hoisted over the @cond store such that we'll
 * observe an empty wait list while the waiter might not observe @cond.
 *
 * Also note that this 'optimization' trades a spin_lock() for an smp_mb(),
 * which (when the lock is uncontended) are of roughly equal cost.
 */
static inline int waitqueue_active(struct wait_queue_head *wq_head)
{
        return !list_empty(&wq_head->head);
}

void signalfd_cleanup(struct sighand_struct *sighand)
{
        wait_queue_head_t *wqh = &sighand->signalfd_wqh;
        /*
         * The lockless check can race with remove_wait_queue() in progress,
         * but in this case its caller should run under rcu_read_lock() and
         * sighand_cachep is SLAB_TYPESAFE_BY_RCU, we can safely return.
         */
        if (likely(!waitqueue_active(wqh)))
                return;

        /* wait_queue_entry_t->func(POLLFREE) should do remove_wait_queue() */
        wake_up_poll(wqh, EPOLLHUP | POLLFREE);
}

and __add_wait_queue() just uses plain list_add(&wq_entry->entry,
&wq_head->head) under a lock.
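
To make the tearing concern concrete, here is a minimal user-space sketch
(not kernel code). It assumes, purely for illustration, that a 64-bit
pointer store is split into two 32-bit halves with the low half landing
first; real compilers and architectures may tear differently or not at
all. The names model_list_head, model_list_empty(), torn_store_low_half(),
torn_store_high_half() and torn_store_shows_false_empty() are hypothetical
and exist only in this sketch. It models a list_add()-style update of
head->next on a list that is non-empty both before and after the store,
and reports whether a reader running between the two halves would see
"next == head", i.e. an apparently empty list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: ->next is kept as a raw 64-bit address value. */
struct model_list_head {
	uint64_t next;	/* stands in for head->next */
};

/* Models list_empty(): "empty" means next points back at the head itself. */
static bool model_list_empty(const struct model_list_head *head,
			     uint64_t head_addr)
{
	return head->next == head_addr;
}

/* First half of the assumed torn store: only the low 32 bits have landed. */
static void torn_store_low_half(struct model_list_head *head, uint64_t new_val)
{
	head->next = (head->next & 0xffffffff00000000ULL) |
		     (new_val & 0x00000000ffffffffULL);
}

/* Second half of the assumed torn store: the high 32 bits land. */
static void torn_store_high_half(struct model_list_head *head, uint64_t new_val)
{
	head->next = (head->next & 0x00000000ffffffffULL) |
		     (new_val & 0xffffffff00000000ULL);
}

/*
 * Returns true if a lockless reader that runs between the two halves of
 * the store would observe an "empty" list, although the list is non-empty
 * both before (old_next) and after (new_next) the update.
 */
bool torn_store_shows_false_empty(uint64_t head_addr, uint64_t old_next,
				  uint64_t new_next)
{
	struct model_list_head head = { .next = old_next };
	bool seen_empty;

	/* Precondition of the scenario: the list is never really empty. */
	assert(old_next != head_addr && new_next != head_addr);

	torn_store_low_half(&head, new_next);
	seen_empty = model_list_empty(&head, head_addr);	/* racing reader */
	torn_store_high_half(&head, new_next);

	assert(head.next == new_next);	/* the store completed as intended */
	return seen_empty;
}
```

With a head at 0xffff888012345678, an old first entry at
0xffff8880aaaa0000 and a new first entry at 0xffff999912345678, the
intermediate value has the old high half and the new low half, which is
exactly the head's own address. An optimistic !list_empty() check at that
moment would skip the slowpath even though the list always had an entry.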
