LKML Archive on lore.kernel.org
From: Qian Cai <cai@lca.pw>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Marco Elver <elver@google.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Will Deacon <will@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	kasan-dev <kasan-dev@googlegroups.com>
Subject: Re: [PATCH] locking/osq_lock: fix a data race in osq_wait_next
Date: Thu, 30 Jan 2020 22:32:29 -0500
Message-ID: <4A97061E-2152-4734-92C6-F5431C27360B@lca.pw>
In-Reply-To: <20200130134851.GY14914@hirez.programming.kicks-ass.net>



> On Jan 30, 2020, at 8:48 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Thu, Jan 30, 2020 at 02:39:38PM +0100, Marco Elver wrote:
>> On Wed, 29 Jan 2020 at 19:40, Peter Zijlstra <peterz@infradead.org> wrote:
> 
>>> It's probably not terrible to put a READ_ONCE() there; we just need to
>>> make sure the compiler doesn't do something stupid (it is known to do
>>> stupid when 'volatile' is present).
>> 
>> Maybe we need to optimize READ_ONCE().
> 
> I think recent compilers have gotten better at volatile. In part because
> of our complaints.
> 
>> 'if (data_race(..))' would also work here and has no cost.
> 
> Right, that might be the best option.
> 

OK, I’ll send a patch for that.
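
For readers following the READ_ONCE() versus data_race() exchange above, here
is a minimal userspace sketch of the two markings. The macro bodies are
simplified stand-ins and not the kernel definitions; struct spin_node and both
helper functions are hypothetical names used only for illustration. It assumes
GCC or Clang for __typeof__.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins, NOT the kernel definitions. */
#define READ_ONCE(x)    (*(const volatile __typeof__(x) *)&(x))
#define data_race(expr) (expr)  /* no KCSAN here: plain evaluation */

/* Hypothetical node type, loosely modelled on optimistic_spin_node. */
struct spin_node {
	struct spin_node *next;
	int locked;
};

/* Variant A: volatile load, so the compiler emits exactly one load. */
static int next_is_node_once(struct spin_node *prev, struct spin_node *node)
{
	assert(prev != NULL);
	return READ_ONCE(prev->next) == node;
}

/* Variant B: plain load, annotated as an intentional race. */
static int next_is_node_racy(struct spin_node *prev, struct spin_node *node)
{
	assert(prev != NULL);
	return data_race(prev->next == node);
}
```

Both variants return the same result in a single-threaded caller. The
difference is in what the compiler may do with the load: READ_ONCE() forces a
volatile access, while data_race() leaves the access plain and, in the kernel,
only tells KCSAN that the race is intentional. That is the sense in which it
"has no cost" above.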

BTW, I have another one to report. I can’t see how load tearing here would
cause any real issue.

[  519.240629] BUG: KCSAN: data-race in osq_lock / osq_unlock

[  519.249088] write (marked) to 0xffff8bb2f133be40 of 8 bytes by task 421 on cpu 38:
[  519.257427]  osq_unlock+0xa8/0x170 kernel/locking/osq_lock.c:219
[  519.261571]  __mutex_lock+0x4b3/0xd20
[  519.265972]  mutex_lock_nested+0x31/0x40
[  519.270639]  memcg_create_kmem_cache+0x2e/0x190
[  519.275922]  memcg_kmem_cache_create_func+0x40/0x80
[  519.281553]  process_one_work+0x54c/0xbe0
[  519.286308]  worker_thread+0x80/0x650
[  519.290715]  kthread+0x1e0/0x200
[  519.294690]  ret_from_fork+0x27/0x50


void osq_unlock(struct optimistic_spin_queue *lock)
{
        struct optimistic_spin_node *node, *next;
        int curr = encode_cpu(smp_processor_id());

        /*
         * Fast path for the uncontended case.
         */
        if (likely(atomic_cmpxchg_release(&lock->tail, curr,
                                          OSQ_UNLOCKED_VAL) == curr))
                return;

        /*
         * Second most likely case.
         */
        node = this_cpu_ptr(&osq_node);
        next = xchg(&node->next, NULL);    <--------------------------
        if (next) {
                WRITE_ONCE(next->locked, 1);
                return;
        }

        next = osq_wait_next(lock, node, NULL);
        if (next)
                WRITE_ONCE(next->locked, 1);
}


[  519.301232] read to 0xffff8bb2f133be40 of 8 bytes by task 196 on cpu 12:
[  519.308705]  osq_lock+0x1e2/0x340 kernel/locking/osq_lock.c:157
[  519.312762]  __mutex_lock+0x277/0xd20
[  519.317167]  mutex_lock_nested+0x31/0x40
[  519.321838]  memcg_create_kmem_cache+0x2e/0x190
[  519.327120]  memcg_kmem_cache_create_func+0x40/0x80
[  519.332751]  process_one_work+0x54c/0xbe0
[  519.337508]  worker_thread+0x80/0x650
[  519.341922]  kthread+0x1e0/0x200
[  519.345889]  ret_from_fork+0x27/0x50


        for (;;) {
                if (prev->next == node &&         <------------------------
                    cmpxchg(&prev->next, node, NULL) == node)
                        break;

                /*
                 * We can only fail the cmpxchg() racing against an unlock(),
                 * in which case we should observe @node->locked becoming
                 * true.
                 */
                if (smp_load_acquire(&node->locked))
                        return true;

                cpu_relax();

                /*
                 * Or we race against a concurrent unqueue()'s step-B, in which
                 * case its step-C will write us a new @node->prev pointer.
                 */
                prev = READ_ONCE(node->prev);
        }
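
The loop above checks prev->next with a plain load and then confirms it with
cmpxchg(). The following userspace sketch shows that pre-check-then-confirm
pattern with C11 atomics. It swaps in a relaxed atomic load where the kernel
code uses a plain load, and struct qnode and try_unlink() are hypothetical
names, so treat it as an illustration of the pattern rather than the kernel's
implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for optimistic_spin_node; only ->next is modelled. */
struct qnode {
	_Atomic(struct qnode *) next;
};

/*
 * Try to detach @node from @prev. The relaxed load is only a cheap
 * pre-check; the compare-exchange re-validates prev->next atomically,
 * so a stale pre-check value cannot by itself cause a wrong unlink.
 */
static bool try_unlink(struct qnode *prev, struct qnode *node)
{
	struct qnode *expected = node;

	assert(prev != NULL);

	if (atomic_load_explicit(&prev->next, memory_order_relaxed) != node)
		return false;	/* a real caller would loop and retry */

	return atomic_compare_exchange_strong(&prev->next, &expected, NULL);
}
```

If the pre-check observes a value other than node, the function reports
failure and the caller retries. If it observes node but the pointer changes
before the exchange, the compare-exchange fails. Either way the decision to
unlink rests on the atomic operation and not on the earlier load.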


[  519.352420] Reported by Kernel Concurrency Sanitizer on:
[  519.358492] CPU: 12 PID: 196 Comm: kworker/12:1 Tainted: G        W    L    5.5.0-next-20200130+ #3
[  519.368317] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
[  519.377627] Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func

