From: Boqun Feng <boqun.feng@gmail.com>
To: 焦晓冬 <milestonejxd@gmail.com>
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org,
	stern@rowland.harvard.edu, will.deacon@arm.com,
	torvalds@linux-foundation.org, npiggin@gmail.com,
	mingo@kernel.org, mpe@ellerman.id.au, oleg@redhat.com,
	benh@kernel.crashing.org, paulmck@linux.vnet.ibm.com
Subject: Re: smp_mb__after_spinlock requirement too strong?
Date: Mon, 12 Mar 2018 13:44:12 +0800
Message-ID: <20180312054412.yqyde34ly3kjoajj@tardis>
In-Reply-To: <CAJDTihwMgUqjK6YL2M_WZY4VBaifVYgNvdZOF_fNrnk_p27Kvw@mail.gmail.com>

On Sun, Mar 11, 2018 at 03:55:41PM +0800, 焦晓冬 wrote:
> Peter pointed out in this patch https://patchwork.kernel.org/patch/9771921/
> that the spinning-lock used at __schedule() should be RCsc to ensure
> visibility of writes prior to __schedule when the task is to be migrated to
> another CPU.
> 
> And this is emphasized in the comment on the newly introduced
> smp_mb__after_spinlock(),
> 
>  * This barrier must provide two things:
>  *
>  *   - it must guarantee a STORE before the spin_lock() is ordered against a
>  *     LOAD after it, see the comments at its two usage sites.
>  *
>  *   - it must ensure the critical section is RCsc.
>  *
>  * The latter is important for cases where we observe values written by other
>  * CPUs in spin-loops, without barriers, while being subject to scheduling.
>  *
>  * CPU0         CPU1            CPU2
>  *
>  *          for (;;) {
>  *            if (READ_ONCE(X))
>  *              break;
>  *          }
>  * X=1
>  *          <sched-out>
>  *                      <sched-in>
>  *                      r = X;
>  *
>  * without transitivity it could be that CPU1 observes X!=0 breaks the loop,
>  * we get migrated and CPU2 sees X==0.
> 
> which is used at,
> 
> __schedule(bool preempt) {
>     ...
>     rq_lock(rq, &rf);
>     smp_mb__after_spinlock();
>     ...
> }
> .
> 
> Unless I am missing something, this kind of visibility does __not__
> necessarily depend on the spinning-lock at __schedule being RCsc.
> 
> In fact, for a runnable task A, the migration would be,
> 
>  CPU0         CPU1            CPU2
> 
> <ACCESS before schedule out A>
> 
> lock(rq0)
> schedule out A
> unlock(rq0)
> 
>               lock(rq0)
>               remove A from rq0
>               unlock(rq0)
> 
>               lock(rq2)
>               add A into rq2
>               unlock(rq2)
>                                         lock(rq2)
>                                         schedule in A
>                                         unlock(rq2)
> 
>                                         <ACCESS after schedule in A>
> 
> <ACCESS before schedule out A> happens-before
> unlock(rq0) happens-before
> lock(rq0) happens-before
> unlock(rq2) happens-before
> lock(rq2) happens-before
> <ACCESS after schedule in A>
> 

But without an RCsc lock, you cannot guarantee that a write propagates
to CPU0 and CPU2 at the same time, so the same write may propagate to
CPU0 before <ACCESS before schedule out A> but reach CPU2 only after
<ACCESS after schedule in A>. So the task could see the write before
migrating and miss it afterwards.
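
To make the scenario concrete, here is a minimal userspace sketch of the
runnable-task migration above, using C11 atomics and pthread mutexes.
It is only a model under stated assumptions: the C11 memory model is not
the LKMM, pthread mutexes are not kernel spinlocks, and every name in it
(run_migration_model, on_rq0, on_rq2, the thread functions) is made up
for illustration. In C11 the unlock/lock chain is transitive, so r == 1
is required there; whether a given kernel lock implementation gives the
same guarantee on hardware that is not multi-copy atomic is exactly the
question in this thread, and this program cannot answer it.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace model only: C11 + pthreads, NOT the kernel or the LKMM. */
static atomic_int X;          /* flag the task spins on */
static int on_rq0;            /* hypothetical: A queued on rq0, protected by rq0 */
static int on_rq2;            /* hypothetical: A queued on rq2, protected by rq2 */
static int r;                 /* value of X seen after "schedule in" */
static pthread_mutex_t rq0 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rq2 = PTHREAD_MUTEX_INITIALIZER;

static void *writer(void *arg)
{
	(void)arg;
	atomic_store_explicit(&X, 1, memory_order_relaxed);   /* X=1 */
	return NULL;
}

static void *cpu0(void *arg)
{
	(void)arg;
	/* <ACCESS before schedule out A>: spin without barriers */
	while (!atomic_load_explicit(&X, memory_order_relaxed))
		sched_yield();
	pthread_mutex_lock(&rq0);
	on_rq0 = 1;                     /* schedule out A */
	pthread_mutex_unlock(&rq0);
	return NULL;
}

static void *cpu1(void *arg)
{
	int taken = 0;

	(void)arg;
	while (!taken) {
		pthread_mutex_lock(&rq0);
		if (on_rq0) {           /* remove A from rq0 */
			on_rq0 = 0;
			taken = 1;
		}
		pthread_mutex_unlock(&rq0);
		if (!taken)
			sched_yield();
	}
	pthread_mutex_lock(&rq2);
	on_rq2 = 1;                     /* add A into rq2 */
	pthread_mutex_unlock(&rq2);
	return NULL;
}

static void *cpu2(void *arg)
{
	int picked = 0;

	(void)arg;
	while (!picked) {
		pthread_mutex_lock(&rq2);
		picked = on_rq2;        /* schedule in A once it is queued */
		pthread_mutex_unlock(&rq2);
		if (!picked)
			sched_yield();
	}
	/* <ACCESS after schedule in A> */
	r = atomic_load_explicit(&X, memory_order_relaxed);
	return NULL;
}

/* Runs the four threads once and returns what "CPU2" read from X. */
int run_migration_model(void)
{
	void *(*fn[4])(void *) = { writer, cpu0, cpu1, cpu2 };
	pthread_t t[4];
	int i, rc;

	atomic_store(&X, 0);
	on_rq0 = on_rq2 = 0;
	r = -1;
	for (i = 0; i < 4; i++) {
		rc = pthread_create(&t[i], NULL, fn[i], NULL);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < 4; i++)
		pthread_join(t[i], NULL);
	return r;
}
```

Calling run_migration_model() from a main() and building with -pthread
is enough to try it. A passing run says nothing about weakly ordered
hardware; the sketch only pins down which accesses and which lock
hand-offs the argument is about.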

Regards,
Boqun

> And for stopped tasks,
> 
>  CPU0         CPU1            CPU2
> 
> <ACCESS before schedule out A>
> 
> lock(rq0)
> schedule out A
> remove A from rq0
> store-release(A->on_cpu)
> unlock(rq0)
> 
>               load_acquire(A->on_cpu)
>               set_task_cpu(A, 2)
> 
>               lock(rq2)
>               add A into rq2
>               unlock(rq2)
> 
>                                         lock(rq2)
>                                         schedule in A
>                                         unlock(rq2)
> 
>                                         <ACCESS after schedule in A>
> 
> <ACCESS before schedule out A> happens-before
> store-release(A->on_cpu)  happens-before
> load_acquire(A->on_cpu)  happens-before
> unlock(rq2) happens-before
> lock(rq2) happens-before
> <ACCESS after schedule in A>
> 
> So, I think the only requirement on smp_mb__after_spinlock is
> to guarantee that a STORE before the spin_lock() is ordered
> against a LOAD after it. So we could remove the RCsc requirement
> to allow a more efficient implementation.
> 
> Did I miss something, or does this RCsc requirement not really matter?
