Linux-Next Archive on lore.kernel.org
 help / color / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Minchan Kim <minchan@kernel.org>
Cc: Nadav Amit <namit@vmware.com>, Ingo Molnar <mingo@kernel.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Linux-Next Mailing List <linux-next@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linus <torvalds@linux-foundation.org>
Subject: Re: linux-next: manual merge of the akpm-current tree with the tip tree
Date: Mon, 14 Aug 2017 21:57:23 +0200
Message-ID: <20170814195723.GO6524@worktop.programming.kicks-ass.net> (raw)
In-Reply-To: <20170814083839.GD26913@bbox>

On Mon, Aug 14, 2017 at 05:38:39PM +0900, Minchan Kim wrote:
> memory-barrier.txt always scares me. I have read it for a while
> and IIUC, it seems semantic of spin_unlock(&same_pte) would be
> enough without some memory-barrier inside mm_tlb_flush_nested.

Indeed, see the email I just send. Its both spin_lock() and
spin_unlock() that we care about.

Aside from the semi permeable barrier of these primitives, RCpc ensures
these orderings only work against the _same_ lock variable.

Let me try and explain the ordering for PPC (which is by far the worst
we have in this regard):


spin_lock(lock)
{
	while (test_and_set(lock))
		cpu_relax();
	lwsync();
}


spin_unlock(lock)
{
	lwsync();
	clear(lock);
}

Now LWSYNC has fairly 'simple' semantics, but with fairly horrible
ramifications. Consider LWSYNC to provide _local_ TSO ordering, this
means that it allows 'stores reordered after loads'.

For the spin_lock() that implies that all load/store's inside the lock
do indeed stay in, but the ACQUIRE is only on the LOAD of the
test_and_set(). That is, the actual _set_ can leak in. After all it can
re-order stores after load (inside the lock).

For unlock it again means all load/store's prior stay prior, and the
RELEASE is on the store clearing the lock state (nothing surprising
here).

Now the _local_ part, the main take-away is that these orderings are
strictly CPU local. What makes the spinlock work across CPUs (as we'd
very much expect it to) is the address dependency on the lock variable.

In order for the spin_lock() to succeed, it must observe the clear. Its
this link that crosses between the CPUs and builds the ordering. But
only the two CPUs agree on this order. A third CPU not involved in
this transaction can disagree on the order of events.

  reply index

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-11  7:53 Stephen Rothwell
2017-08-11  9:34 ` Peter Zijlstra
2017-08-11 10:48   ` Peter Zijlstra
2017-08-11 11:45   ` Stephen Rothwell
2017-08-11 11:56     ` Ingo Molnar
2017-08-11 12:17       ` Peter Zijlstra
2017-08-11 12:44         ` Ingo Molnar
2017-08-11 13:49           ` Stephen Rothwell
2017-08-11 14:04       ` Peter Zijlstra
2017-08-13  6:06         ` Nadav Amit
2017-08-13 12:50           ` Peter Zijlstra
2017-08-14  3:16             ` Minchan Kim
2017-08-14  5:07               ` Nadav Amit
2017-08-14  5:23                 ` Minchan Kim
2017-08-14  8:38                 ` Minchan Kim
2017-08-14 19:57                   ` Peter Zijlstra [this message]
2017-08-16  4:14                     ` Minchan Kim
2017-08-14 19:38                 ` Peter Zijlstra
2017-08-15  7:51                   ` Nadav Amit
2017-08-14  3:09         ` Minchan Kim
2017-08-14 18:54           ` Peter Zijlstra
  -- strict thread matches above, loose matches on Subject: below --
2019-10-31  5:43 Stephen Rothwell
2019-06-24 10:24 Stephen Rothwell
2019-05-01 11:10 Stephen Rothwell
2019-01-31  4:31 Stephen Rothwell
2018-08-20  4:32 Stephen Rothwell
2018-08-20 19:52 ` Andrew Morton
2018-03-23  5:59 Stephen Rothwell
2017-12-18  5:04 Stephen Rothwell
2017-11-10  4:33 Stephen Rothwell
2017-11-02  7:19 Stephen Rothwell
2017-08-22  6:57 Stephen Rothwell
2017-08-23  6:39 ` Vlastimil Babka
2017-04-12  6:46 Stephen Rothwell
2017-04-12 20:53 ` Vlastimil Babka
2017-04-20  2:17   ` NeilBrown
2017-03-24  5:25 Stephen Rothwell
2017-02-17  4:40 Stephen Rothwell
2016-11-14  6:08 Stephen Rothwell
2016-07-29  4:14 Stephen Rothwell
2016-06-15  5:23 Stephen Rothwell
2016-06-18 19:39 ` Manfred Spraul
2016-04-29  6:12 Stephen Rothwell
2016-04-29  6:26 ` Ingo Molnar
2016-03-02  5:40 Stephen Rothwell
2016-02-26  5:07 Stephen Rothwell
2016-02-26 21:35 ` Andrew Morton
2016-02-19  4:09 Stephen Rothwell
2016-02-19 15:26 ` Ard Biesheuvel
2015-12-07  8:06 Stephen Rothwell
2015-10-02  4:21 Stephen Rothwell
2015-07-28  6:00 Stephen Rothwell
2015-07-29 17:12 ` Andrea Arcangeli
2015-07-29 17:47   ` Andy Lutomirski
2015-07-29 18:46     ` Thomas Gleixner
2015-07-30 15:38       ` Andrea Arcangeli
2015-07-29 23:06   ` Stephen Rothwell
2015-07-29 23:07     ` Thomas Gleixner
2015-09-07 23:35   ` Stephen Rothwell
2015-09-08 18:11     ` Linus Torvalds
2015-09-08 22:56       ` Stephen Rothwell
2015-09-08 23:03         ` Linus Torvalds
2015-09-08 23:21           ` Andrew Morton
2015-09-16  6:58             ` Geert Uytterhoeven
2015-06-04 12:07 Stephen Rothwell
2015-04-08  8:28 Stephen Rothwell
2015-04-08  8:25 Stephen Rothwell
2014-03-17  9:31 Stephen Rothwell
2014-03-17  9:36 ` Peter Zijlstra
2014-03-19 23:27   ` Andrew Morton
2014-01-14  4:53 Stephen Rothwell
2014-01-14  5:04 ` Davidlohr Bueso
2014-01-14 12:51 ` Peter Zijlstra
2014-01-14 13:17   ` Geert Uytterhoeven
2014-01-14 13:33     ` Peter Zijlstra
2014-01-14 16:19     ` H. Peter Anvin
2014-01-14 15:15   ` H. Peter Anvin
2014-01-14 15:20     ` Geert Uytterhoeven
2014-01-14 15:41       ` Peter Zijlstra
2014-01-14 15:48         ` H. Peter Anvin
2014-01-07  6:00 Stephen Rothwell
2014-01-07  6:34 ` Tang Chen
2013-11-08  7:48 Stephen Rothwell
2013-11-08 18:58 ` Josh Triplett
2013-11-08 23:20   ` Stephen Rothwell
2013-11-09  0:19     ` Josh Triplett
2013-10-30  6:40 Stephen Rothwell

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170814195723.GO6524@worktop.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-next@vger.kernel.org \
    --cc=minchan@kernel.org \
    --cc=mingo@elte.hu \
    --cc=mingo@kernel.org \
    --cc=namit@vmware.com \
    --cc=sfr@canb.auug.org.au \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Next Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-next/0 linux-next/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-next linux-next/ https://lore.kernel.org/linux-next \
		linux-next@vger.kernel.org
	public-inbox-index linux-next

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-next


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git