linux-kernel.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
	Bob Peterson <rpeterso@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Steven Whitehouse <swhiteho@redhat.com>,
	Andrew Lutomirski <luto@kernel.org>,
	Andreas Gruenbacher <agruenba@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-mm <linux-mm@kvack.org>,
	Mel Gorman <mgorman@techsingularity.net>
Subject: Re: [PATCH 2/2] mm: add PageWaiters indicating tasks are waiting for a page bit
Date: Mon, 26 Dec 2016 11:07:52 -0800
Message-ID: <CA+55aFz1n_JSTc_u=t9Qgafk2JaffrhPAwMLn_Dr-L9UKxqHMg@mail.gmail.com>
In-Reply-To: <20161226111654.76ab0957@roar.ozlabs.ibm.com>

On Sun, Dec 25, 2016 at 5:16 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
>
> I did actually play around with that. I could not get my Skylake
> to forward the result from a lock op to a subsequent load (the
> latency was the same whether you use lock ; andb or lock ; andl:
> 32 cycles for my test loop), whereas with non-atomic versions I
> was getting about 15 cycles for andb vs 2 for andl.

Yes, interesting. It does look like the locked ops don't end up having
the partial write issue and the size of the op doesn't matter.

But it's definitely the case that the write buffer hit immediately
after the atomic read-modify-write ends up slowing things down, so the
profile oddity isn't just a profile artifact. I wrote a stupid test
program that did an atomic increment, and then read either the same
value, or an adjacent value in memory (so same instruction sequence,
the difference just being what memory location the read accessed).

Reading the same value after the atomic update was *much* more
expensive than reading the adjacent value, so it causes some kind of
pipeline hiccup. The extra cost was about 50% of the cost of the atomic
op itself: iow, the "atomic op followed by read of same location" was
over 1.5x slower than "atomic op followed by read of another location".

So the atomic ops don't serialize things entirely, but they *hate*
having the value read (regardless of size) right after being updated,
because it causes some kind of nasty pipeline issue.

A cmpxchg does seem to avoid the issue.

             Linus
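
The test program is not included in the message. What follows is only a
rough sketch of the kind of loop described above, not the original code.
Assumptions: GCC or Clang on x86-64 (the __atomic builtins), and the names
words, bench, read_index and use_cmpxchg are all made up for illustration.
The cmpxchg variant is one guess at how the increment might be done with a
compare-and-swap loop; the message does not say how cmpxchg was used.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Two adjacent 32-bit words in one cache line. words[0] is the target of
 * the atomic increment; words[1] is never written inside the timed loop. */
static uint32_t words[2] __attribute__((aligned(64)));

/*
 * Do 'iters' atomic increments of words[0], each followed by a plain load
 * of words[read_index] (0 = same location, 1 = adjacent location).
 * The instruction sequence is the same for both; only the load address
 * differs. If use_cmpxchg is non-zero the increment is done with a
 * compare-and-swap loop instead of a locked add (hypothetical variant).
 * Returns elapsed nanoseconds; *sink receives the sum of the values read
 * so the loads cannot be discarded.
 */
static uint64_t bench(int read_index, int use_cmpxchg, uint32_t iters,
                      uint32_t *sink)
{
    uint32_t *rd = &words[read_index];
    uint32_t sum = 0;
    struct timespec t0, t1;

    words[0] = 0;
    words[1] = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (uint32_t i = 0; i < iters; i++) {
        if (use_cmpxchg) {
            uint32_t old = __atomic_load_n(&words[0], __ATOMIC_RELAXED);
            /* On failure 'old' is refreshed with the current value. */
            while (!__atomic_compare_exchange_n(&words[0], &old, old + 1, 0,
                                                __ATOMIC_SEQ_CST,
                                                __ATOMIC_RELAXED))
                ;
        } else {
            /* Lock-prefixed read-modify-write on x86. */
            __atomic_fetch_add(&words[0], 1, __ATOMIC_SEQ_CST);
        }
        /* Ordinary load, right after the atomic op. */
        sum += __atomic_load_n(rd, __ATOMIC_RELAXED);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *sink = sum;
    int64_t ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
               + (int64_t)(t1.tv_nsec - t0.tv_nsec);
    return (uint64_t)ns;
}
```

To try it, call bench(0, 0, n, &sink) and bench(1, 0, n, &sink) from a
main() with a large n, build with something like gcc -O2, and compare
nanoseconds per iteration; then repeat with use_cmpxchg set to 1. The
numbers will depend heavily on the CPU, so treat any ratio as specific to
the machine it was measured on.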
