From: Nicholas Piggin <npiggin@gmail.com> To: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com>, Bob Peterson <rpeterso@redhat.com>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Steven Whitehouse <swhiteho@redhat.com>, Andrew Lutomirski <luto@kernel.org>, Andreas Gruenbacher <agruenba@redhat.com>, Peter Zijlstra <peterz@infradead.org>, linux-mm <linux-mm@kvack.org>, Mel Gorman <mgorman@techsingularity.net> Subject: Re: [PATCH 2/2] mm: add PageWaiters indicating tasks are waiting for a page bit Date: Mon, 26 Dec 2016 11:16:54 +1000 [thread overview] Message-ID: <20161226111654.76ab0957@roar.ozlabs.ibm.com> (raw) In-Reply-To: <CA+55aFzqgtz-782MmLOjQ2A2nB5YVyLAvveo6G_c85jqqGDA0Q@mail.gmail.com> On Sun, 25 Dec 2016 13:51:17 -0800 Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Sat, Dec 24, 2016 at 7:00 PM, Nicholas Piggin <npiggin@gmail.com> wrote: > > Add a new page flag, PageWaiters, to indicate the page waitqueue has > > tasks waiting. This can be tested rather than testing waitqueue_active > > which requires another cacheline load. > > Ok, I applied this one too. I think there's room for improvement, but > I don't think it's going to help to just wait another release cycle > and hope something happens. > > Example room for improvement from a profile of unlock_page(): > > 46.44 │ lock andb $0xfe,(%rdi) > 34.22 │ mov (%rdi),%rax > > this has the old "do atomic op on a byte, then load the whole word" > issue that we used to have with the nasty zone lookup code too. And it > causes a horrible pipeline hickup because the load will not forward > the data from the (partial) store. > > Its' really a misfeature of our asm optimizations of the atomic bit > ops. Using "andb" is slightly smaller, but in this case in particular, > an "andq" would be a ton faster, and the mask still fits in an imm8, > so it's not even hugely larger. I did actually play around with that. I could not get my skylake to forward the result from a lock op to a subsequent load (the latency was the same whether you use lock ; andb or lock ; andl (32 cycles for my test loop) whereas with non-atomic versions I was getting about 15 cycles for andb vs 2 for andl. I guess the lock op drains the store queue to coherency and does not allow forwarding so as to provide the memory ordering semantics. > But it might also be a good idea to simply use a "cmpxchg" loop here. > That also gives atomicity guarantees that we don't have with the > "clear bit and then load the value". cmpxchg ends up at 19 cycles including the initial load, so it may be worthwhile. Powerpc has a similar problem with doing a clear_bit; test_bit (not the size mismatch, but forwarding from atomic ops being less capable). Thanks, Nick
next prev parent reply other threads:[~2016-12-26 1:17 UTC|newest] Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top 2016-12-25 3:00 [PATCH 0/2] PageWaiters again Nicholas Piggin 2016-12-25 3:00 ` [PATCH 1/2] mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked Nicholas Piggin 2016-12-25 5:13 ` Hugh Dickins 2016-12-25 3:00 ` [PATCH 2/2] mm: add PageWaiters indicating tasks are waiting for a page bit Nicholas Piggin 2016-12-25 21:51 ` Linus Torvalds 2016-12-26 1:16 ` Nicholas Piggin [this message] 2016-12-26 19:07 ` Linus Torvalds 2016-12-27 11:19 ` Nicholas Piggin 2016-12-27 18:58 ` Linus Torvalds 2016-12-27 19:23 ` Linus Torvalds 2016-12-27 19:24 ` Linus Torvalds 2016-12-27 19:40 ` Linus Torvalds 2016-12-27 20:17 ` Linus Torvalds 2016-12-28 3:53 ` Nicholas Piggin 2016-12-28 19:17 ` Linus Torvalds 2016-12-29 4:08 ` Nicholas Piggin 2016-12-29 4:16 ` Linus Torvalds 2016-12-29 5:26 ` Nicholas Piggin 2017-01-03 10:24 ` Mel Gorman 2017-01-03 12:29 ` Nicholas Piggin 2017-01-03 17:18 ` Mel Gorman 2016-12-29 22:16 ` [PATCH] mm/filemap: fix parameters to test_bit() Olof Johansson
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20161226111654.76ab0957@roar.ozlabs.ibm.com \ --to=npiggin@gmail.com \ --cc=agruenba@redhat.com \ --cc=dave.hansen@linux.intel.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=luto@kernel.org \ --cc=mgorman@techsingularity.net \ --cc=peterz@infradead.org \ --cc=rpeterso@redhat.com \ --cc=swhiteho@redhat.com \ --cc=torvalds@linux-foundation.org \ --subject='Re: [PATCH 2/2] mm: add PageWaiters indicating tasks are waiting for a page bit' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).