linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Andreas Gruenbacher <agruenba@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Bob Peterson <rpeterso@redhat.com>,
	Steven Whitehouse <swhiteho@redhat.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	linux-mm <linux-mm@kvack.org>
Subject: Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit
Date: Wed, 26 Oct 2016 09:32:16 -0700	[thread overview]
Message-ID: <CA+55aFzB2C0aktFZW3GquJF6dhM1904aDPrv4vdQ8=+mWO7jcg@mail.gmail.com> (raw)
In-Reply-To: <CALCETrUt+4ojyscJT1AFN5Zt3mKY0rrxcXMBOUUJzzLMWXFXHg@mail.gmail.com>

On Wed, Oct 26, 2016 at 8:51 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>>
>> I get the following BUG with 4.9-rc2, CONFIG_VMAP_STACK and
>> CONFIG_DEBUG_VIRTUAL turned on:
>>
>>   kernel BUG at arch/x86/mm/physaddr.c:26!
>
> const struct zone *zone = page_zone(virt_to_page(word));
>
> If the stack is vmalloced, then you can't find the page's zone like
> that.  We could look it up the slow way (ick!), but maybe another
> solution would be to do:

Christ. It's that damn bit-wait craziness again with the idiotic zone lookup.

I complained about it a couple of weeks ago for entirely unrelated
reasons: it absolutely sucks donkey ass through a straw from a cache
standpoint too. It makes the page_waitqueue() thing very expensive, to
the point where it shows up as taking up 3% of CPU time on a real
load.,

PeterZ had a patch that fixed most of the performance trouble because
the page_waitqueue is actually never realistically contested, and by
making the bit-waiting use *two* bits you can avoid the slow-path cost
entirely.

But here we have a totally different issue, namely that we want to
wait on a virtual address.

Quite frankly, I think the solution is to just rip out all the insane
zone crap. The most important use (by far) for the bit-waitqueue is
for the page locking, and with the "use a second bit to show
contention", there is absolutely no reason to try to do some crazy
per-zone thing. It's a slow-path that never matters, and rather than
make things scale well, the only thing it does is to pretty much
guarantee at least one extra cache miss.

Adding MelG and the mm list to the cc (PeterZ was already there) here
just for the heads up.

                   Linus

  reply	other threads:[~2016-10-26 16:32 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-26 12:51 CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit Andreas Gruenbacher
2016-10-26 15:51 ` Andy Lutomirski
2016-10-26 16:32   ` Linus Torvalds [this message]
2016-10-26 17:15     ` Linus Torvalds
2016-10-26 17:58       ` Linus Torvalds
2016-10-26 18:04         ` Bob Peterson
2016-10-26 18:10           ` Linus Torvalds
2016-10-26 19:11             ` Bob Peterson
2016-10-26 21:01             ` Bob Peterson
2016-10-26 21:30               ` Linus Torvalds
2016-10-26 22:45                 ` Borislav Petkov
2016-10-26 23:13               ` Borislav Petkov
2016-10-27  0:37                 ` Bob Peterson
2016-10-27 12:36                   ` Borislav Petkov
2016-10-27 18:51                     ` Bob Peterson
2016-10-27 19:19                       ` Borislav Petkov
2016-10-27 21:03                         ` Bob Peterson
2016-10-27 21:19                           ` Borislav Petkov
2016-10-28  8:37                     ` [tip:x86/urgent] x86/microcode/AMD: Fix more fallout from CONFIG_RANDOMIZE_MEMORY=y tip-bot for Borislav Petkov
2016-10-26 20:31       ` CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit Mel Gorman
2016-10-26 21:26         ` Linus Torvalds
2016-10-26 22:03           ` Mel Gorman
2016-10-26 22:09             ` Linus Torvalds
2016-10-26 23:07               ` Mel Gorman
2016-10-27  8:08                 ` Peter Zijlstra
2016-10-27  9:07                   ` Mel Gorman
2016-10-27  9:44                     ` Peter Zijlstra
2016-10-27  9:59                       ` Mel Gorman
2016-10-27 11:56                   ` Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+55aFzB2C0aktFZW3GquJF6dhM1904aDPrv4vdQ8=+mWO7jcg@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=agruenba@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=peterz@infradead.org \
    --cc=rpeterso@redhat.com \
    --cc=swhiteho@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).