linux-kernel.vger.kernel.org archive mirror
From: Andrew Lutomirski <amluto@gmail.com>
To: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
	"H. Peter Anvin" <hpa@linux.intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	Alexander van Heukelum <heukelum@fastmail.fm>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Arjan van de Ven <arjan.van.de.ven@intel.com>,
	Brian Gerst <brgerst@gmail.com>,
	Alexandre Julliard <julliard@winehq.com>,
	Andi Kleen <andi@firstfloor.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH] x86-64: espfix for 64-bit mode *PROTOTYPE*
Date: Tue, 22 Apr 2014 09:03:55 -0700
Message-ID: <CAObL_7FGs4n6zusbdwTLi5W5q2V81Sf7pOnOmHPFyv5d7jMfvA@mail.gmail.com>
In-Reply-To: <20140422144659.GF15882@pd.tnic>

On Tue, Apr 22, 2014 at 7:46 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Tue, Apr 22, 2014 at 01:23:12PM +0200, Borislav Petkov wrote:
>> I wonder if it would be workable to use a bit in the espfix PGD to
>> denote that it has been initialized already... I hear, near NX there's
>> some room :-)
>
> Ok, I realized this won't work when I hit send... Oh well.
>
> Anyway, another dumb idea: have we considered making this lazy? I.e.,
> preallocate pages to fit the stack of NR_CPUS after smp init is done but
> not set up the percpu espfix stack. Only do that in espfix_fix_stack the
> first time we land there and haven't been set up yet on this cpu.
>
> This should cover the 1% out there who still use 16-bit segments and the
> rest simply don't use it and get to save themselves the PT-walk in
> start_secondary().
>
> Hmmm...

I'm going to try to do the math to see what's actually going on.

Each 4G slice contains 64kB of ministacks, which corresponds to 1024
ministacks.  Virtual addresses are divided up as:

12 bits (0..11): address within page.
9 bits (12..20): identifies the PTE within the level 1 directory
9 bits (21..29): identifies the level 1 directory (pmd?) within the
level 2 directory
9 bits (30..38): identifies the level 2 directory (pud) within the
level 3 directory
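
To make the bit layout above concrete, here is a small userspace sketch
that pulls the page offset and the three table indices out of a virtual
address.  It assumes 4 kB pages and 9 index bits per level, as in the
list above; VA_PAGE_SHIFT, VA_INDEX_BITS, page_offset() and
table_index() are names made up for this example, not kernel
identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants: 4 kB pages, 9 index bits per table level. */
#define VA_PAGE_SHIFT 12
#define VA_INDEX_BITS 9
#define VA_INDEX_MASK ((1u << VA_INDEX_BITS) - 1)

/* Bits 0..11: byte offset within the page. */
static inline unsigned page_offset(uint64_t va)
{
	return (unsigned)(va & ((1u << VA_PAGE_SHIFT) - 1));
}

/*
 * Index into the table at the given level:
 *   level 1 -> bits 12..20, level 2 -> bits 21..29, level 3 -> bits 30..38.
 */
static inline unsigned table_index(uint64_t va, unsigned level)
{
	assert(level >= 1 && level <= 3);
	return (unsigned)((va >> (VA_PAGE_SHIFT + VA_INDEX_BITS * (level - 1)))
			  & VA_INDEX_MASK);
}
```

With this layout, two addresses that are 4G apart differ by 4 in the
level 3 index (bits 30..38) and are identical in the lower fields.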

Critically, each group of 1024 CPUs can share the same level 1
directory -- there are just a bunch of copies of the same thing in there.
Similarly, they can share the same level 2 directory, and each slot in
that directory will point to the same level 1 directory.

For the level 3 directory, there is only one globally.  It needs 8
entries per 1024 CPUs.

I imagine there's a scalability problem here, too: it's okay if each
of a very large number of CPUs waits while shared structures are
allocated, but owners of big systems won't like it if they all
serialize on the way out.

So maybe it would make sense to refactor this into two separate
functions.  First, before we start the first non-boot CPU:

static pte_t *slice_pte_tables[DIV_ROUND_UP(NR_CPUS, 1024)];
/* allocate and initialize them all */

It might even make sense to do this at build time instead of run time.
I can't imagine that parallelizing this would provide any benefit
unless it were done *very* carefully and there were hundreds of
thousands of CPUs.  At worst, we're wasting 4 bytes per CPU not
present.

Then, for the per-CPU part, have one init-once structure (please tell
me the kernel has one of these) per 64 possible CPUs.  Each CPU will
make sure that its group of 64 CPUs is initialized, using the init-once
mechanism, and then it will set its percpu variable accordingly.
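
A rough userspace sketch of that per-group init-once idea, written with
C11 atomics rather than any kernel primitive.  Everything here is an
assumption for illustration: MAX_CPUS stands in for NR_CPUS,
init_group() is a placeholder for whatever allocation and mapping the
group needs, and a real kernel version would more likely sleep on a
mutex than spin.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_CPUS       4096	/* hypothetical stand-in for NR_CPUS */
#define CPUS_PER_GROUP 64
#define NR_GROUPS      ((MAX_CPUS + CPUS_PER_GROUP - 1) / CPUS_PER_GROUP)

enum { GROUP_UNINIT = 0, GROUP_BUSY = 1, GROUP_READY = 2 };

/* Static atomics are zero-initialized, i.e. every group starts UNINIT. */
static atomic_int group_state[NR_GROUPS];
static int group_init_calls[NR_GROUPS];	/* only to show init runs once */

/* Placeholder for the real per-group setup work. */
static void init_group(unsigned group)
{
	group_init_calls[group]++;
}

/*
 * Make sure the group that 'cpu' belongs to is initialized exactly once,
 * then return the group number.  The first caller wins the
 * UNINIT -> BUSY transition and does the work; everyone else waits
 * until the state becomes READY.
 */
static unsigned ensure_group_ready(unsigned cpu)
{
	unsigned group = cpu / CPUS_PER_GROUP;
	int expected = GROUP_UNINIT;

	assert(cpu < MAX_CPUS);

	if (atomic_compare_exchange_strong(&group_state[group], &expected,
					   GROUP_BUSY)) {
		init_group(group);
		atomic_store(&group_state[group], GROUP_READY);
	} else {
		while (atomic_load(&group_state[group]) != GROUP_READY)
			;	/* spin until the winner finishes */
	}
	return group;
}
```

Only CPUs in the same group of 64 ever contend on the same state word,
so the waiting is bounded by the group size rather than by the total
CPU count.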

There are only 64 CPUs per slice, so mutexes may not be so bad here.

--Andy
