linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/11] Use global pages with PTI
@ 2018-03-23 17:44 Dave Hansen
From: Dave Hansen @ 2018-03-23 17:44 UTC
  To: linux-kernel
  Cc: linux-mm, Dave Hansen, aarcange, luto, torvalds, keescook, hughd,
	jgross, x86, namit

The later versions of the KAISER patches (pre-PTI) allowed the user/kernel
shared areas to be GLOBAL.  The thought was that this would reduce the
TLB overhead of keeping two copies of these mappings.

During the switch over to PTI, we seem to have lost our ability to have
GLOBAL mappings.  This adds them back.

This adds one major change from the last version of the patch set
(present in the last patch).  It makes all kernel text global for non-
PCID systems.  This keeps kernel data protected always, but means that
it will be easier to find kernel gadgets via Meltdown on old systems
without PCIDs.  This heuristic is, I think, a reasonable one and it
keeps us from having to create any new pti=foo options.

Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <keescook@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: x86@kernel.org
Cc: Nadav Amit <namit@vmware.com>

* [PATCH 00/11] [v3] Use global pages with PTI
@ 2018-04-02 17:27 Dave Hansen
From: Dave Hansen @ 2018-04-02 17:27 UTC
  To: linux-kernel
  Cc: linux-mm, Dave Hansen, aarcange, luto, torvalds, keescook, hughd,
	jgross, x86, namit

Changes from v2:

 * Add performance numbers to changelogs
 * Fix compile error resulting from use of x86-specific
   __default_kernel_pte_mask in arch-generic mm/early_ioremap.c
 * Delay kernel text cloning until after we are done messing
   with it (patch 11).
 * Blacklist K8 explicitly from mapping all kernel text as
   global (this should never happen because K8 does not use
   pti when pti=auto, but we err on the safe side). (patch 11)
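
The heuristic discussed in this thread (all kernel text stays global only
on systems without PCIDs, with K8 excluded explicitly) can be sketched as a
small predicate.  This is an illustration under stated assumptions, not the
code from patch 11: `struct cpu_info` and `kernel_text_global_ok()` are
hypothetical names, and the real kernel queries CPU feature bits rather
than a struct like this.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the CPU properties the heuristic looks at. */
struct cpu_info {
	bool has_pcid;	/* CPU supports process-context identifiers */
	bool is_k8;	/* AMD K8, blacklisted explicitly to be safe */
};

/*
 * Return true if all kernel text may be left global.  Kernel data is
 * never covered by this decision; only text is considered.
 */
bool kernel_text_global_ok(const struct cpu_info *cpu)
{
	/* With PCIDs the TLB cost is already low, so keep text hidden. */
	if (cpu->has_pcid)
		return false;

	/* K8 should not be running PTI with pti=auto; refuse anyway. */
	if (cpu->is_k8)
		return false;

	return true;
}
```

A caller would fill in `struct cpu_info` once at boot and consult the
predicate before cloning kernel text into the user page tables.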

--

The later versions of the KAISER patches (pre-PTI) allowed the
user/kernel shared areas to be GLOBAL.  The thought was that this would
reduce the TLB overhead of keeping two copies of these mappings.

During the switch over to PTI, we seem to have lost our ability to have
GLOBAL mappings.  This adds them back.

To measure the benefits of this, I took a modern Atom system without
PCIDs and ran a microbenchmark[1] (higher is better):

No Global Lines (baseline  ): 6077741 lseeks/sec
88 Global Lines (kern entry): 7528609 lseeks/sec (+23.9%)
94 Global Lines (all ktext ): 8433111 lseeks/sec (+38.8%)

On a modern Skylake desktop with PCIDs, the benefits are tangible, but not
huge:

No Global pages (baseline): 15783951 lseeks/sec
28 Global pages (this set): 16054688 lseeks/sec
                             +270737 lseeks/sec (+1.71%)

I also double-checked with a kernel compile on the Skylake system (lower
is better):

No Global pages (baseline): 186.951 seconds time elapsed  ( +-  0.35% )
28 Global pages (this set): 185.756 seconds time elapsed  ( +-  0.09% )
                             -1.195 seconds (-0.64%)

1. https://github.com/antonblanchard/will-it-scale/blob/master/tests/lseek1.c
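
For readers who want a feel for what an lseeks/sec figure measures, here is
a minimal user-space sketch in the same spirit as the linked test.  It is
not the will-it-scale code: `open_scratch_file()` and `lseeks_per_sec()`
are hypothetical helpers, the scratch path under /tmp is an assumption, and
a single-threaded loop like this will not reproduce the numbers above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Create an anonymous scratch file; returns a descriptor or -1. */
int open_scratch_file(void)
{
	char path[] = "/tmp/lseek-sketch-XXXXXX";
	int fd = mkstemp(path);

	if (fd >= 0)
		unlink(path);	/* the file lives on until fd is closed */
	return fd;
}

/*
 * Issue `iterations` cheap lseek() system calls on `fd` and return the
 * measured rate in calls per second, or -1.0 on any failure.  Each call
 * is a kernel entry and exit, which is the path PTI makes more expensive.
 */
double lseeks_per_sec(int fd, uint64_t iterations)
{
	struct timespec start, end;
	double secs;

	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return -1.0;

	for (uint64_t i = 0; i < iterations; i++) {
		if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
			return -1.0;
	}

	if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
		return -1.0;

	secs = (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
	return secs > 0.0 ? (double)iterations / secs : -1.0;
}
```

Calling `lseeks_per_sec(open_scratch_file(), 10000000)` on kernels with and
without the series gives a rough comparison, subject to the usual noise of
an unpinned microbenchmark.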

Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <keescook@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: x86@kernel.org
Cc: Nadav Amit <namit@vmware.com>

* [PATCH 00/11] [v4] Use global pages with PTI
@ 2018-04-04  1:09 Dave Hansen
From: Dave Hansen @ 2018-04-04  1:09 UTC
  To: linux-kernel
  Cc: linux-mm, Dave Hansen, aarcange, luto, torvalds, keescook, hughd,
	jgross, x86, namit

Changes from v3:
 * Fix whitespace issue noticed by willy
 * Clarify comments about X86_FEATURE_PGE checks
 * Clarify commit message around the necessity of _PAGE_GLOBAL
   filtering when CR4.PGE=0 or PGE is unsupported.
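
As a rough illustration of what _PAGE_GLOBAL filtering means when PGE is
off or unsupported, the sketch below masks the global bit out of a set of
PTE flags.  The bit positions (present = bit 0, global = bit 8) follow the
x86 page table format; the names `supported_pte_mask()` and
`filter_pte_flags()` are hypothetical and only loosely modeled on the
kernel's mask variables.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pteval_t;

#define PAGE_PRESENT	((pteval_t)1 << 0)	/* x86 PTE bit 0 */
#define PAGE_RW		((pteval_t)1 << 1)	/* x86 PTE bit 1 */
#define PAGE_GLOBAL	((pteval_t)1 << 8)	/* x86 PTE bit 8 */

/* Build the mask of PTE bits that may be used on this CPU. */
pteval_t supported_pte_mask(bool pge_enabled)
{
	pteval_t mask = ~(pteval_t)0;

	/* Without PGE, never let the global bit reach a page table. */
	if (!pge_enabled)
		mask &= ~PAGE_GLOBAL;
	return mask;
}

/* Drop any requested flag that the mask does not allow. */
pteval_t filter_pte_flags(pteval_t flags, pteval_t mask)
{
	return flags & mask;
}
```

The point of a central mask is that callers can keep requesting
PAGE_GLOBAL unconditionally and rely on one place to strip it.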

Changes from v2:

 * Add performance numbers to changelogs
 * Fix compile error resulting from use of x86-specific
   __default_kernel_pte_mask in arch-generic mm/early_ioremap.c
 * Delay kernel text cloning until after we are done messing
   with it (patch 11).
 * Blacklist K8 explicitly from mapping all kernel text as
   global (this should never happen because K8 does not use
   pti when pti=auto, but we err on the safe side). (patch 11)

--

The later versions of the KAISER patches (pre-PTI) allowed the
user/kernel shared areas to be GLOBAL.  The thought was that this would
reduce the TLB overhead of keeping two copies of these mappings.

During the switch over to PTI, we seem to have lost our ability to have
GLOBAL mappings.  This adds them back.

To measure the benefits of this, I took a modern Atom system without
PCIDs and ran a microbenchmark[1] (higher is better):

No Global Lines (baseline  ): 6077741 lseeks/sec
88 Global Lines (kern entry): 7528609 lseeks/sec (+23.9%)
94 Global Lines (all ktext ): 8433111 lseeks/sec (+38.8%)

On a modern Skylake desktop with PCIDs, the benefits are tangible, but not
huge:

No Global pages (baseline): 15783951 lseeks/sec
28 Global pages (this set): 16054688 lseeks/sec
                             +270737 lseeks/sec (+1.71%)

I also double-checked with a kernel compile on the Skylake system (lower
is better):

No Global pages (baseline): 186.951 seconds time elapsed  ( +-  0.35% )
28 Global pages (this set): 185.756 seconds time elapsed  ( +-  0.09% )
                             -1.195 seconds (-0.64%)

1. https://github.com/antonblanchard/will-it-scale/blob/master/tests/lseek1.c

Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <keescook@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: x86@kernel.org
Cc: Nadav Amit <namit@vmware.com>

* [PATCH 00/11] [v5] Use global pages with PTI
@ 2018-04-06 20:55 Dave Hansen
From: Dave Hansen @ 2018-04-06 20:55 UTC
  To: linux-kernel
  Cc: linux-mm, Dave Hansen, aarcange, luto, torvalds, keescook, hughd,
	jgross, x86, namit

Changes from v4:
 * Fix compile error reported by Tom Lendacky
 * Avoid setting _PAGE_GLOBAL on non-present entries
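
The second item can be pictured with a small sketch: the global bit is
only set when the entry is present, and cleared otherwise.  One likely
motivation, stated here as an assumption, is that Linux on x86 gives the
same bit a different meaning in non-present entries.  The helper name
`set_global_if_present()` is hypothetical and this is not the code from
the series.

```c
#include <stdint.h>

typedef uint64_t pteval_t;

#define PAGE_PRESENT	((pteval_t)1 << 0)	/* x86 PTE bit 0 */
#define PAGE_GLOBAL	((pteval_t)1 << 8)	/* x86 PTE bit 8 */

/*
 * Request a global mapping only for present entries.  For non-present
 * entries the bit is cleared so it cannot be misread later.
 */
pteval_t set_global_if_present(pteval_t flags)
{
	if (flags & PAGE_PRESENT)
		return flags | PAGE_GLOBAL;
	return flags & ~PAGE_GLOBAL;
}
```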

Changes from v3:
 * Fix whitespace issue noticed by willy
 * Clarify comments about X86_FEATURE_PGE checks
 * Clarify commit message around the necessity of _PAGE_GLOBAL
   filtering when CR4.PGE=0 or PGE is unsupported.

Changes from v2:

 * Add performance numbers to changelogs
 * Fix compile error resulting from use of x86-specific
   __default_kernel_pte_mask in arch-generic mm/early_ioremap.c
 * Delay kernel text cloning until after we are done messing
   with it (patch 11).
 * Blacklist K8 explicitly from mapping all kernel text as
   global (this should never happen because K8 does not use
   pti when pti=auto, but we err on the safe side). (patch 11)

--

The later versions of the KAISER patches (pre-PTI) allowed the
user/kernel shared areas to be GLOBAL.  The thought was that this would
reduce the TLB overhead of keeping two copies of these mappings.

During the switch over to PTI, we seem to have lost our ability to have
GLOBAL mappings.  This adds them back.

To measure the benefits of this, I took a modern Atom system without
PCIDs and ran a microbenchmark[1] (higher is better):

No Global Lines (baseline  ): 6077741 lseeks/sec
88 Global Lines (kern entry): 7528609 lseeks/sec (+23.9%)
94 Global Lines (all ktext ): 8433111 lseeks/sec (+38.8%)

On a modern Skylake desktop with PCIDs, the benefits are tangible, but not
huge:

No Global pages (baseline): 15783951 lseeks/sec
28 Global pages (this set): 16054688 lseeks/sec
                             +270737 lseeks/sec (+1.71%)

I also double-checked with a kernel compile on the Skylake system (lower
is better):

No Global pages (baseline): 186.951 seconds time elapsed  ( +-  0.35% )
28 Global pages (this set): 185.756 seconds time elapsed  ( +-  0.09% )
                             -1.195 seconds (-0.64%)

1. https://github.com/antonblanchard/will-it-scale/blob/master/tests/lseek1.c

Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <keescook@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: x86@kernel.org
Cc: Nadav Amit <namit@vmware.com>


Thread overview: 42+ messages
2018-03-23 17:44 [PATCH 00/11] Use global pages with PTI Dave Hansen
2018-03-23 17:44 ` [PATCH 01/11] x86/mm: factor out pageattr _PAGE_GLOBAL setting Dave Hansen
2018-03-23 17:44 ` [PATCH 02/11] x86/mm: undo double _PAGE_PSE clearing Dave Hansen
2018-03-23 17:44 ` [PATCH 03/11] x86/mm: introduce "default" kernel PTE mask Dave Hansen
2018-03-23 17:44 ` [PATCH 04/11] x86/espfix: document use of _PAGE_GLOBAL Dave Hansen
2018-03-23 17:44 ` [PATCH 05/11] x86/mm: do not auto-massage page protections Dave Hansen
2018-03-23 19:15   ` Nadav Amit
2018-03-23 19:26     ` Dave Hansen
2018-03-23 19:34       ` Nadav Amit
2018-03-23 19:38         ` Dave Hansen
2018-03-24 15:10   ` kbuild test robot
2018-03-24 15:21   ` kbuild test robot
2018-03-23 17:44 ` [PATCH 06/11] x86/mm: remove extra filtering in pageattr code Dave Hansen
2018-03-23 17:44 ` [PATCH 07/11] x86/mm: comment _PAGE_GLOBAL mystery Dave Hansen
2018-03-23 17:44 ` [PATCH 08/11] x86/mm: do not forbid _PAGE_RW before init for __ro_after_init Dave Hansen
2018-03-23 17:45 ` [PATCH 09/11] x86/pti: enable global pages for shared areas Dave Hansen
2018-03-23 19:12   ` Nadav Amit
2018-03-23 19:36     ` Dave Hansen
2018-03-23 17:45 ` [PATCH 10/11] x86/pti: clear _PAGE_GLOBAL for kernel image Dave Hansen
2018-03-23 17:45 ` [PATCH 11/11] x86/pti: leave kernel text global for !PCID Dave Hansen
2018-03-23 18:26 ` [PATCH 00/11] Use global pages with PTI Linus Torvalds
2018-03-24  0:40   ` Dave Hansen
2018-03-24  0:46     ` Linus Torvalds
2018-03-24  0:54       ` Linus Torvalds
2018-03-24 11:05     ` Ingo Molnar
2018-03-27 13:36     ` Thomas Gleixner
2018-03-27 16:32       ` Dave Hansen
2018-03-27 17:51         ` Thomas Gleixner
2018-03-27 20:07           ` Ingo Molnar
2018-03-27 20:19             ` Dave Hansen
2018-03-29  0:17             ` Dave Hansen
2018-03-30 12:09               ` Ingo Molnar
2018-03-30 12:17                 ` Ingo Molnar
2018-03-30 20:26                   ` Dave Hansen
2018-03-30 20:32                     ` Thomas Gleixner
2018-03-30 21:40                       ` Dave Hansen
2018-03-31  5:39                         ` Ingo Molnar
2018-03-31 18:19                           ` Dave Hansen
2018-04-02 17:27 [PATCH 00/11] [v3] " Dave Hansen
2018-04-02 17:27 ` [PATCH 05/11] x86/mm: do not auto-massage page protections Dave Hansen
2018-04-04  1:09 [PATCH 00/11] [v4] Use global pages with PTI Dave Hansen
2018-04-04  1:09 ` [PATCH 05/11] x86/mm: do not auto-massage page protections Dave Hansen
2018-04-05 19:49   ` Tom Lendacky
2018-04-06 20:55 [PATCH 00/11] [v5] Use global pages with PTI Dave Hansen
2018-04-06 20:55 ` [PATCH 05/11] x86/mm: do not auto-massage page protections Dave Hansen
