* Re: [PATCH 00/31 v2] PTI support for x86_32
@ 2018-02-20  3:45 David H. Gutteridge
  2018-02-20  8:40 ` Joerg Roedel
  0 siblings, 1 reply; 66+ messages in thread
From: David H. Gutteridge @ 2018-02-20  3:45 UTC (permalink / raw)
  To: joro, linux-kernel

On 09/02/18 10:25, Joerg Roedel wrote:
> Hi,
> 
> here is the second version of my PTI implementation for
> x86_32, based on tip/x86-pti-for-linus. It took a lot longer
> than I had hoped, but there have been a number of obstacles
> on the way. It also isn't the small patch-set anymore that v1
> was, but compared to it this one actually works :)
[...]
> I do not claim that I've found the best solution for every
> problem I encountered, so please review and give me feedback
> on what I should change or solve differently. Of course I am
> also interested in all bugs that may still be in there.
>
> Thanks a lot,
>
>        Joerg

Hello,

I thought I'd try my hand at testing this patch set from an end user's
perspective. I built a test kernel based on Fedora's
config-4.15.2-300.fc27.i686+PAE, the only change obviously being the
addition of CONFIG_PAGE_TABLE_ISOLATION=y. I ran this kernel in two
test environments: an LG X110 netbook, which has an Atom N270 with 1GB
of RAM (booted with "pti=on"), and a QEMU VM emulating a quad Core i7
Nehalem setup. (The X110 is the only i686 hardware I had on hand I
could practically use. I figured it'd be a suitable low-end hardware
spec to work with, even though no one realistically would force-enable
PTI on it.)

Testing consisted in part of using the laptop's Mate session to
remotely render the VM's Xfce session, while both had PTI enabled on
their test kernels. The VM also successfully ran the basic kernel
tests and the performance test suite that Fedora provides for
community testing (https://pagure.io/kernel-tests.git). (Well, it had
a hiccup with the performance testing, but that's apparently unrelated
to the PTI patches.) The laptop was also used for various everyday
activities, like web browsing using Firefox, and document editing
using LibreOffice Writer. (It obviously isn't a star at this, but it
was usable.)

General results:

X110: no issues whatsoever. (I was actually expecting more of a
noticeable performance hit in some aspects.)

QEMU VM: I encountered two similar issues:

(1) There is a regression when the QXL display driver is enabled; the
VM hangs during boot. (QXL has been a source of similar trouble in the
past.) I don't have an example trace for it at present.

(2) There is a regression when the VGA display driver is enabled; it
intermittently (but reproducibly) faults, which makes it impossible
to boot to the graphical login manager.

[   25.430588] [drm] Found bochs VGA, ID 0xb0c0.
[   25.431212] [drm] Framebuffer size 16384 kB @ 0xfd000000, mmio @ 0xfebd4000.
[   25.432586] [TTM] Zone  kernel: Available graphics memory: 426476 kiB
[   25.433099] [TTM] Zone highmem: Available graphics memory: 1549744 kiB
[   25.433890] [TTM] Initializing pool allocator
[   25.434863] [TTM] Initializing DMA pool allocator
[   25.436767] ------------[ cut here ]------------
[   25.439213] kernel BUG at arch/x86/mm/fault.c:268!
[   25.439218] invalid opcode: 0000 [#1] SMP PTI
[   25.439218] Modules linked in: bochs_drm(+) ttm snd_hda_core drm_kms_helper snd_hwdep drm snd_seq snd_seq_device snd_pcm snd_timer snd pcspkr virtio_balloon i2c_piix4 soundcore virtio_console 8139too crc32c_intel virtio_pci virtio_ring serio_raw virtio 8139cp ata_generic mii pata_acpi floppy qemu_fw_cfg
[   25.439236] CPU: 1 PID: 545 Comm: systemd-udevd Tainted: G        W        4.15.0+ #1
[   25.439237] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   25.439241] EIP: vmalloc_fault+0x1e7/0x210
[   25.439242] EFLAGS: 00010083 CPU: 1
[   25.439243] EAX: 02788000 EBX: d78ecdf8 ECX: 00000080 EDX: 00000000
[   25.439244] ESI: 000fd000 EDI: fd0000f3 EBP: f3f639a0 ESP: f3f63988
[   25.439245]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[   25.439246] CR0: 80050033 CR2: f7e00000 CR3: 33e3a000 CR4: 000006f0
[   25.439249] Call Trace:
[   25.439254]  ? kvm_async_pf_task_wake+0x100/0x100
[   25.439256]  __do_page_fault+0x34d/0x4d0
[   25.439257]  ? __ioremap_caller+0x23a/0x3d0
[   25.439259]  ? kvm_async_pf_task_wake+0x100/0x100
[   25.439260]  do_page_fault+0x27/0xe0
[   25.439261]  ? kvm_async_pf_task_wake+0x100/0x100
[   25.439263]  do_async_page_fault+0x55/0x80
[   25.439265]  common_exception+0xef/0xf6
[   25.439268] EIP: memset+0xb/0x20
[   25.439268] EFLAGS: 00010206 CPU: 1
[   25.439269] EAX: 00000000 EBX: f7e00000 ECX: 00300000 EDX: 00000000
[   25.439270] ESI: f3f63b5c EDI: f7e00000 EBP: f3f63a58 ESP: f3f63a50
[   25.439271]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[   25.439278]  ttm_bo_move_memcpy+0x47c/0x4a0 [ttm]
[   25.439283]  ttm_bo_handle_move_mem+0x55a/0x580 [ttm]
[   25.439286]  ? ttm_bo_mem_space+0x394/0x460 [ttm]
[   25.439290]  ttm_bo_validate+0x116/0x130 [ttm]
[   25.439294]  bochs_bo_pin+0xa1/0x170 [bochs_drm]
[   25.439297]  bochsfb_create+0xce/0x310 [bochs_drm]
[   25.439308]  __drm_fb_helper_initial_config_and_unlock+0x1cc/0x460 [drm_kms_helper]
[   25.439314]  drm_fb_helper_initial_config+0x35/0x40 [drm_kms_helper]
[   25.439317]  bochs_fbdev_init+0x74/0x80 [bochs_drm]
[   25.439319]  bochs_load+0x7a/0x90 [bochs_drm]
[   25.439333]  drm_dev_register+0x133/0x1b0 [drm]
[   25.439343]  drm_get_pci_dev+0x86/0x160 [drm]
[   25.439346]  bochs_pci_probe+0xcb/0x110 [bochs_drm]
[   25.439348]  ? bochs_load+0x90/0x90 [bochs_drm]
[   25.439351]  pci_device_probe+0xc7/0x160
[   25.439353]  driver_probe_device+0x2dc/0x460
[   25.439354]  __driver_attach+0x99/0xe0
[   25.439356]  ? driver_probe_device+0x460/0x460
[   25.439357]  bus_for_each_dev+0x5a/0xa0
[   25.439359]  driver_attach+0x19/0x20
[   25.439360]  ? driver_probe_device+0x460/0x460
[   25.439362]  bus_add_driver+0x187/0x230
[   25.439363]  ? 0xf7afa000
[   25.439364]  driver_register+0x56/0xd0
[   25.439365]  ? 0xf7afa000
[   25.439367]  __pci_register_driver+0x3a/0x40
[   25.439369]  bochs_init+0x41/0x1000 [bochs_drm]
[   25.439371]  do_one_initcall+0x49/0x170
[   25.439373]  ? _cond_resched+0x2a/0x40
[   25.439375]  ? kmem_cache_alloc_trace+0x175/0x1e0
[   25.439376]  ? do_init_module+0x21/0x1dc
[   25.439378]  ? do_init_module+0x21/0x1dc
[   25.439379]  do_init_module+0x50/0x1dc
[   25.439380]  load_module+0x1fce/0x28e0
[   25.439383]  SyS_finit_module+0x8a/0xe0
[   25.439385]  do_fast_syscall_32+0x81/0x1b0
[   25.439518]  entry_SYSENTER_32+0x5f/0xb9
[   25.439519] EIP: 0xb7f21cf9
[   25.439520] EFLAGS: 00000246 CPU: 1
[   25.439521] EAX: ffffffda EBX: 00000011 ECX: b7afae75 EDX: 00000000
[   25.439522] ESI: 019d5740 EDI: 019acc00 EBP: 019ade00 ESP: bff9bb4c
[   25.439524]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[   25.439525] Code: e2 00 f0 1f 00 81 ea 00 00 20 00 21 d0 8b 55 e8 89 c6 81 e2 ff 0f 00 00 0f ac d6 0c 8d 04 b6 c1 e0 03 39 45 ec 0f 84 27 ff ff ff <0f> 0b 8d b4 26 00 00 00 00 83 c4 0c ba ff ff ff ff 5b 89 d0 5e
[   25.439547] EIP: vmalloc_fault+0x1e7/0x210 SS:ESP: 0068:f3f63988
[   25.439548] ---[ end trace 18f2d11043a28ec0 ]---
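
For anyone reading the trace: each frame has the form
symbol+offset/size, and a leading "?" marks an address that was found
on the stack but that the unwinder could not confirm as part of the
call chain. Below is a minimal user-space sketch of splitting one such
frame into its parts. The struct and function names are made up for
illustration and are not part of any kernel tool.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one decoded call-trace frame. */
struct trace_frame {
	char symbol[64];   /* function name, e.g. "vmalloc_fault" */
	unsigned offset;   /* byte offset of the address inside the function */
	unsigned size;     /* total size of the function in bytes */
	int reliable;      /* 0 if the frame was prefixed with "? " */
};

/*
 * Parse the text that follows the "[ timestamp]" prefix of a frame line.
 * Returns 1 on success and 0 if the text is not in symbol+0xoff/0xsize form
 * (for example a bare address such as "? 0xf7afa000").
 */
static int parse_trace_frame(const char *text, struct trace_frame *out)
{
	while (*text == ' ')
		text++;

	out->reliable = 1;
	if (strncmp(text, "? ", 2) == 0) {
		out->reliable = 0;
		text += 2;
	}

	return sscanf(text, "%63[^+ ]+0x%x/0x%x",
		      out->symbol, &out->offset, &out->size) == 3;
}
```

Fed the faulting frame "vmalloc_fault+0x1e7/0x210", this gives offset
0x1e7 within a function that is 0x210 bytes long.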

The Virtio and VMVGA display drivers both worked consistently for me.

I haven't tested a non-PAE kernel, but can do so if it's of interest.
Or I can provide further details or testing if need be. If so, please
CC me. I hope this is of some use.

Regards,

Dave

* [PATCH 00/31 v2] PTI support for x86_32
@ 2018-02-09  9:25 ` Joerg Roedel
  0 siblings, 0 replies; 66+ messages in thread
From: Joerg Roedel @ 2018-02-09  9:25 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H . Peter Anvin
  Cc: x86, linux-kernel, linux-mm, Linus Torvalds, Andy Lutomirski,
	Dave Hansen, Josh Poimboeuf, Juergen Gross, Peter Zijlstra,
	Borislav Petkov, Jiri Kosina, Boris Ostrovsky, Brian Gerst,
	David Laight, Denys Vlasenko, Eduardo Valentin, Greg KH,
	Will Deacon, aliguori, daniel.gruss, hughd, keescook,
	Andrea Arcangeli, Waiman Long, Pavel Machek, jroedel, joro

Hi,

here is the second version of my PTI implementation for
x86_32, based on tip/x86-pti-for-linus. It took a lot longer
than I had hoped, but there have been a number of obstacles
on the way. It also isn't the small patch-set anymore that v1
was, but compared to it this one actually works :)

The biggest changes were necessary in the entry code. A lot
of it is moving code around, but there are also significant
changes to get all cases covered. This includes NMIs and
exceptions on the kernel exit-path where we are already on
the entry-stack. To make this work I decided to mostly split
up the common kernel-exit path into a return-to-kernel,
return-to-user and return-from-nmi part.

On the page-table side I had to do a lot of special cases
for PAE because PAE paging is so, well, special. The biggest
example here is the LDT mapping code, which needs to work on
the PMD level instead of PGD when PAE is enabled.
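
As background for the PAE remark: under PAE a 32-bit virtual address
splits into a 2-bit PGD index, a 9-bit PMD index, a 9-bit PTE index
and a 12-bit page offset, so the top level has only four entries of
1 GiB each and anything finer-grained has to be handled one level
down. A minimal user-space sketch of that split follows; the struct
and function names are hypothetical, not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not a kernel structure. */
struct pae_indices {
	unsigned pgd;     /* bits 31..30: 4 top-level entries, 1 GiB each */
	unsigned pmd;     /* bits 29..21: 512 entries, 2 MiB each */
	unsigned pte;     /* bits 20..12: 512 entries, 4 KiB each */
	unsigned offset;  /* bits 11..0: byte offset within the page */
};

/* Split a 32-bit virtual address into its PAE page-table indices. */
static struct pae_indices pae_split(uint32_t vaddr)
{
	struct pae_indices idx;

	idx.pgd = vaddr >> 30;
	idx.pmd = (vaddr >> 21) & 0x1ff;
	idx.pte = (vaddr >> 12) & 0x1ff;
	idx.offset = vaddr & 0xfff;

	/* Sanity check: the four fields must recombine to the input. */
	assert((((uint32_t)idx.pgd << 30) | ((uint32_t)idx.pmd << 21) |
		((uint32_t)idx.pte << 12) | idx.offset) == vaddr);
	return idx;
}
```

For example, pae_split(0xc0000000) yields PGD index 3 with all other
fields zero, i.e. with the default 3G/1G split the kernel range starts
at the last of the four top-level entries.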

During development I also experimented with unshared PMDs
between the kernel and the user page-tables for PAE. It
worked by allocating 8k PMDs and using the lower half for
the kernel and the upper half for the user page-table. While
this worked and allowed me to NX-protect the user-space
address-range in the kernel page-table, it also required 5
order-1 allocations in low-mem for each process. In my
testing I got this to fail pretty quickly and trigger OOM,
so I abandoned the approach for now.
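
To illustrate the lower-half/upper-half layout: with an 8k-aligned
allocation, the two halves differ only in bit 12 of the address, so
converting between them is a single bit operation. This is the same
idea the 64-bit PTI code uses for its PGD pair. The sketch below is
plain user-space C assuming 4 KiB pages; all names in it are
hypothetical and it is not the code from the abandoned experiment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Allocate two contiguous pages aligned to 8 KiB, the user-space
 * analogue of an order-1 allocation. Returns NULL on failure.
 */
static void *alloc_table_pair(void)
{
	void *p = aligned_alloc(2 * SKETCH_PAGE_SIZE, 2 * SKETCH_PAGE_SIZE);

	/* The bit trick below only works if the pair is 8 KiB aligned. */
	assert(p == NULL || ((uintptr_t)p & (2 * SKETCH_PAGE_SIZE - 1)) == 0);
	return p;
}

/* Lower half (kernel table) -> same entry in the upper half (user table). */
static void *kernel_to_user_half(void *kernel_ptr)
{
	return (void *)((uintptr_t)kernel_ptr | SKETCH_PAGE_SIZE);
}

/* Upper half (user table) -> same entry in the lower half (kernel table). */
static void *user_to_kernel_half(void *user_ptr)
{
	return (void *)((uintptr_t)user_ptr & ~SKETCH_PAGE_SIZE);
}
```

The cost of this convenience is the one described above: every such
pair needs two physically contiguous pages.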

Here is how I tested these patches:

	* Booted on a real machine (4C/8T, 16GB RAM) and ran
	  an overnight load-test with 'perf top' running
	  (for the NMIs), the ldt_gdt selftest running in a
	  loop (for more stress on the entry/exit path) and
	  a -j16 kernel compile also running in a loop. The
	  box survived the test, which ran for more than 18
	  hours.

	* Tested most x86 selftests in the kernel on the
	  real machine. This showed no regressions. I did
	  not run the mpx and protection-key tests, as the
	  machine does not support these features, and I
	  also skipped the check_initial_reg_state test, as
	  it caused problems while compiling and it didn't
	  seem relevant enough to fix that for this
	  patch-set.

	* Boot tested all valid combinations of [NO]HIGHMEM* vs.
	  VMSPLIT* vs. PAE in KVM. All booted fine.

	* Did compile-tests with various configs (allyes,
	  allmod, defconfig, ..., basically what I usually
	  use to test the iommu-tree as well). All compiled
	  fine.

	* Some basic compile, boot and runtime testing of
	  64 bit to make sure I didn't break anything there.

I did not explicitly test wine and dosemu, but since the
vm86 and the ldt_gdt self-tests all passed fine I am
confident that those will also still work.

XENPV is also untested from my side, but I added checks to
not do the stack switches in the entry-code when XENPV is
enabled, so hopefully it works. But someone should test it,
of course.

I also pushed these patches to

	git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git pti-x32-v2

for easier testing.

I do not claim that I've found the best solution for every
problem I encountered, so please review and give me feedback
on what I should change or solve differently. Of course I am
also interested in all bugs that may still be in there.

Thanks a lot,

       Joerg


Joerg Roedel (31):
  x86/asm-offsets: Move TSS_sp0 and TSS_sp1 to asm-offsets.c
  x86/entry/32: Rename TSS_sysenter_sp0 to TSS_entry_stack
  x86/entry/32: Load task stack from x86_tss.sp1 in SYSENTER handler
  x86/entry/32: Put ESPFIX code into a macro
  x86/entry/32: Unshare NMI return path
  x86/entry/32: Split off return-to-kernel path
  x86/entry/32: Restore segments before int registers
  x86/entry/32: Enter the kernel via trampoline stack
  x86/entry/32: Leave the kernel via trampoline stack
  x86/entry/32: Introduce SAVE_ALL_NMI and RESTORE_ALL_NMI
  x86/entry/32: Add PTI cr3 switches to NMI handler code
  x86/entry/32: Add PTI cr3 switch to non-NMI entry/exit points
  x86/entry/32: Handle Entry from Kernel-Mode on Entry-Stack
  x86/pgtable/pae: Unshare kernel PMDs when PTI is enabled
  x86/pgtable/32: Allocate 8k page-tables when PTI is enabled
  x86/pgtable: Move pgdp kernel/user conversion functions to pgtable.h
  x86/pgtable: Move pti_set_user_pgd() to pgtable.h
  x86/pgtable: Move two more functions from pgtable_64.h to pgtable.h
  x86/mm/pae: Populate valid user PGD entries
  x86/mm/pae: Populate the user page-table with user pgd's
  x86/mm/legacy: Populate the user page-table with user pgd's
  x86/mm/pti: Add an overflow check to pti_clone_pmds()
  x86/mm/pti: Define X86_CR3_PTI_PCID_USER_BIT on x86_32
  x86/mm/pti: Clone CPU_ENTRY_AREA on PMD level on x86_32
  x86/mm/dump_pagetables: Define INIT_PGD
  x86/pgtable/pae: Use separate kernel PMDs for user page-table
  x86/ldt: Reserve address-space range on 32 bit for the LDT
  x86/ldt: Define LDT_END_ADDR
  x86/ldt: Split out sanity check in map_ldt_struct()
  x86/ldt: Enable LDT user-mapping for PAE
  x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32

 arch/x86/entry/entry_32.S                   | 581 ++++++++++++++++++++++------
 arch/x86/include/asm/mmu_context.h          |   4 -
 arch/x86/include/asm/pgtable-2level.h       |   9 +
 arch/x86/include/asm/pgtable-2level_types.h |   3 +
 arch/x86/include/asm/pgtable-3level.h       |   7 +
 arch/x86/include/asm/pgtable-3level_types.h |   6 +-
 arch/x86/include/asm/pgtable.h              |  88 +++++
 arch/x86/include/asm/pgtable_32_types.h     |   9 +-
 arch/x86/include/asm/pgtable_64.h           |  85 ----
 arch/x86/include/asm/pgtable_64_types.h     |   4 +
 arch/x86/include/asm/pgtable_types.h        |  26 +-
 arch/x86/include/asm/processor-flags.h      |   8 +-
 arch/x86/include/asm/switch_to.h            |   6 +-
 arch/x86/kernel/asm-offsets.c               |   5 +
 arch/x86/kernel/asm-offsets_32.c            |   2 +-
 arch/x86/kernel/asm-offsets_64.c            |   2 -
 arch/x86/kernel/cpu/common.c                |   5 +-
 arch/x86/kernel/head_32.S                   |  20 +-
 arch/x86/kernel/ldt.c                       | 137 +++++--
 arch/x86/kernel/process.c                   |   2 -
 arch/x86/kernel/process_32.c                |  10 +-
 arch/x86/mm/dump_pagetables.c               |  21 +-
 arch/x86/mm/pgtable.c                       | 105 ++++-
 arch/x86/mm/pti.c                           |  24 ++
 security/Kconfig                            |   2 +-
 25 files changed, 888 insertions(+), 283 deletions(-)

-- 
2.7.4

