* "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
@ 2018-01-03  8:36 Benjamin Gilbert
  2018-01-03  8:46 ` Benjamin Gilbert
  0 siblings, 1 reply; 59+ messages in thread
From: Benjamin Gilbert @ 2018-01-03  8:36 UTC (permalink / raw)
  To: linux-mm; +Cc: stable, Greg Kroah-Hartman


[-- Attachment #1.1: Type: text/plain, Size: 6124 bytes --]

Hi all,

In our regression tests on kernel 4.14.11, we're occasionally seeing a run
of "bad pmd" messages during boot, followed by a "BUG: unable to handle
kernel paging request".  This happens on no more than a couple percent of
boots, but we've seen it on AWS HVM, GCE, Oracle Cloud VMs, and local QEMU
instances.  It always happens immediately after "Loading compiled-in X.509
certificates".  I can't reproduce it on 4.14.10, nor, so far, on 4.14.11
with pti=off.  Here's a sample backtrace:

[    4.762964] Loading compiled-in X.509 certificates
[    4.765620] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee000(800000007d6000e3)
[    4.769099] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee008(800000007d8000e3)
[    4.772479] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee010(800000007da000e3)
[    4.775919] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee018(800000007dc000e3)
[    4.779251] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee020(800000007de000e3)
[    4.782558] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee028(800000007e0000e3)
[    4.794160] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee030(800000007e2000e3)
[    4.797525] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee038(800000007e4000e3)
[    4.800776] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee040(800000007e6000e3)
[    4.804100] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee048(800000007e8000e3)
[    4.807437] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee050(800000007ea000e3)
[    4.810729] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee058(800000007ec000e3)
[    4.813989] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee060(800000007ee000e3)
[    4.817294] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee068(800000007f0000e3)
[    4.820713] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee070(800000007f2000e3)
[    4.823943] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee078(800000007f4000e3)
[    4.827311] BUG: unable to handle kernel paging request at fffffe27c1fdfba0
[    4.830109] IP: free_page_and_swap_cache+0x6/0xa0
[    4.831999] PGD 7f7ef067 P4D 7f7ef067 PUD 0
[    4.833779] Oops: 0000 [#1] SMP PTI
[    4.835197] Modules linked in:
[    4.836450] CPU: 0 PID: 45 Comm: modprobe Not tainted 4.14.11-coreos #1
[    4.839009] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
[    4.841551] task: ffff8b39b5a71e40 task.stack: ffffb92580558000
[    4.844062] RIP: 0010:free_page_and_swap_cache+0x6/0xa0
[    4.846238] RSP: 0018:ffffb9258055bc98 EFLAGS: 00010297
[    4.848300] RAX: 0000000000000000 RBX: fffffe27c0001000 RCX: ffff8b39bf7ef4f8
[    4.851184] RDX: 000000000007f7ee RSI: fffffe27c1fdfb80 RDI: fffffe27c1fdfb80
[    4.854090] RBP: ffff8b39bf7ee000 R08: 0000000000000000 R09: 0000000000000162
[    4.856946] R10: ffffffffffffff90 R11: 0000000000000161 R12: fffffe27ffe00000
[    4.859777] R13: ffff8b39bf7ef000 R14: fffffe2800000000 R15: ffffb9258055bd60
[    4.862602] FS:  0000000000000000(0000) GS:ffff8b39bd200000(0000) knlGS:0000000000000000
[    4.865860] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.868175] CR2: fffffe27c1fdfba0 CR3: 000000002d00a001 CR4: 00000000001606f0
[    4.871162] Call Trace:
[    4.872188]  free_pgd_range+0x3a5/0x5b0
[    4.873781]  free_ldt_pgtables.part.2+0x60/0xa0
[    4.875679]  ? arch_tlb_finish_mmu+0x42/0x70
[    4.877476]  ? tlb_finish_mmu+0x1f/0x30
[    4.878999]  exit_mmap+0x5b/0x1a0
[    4.880327]  ? dput+0xb8/0x1e0
[    4.881575]  ? hrtimer_try_to_cancel+0x25/0x110
[    4.883388]  mmput+0x52/0x110
[    4.884620]  do_exit+0x330/0xb10
[    4.886044]  ? task_work_run+0x6b/0xa0
[    4.887544]  do_group_exit+0x3c/0xa0
[    4.889012]  SyS_exit_group+0x10/0x10
[    4.890473]  entry_SYSCALL_64_fastpath+0x1a/0x7d
[    4.892364] RIP: 0033:0x7f4a41d4ded9
[    4.893812] RSP: 002b:00007ffe25d85708 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[    4.896974] RAX: ffffffffffffffda RBX: 00005601b3c9e2e0 RCX: 00007f4a41d4ded9
[    4.899830] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
[    4.902647] RBP: 00005601b3c9d0e8 R08: 000000000000003c R09: 00000000000000e7
[    4.905743] R10: ffffffffffffff90 R11: 0000000000000246 R12: 00005601b3c9d090
[    4.908659] R13: 0000000000000004 R14: 0000000000000001 R15: 00007ffe25d85828
[    4.911495] Code: e0 01 48 83 f8 01 19 c0 25 01 fe ff ff 05 00 02 00 00 3e 29 43 1c 5b 5d 41 5c c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 <48> 8b 57 20 48 89 fb 48 8d 42 ff 83 e2 01 48 0f 44 c7 48 8b 48
[    4.919014] RIP: free_page_and_swap_cache+0x6/0xa0 RSP: ffffb9258055bc98
[    4.921801] CR2: fffffe27c1fdfba0
[    4.923232] ---[ end trace e79ccb938bf80a4e ]---
[    4.925166] Kernel panic - not syncing: Fatal exception
[    4.927390] Kernel Offset: 0x1c000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Traces were obtained via virtual serial port.  The backtrace varies a bit,
as does the comm.
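
As a reading aid for the "bad pmd" lines in the sample trace: each line prints the address of a PMD slot followed by the 64-bit entry stored in it.  The sketch below decodes such an entry using the architectural x86-64 paging bit layout.  The flag names and the decode_pmd helper are illustrative only; they are not kernel APIs, and the 2 MiB address mask assumes the entry is a large-page mapping.

```python
# Architectural x86-64 paging-entry bits (names chosen for illustration).
FLAGS = {
    0: "PRESENT",
    1: "RW",
    2: "USER",
    3: "PWT",
    4: "PCD",
    5: "ACCESSED",
    6: "DIRTY",
    7: "PSE",      # page-size bit: entry maps a 2 MiB page directly
    8: "GLOBAL",
    63: "NX",
}

# Physical-address bits 51:21 of a 2 MiB mapping (assumes PSE is set).
PHYS_MASK_2M = 0x000FFFFFFFE00000


def decode_pmd(value):
    """Split a raw PMD entry into (physical address, list of set flags)."""
    flags = [name for bit, name in sorted(FLAGS.items()) if (value >> bit) & 1]
    return value & PHYS_MASK_2M, flags


# First and second entries from the sample trace.
first_phys, first_flags = decode_pmd(0x800000007D6000E3)
second_phys, _ = decode_pmd(0x800000007D8000E3)

print(hex(first_phys), first_flags)
print(hex(second_phys - first_phys))
```

Decoded this way, the sixteen entries look like consecutive 2 MiB large-page mappings (PSE and NX set, USER clear) starting at physical 0x7d600000, rather than pointers to PTE pages.  That is only a reading of the log, not a diagnosis.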

The kernel config and a collection of backtraces are attached.  Our diff on
top of vanilla 4.14.11 (unchanged from 4.14.10, and containing nothing
especially relevant):

https://github.com/coreos/linux/compare/v4.14.11...coreos:v4.14.11-coreos

I'm happy to try test builds, etc.  For ease of reproduction if needed, an
affected OS image:

https://storage.googleapis.com/builds.developer.core-os.net/boards/amd64-usr/1632.0.0%2Bjenkins2-master%2Blocal-999/coreos_production_qemu_image.img.bz2

and a wrapper script to start it with QEMU:

https://storage.googleapis.com/builds.developer.core-os.net/boards/amd64-usr/1632.0.0%2Bjenkins2-master%2Blocal-999/coreos_production_qemu.sh

Get in with "ssh -p 2222 core@localhost".  Corresponding debug symbols:

https://storage.googleapis.com/builds.developer.core-os.net/boards/amd64-usr/1632.0.0%2Bjenkins2-master%2Blocal-999/pkgs/sys-kernel/coreos-kernel-4.14.11.tbz2
https://storage.googleapis.com/builds.developer.core-os.net/boards/amd64-usr/1632.0.0%2Bjenkins2-master%2Blocal-999/pkgs/sys-kernel/coreos-modules-4.14.11.tbz2

--Benjamin Gilbert

[-- Attachment #1.2: Type: text/html, Size: 7749 bytes --]

[-- Attachment #2: config-4.14.11.gz --]
[-- Type: application/x-gzip, Size: 29195 bytes --]

[-- Attachment #3: pmd-logs.tar.gz --]
[-- Type: application/x-gzip, Size: 20143 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03  8:36 "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs Benjamin Gilbert
@ 2018-01-03  8:46 ` Benjamin Gilbert
  2018-01-03  9:20     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 59+ messages in thread
From: Benjamin Gilbert @ 2018-01-03  8:46 UTC (permalink / raw)
  To: linux-mm; +Cc: stable, Greg Kroah-Hartman

[resending with less web]

--Benjamin Gilbert


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03  8:46 ` Benjamin Gilbert
@ 2018-01-03  9:20     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 59+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-03  9:20 UTC (permalink / raw)
  To: Benjamin Gilbert, x86; +Cc: linux-kernel, linux-mm, stable

On Wed, Jan 03, 2018 at 12:46:00AM -0800, Benjamin Gilbert wrote:
> [resending with less web]

(adding lkml and x86 developers)

> Hi all,
> 
> In our regression tests on kernel 4.14.11, we're occasionally seeing a run
> of "bad pmd" messages during boot, followed by a "BUG: unable to handle
> kernel paging request".  This happens on no more than a couple percent of
> boots, but we've seen it on AWS HVM, GCE, Oracle Cloud VMs, and local QEMU
> instances.  It always happens immediately after "Loading compiled-in X.509
> certificates".  I can't reproduce it on 4.14.10, nor, so far, on 4.14.11
> with pti=off.  Here's a sample backtrace:
> 
> [    4.762964] Loading compiled-in X.509 certificates
> [    4.765620] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee000(800000007d6000e3)
> [    4.769099] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee008(800000007d8000e3)
> [    4.772479] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee010(800000007da000e3)
> [    4.775919] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee018(800000007dc000e3)
> [    4.779251] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee020(800000007de000e3)
> [    4.782558] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee028(800000007e0000e3)
> [    4.794160] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee030(800000007e2000e3)
> [    4.797525] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee038(800000007e4000e3)
> [    4.800776] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee040(800000007e6000e3)
> [    4.804100] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee048(800000007e8000e3)
> [    4.807437] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee050(800000007ea000e3)
> [    4.810729] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee058(800000007ec000e3)
> [    4.813989] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee060(800000007ee000e3)
> [    4.817294] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee068(800000007f0000e3)
> [    4.820713] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee070(800000007f2000e3)
> [    4.823943] ../source/mm/pgtable-generic.c:40: bad pmd ffff8b39bf7ee078(800000007f4000e3)
> [    4.827311] BUG: unable to handle kernel paging request at fffffe27c1fdfba0
> [    4.830109] IP: free_page_and_swap_cache+0x6/0xa0
> [    4.831999] PGD 7f7ef067 P4D 7f7ef067 PUD 0
> [    4.833779] Oops: 0000 [#1] SMP PTI
> [    4.835197] Modules linked in:
> [    4.836450] CPU: 0 PID: 45 Comm: modprobe Not tainted 4.14.11-coreos #1
> [    4.839009] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
> [    4.841551] task: ffff8b39b5a71e40 task.stack: ffffb92580558000
> [    4.844062] RIP: 0010:free_page_and_swap_cache+0x6/0xa0
> [    4.846238] RSP: 0018:ffffb9258055bc98 EFLAGS: 00010297
> [    4.848300] RAX: 0000000000000000 RBX: fffffe27c0001000 RCX: ffff8b39bf7ef4f8
> [    4.851184] RDX: 000000000007f7ee RSI: fffffe27c1fdfb80 RDI: fffffe27c1fdfb80
> [    4.854090] RBP: ffff8b39bf7ee000 R08: 0000000000000000 R09: 0000000000000162
> [    4.856946] R10: ffffffffffffff90 R11: 0000000000000161 R12: fffffe27ffe00000
> [    4.859777] R13: ffff8b39bf7ef000 R14: fffffe2800000000 R15: ffffb9258055bd60
> [    4.862602] FS:  0000000000000000(0000) GS:ffff8b39bd200000(0000) knlGS:0000000000000000
> [    4.865860] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    4.868175] CR2: fffffe27c1fdfba0 CR3: 000000002d00a001 CR4: 00000000001606f0
> [    4.871162] Call Trace:
> [    4.872188]  free_pgd_range+0x3a5/0x5b0
> [    4.873781]  free_ldt_pgtables.part.2+0x60/0xa0
> [    4.875679]  ? arch_tlb_finish_mmu+0x42/0x70
> [    4.877476]  ? tlb_finish_mmu+0x1f/0x30
> [    4.878999]  exit_mmap+0x5b/0x1a0
> [    4.880327]  ? dput+0xb8/0x1e0
> [    4.881575]  ? hrtimer_try_to_cancel+0x25/0x110
> [    4.883388]  mmput+0x52/0x110
> [    4.884620]  do_exit+0x330/0xb10
> [    4.886044]  ? task_work_run+0x6b/0xa0
> [    4.887544]  do_group_exit+0x3c/0xa0
> [    4.889012]  SyS_exit_group+0x10/0x10
> [    4.890473]  entry_SYSCALL_64_fastpath+0x1a/0x7d
> [    4.892364] RIP: 0033:0x7f4a41d4ded9
> [    4.893812] RSP: 002b:00007ffe25d85708 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> [    4.896974] RAX: ffffffffffffffda RBX: 00005601b3c9e2e0 RCX: 00007f4a41d4ded9
> [    4.899830] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
> [    4.902647] RBP: 00005601b3c9d0e8 R08: 000000000000003c R09: 00000000000000e7
> [    4.905743] R10: ffffffffffffff90 R11: 0000000000000246 R12: 00005601b3c9d090
> [    4.908659] R13: 0000000000000004 R14: 0000000000000001 R15: 00007ffe25d85828
> [    4.911495] Code: e0 01 48 83 f8 01 19 c0 25 01 fe ff ff 05 00 02 00 00 3e 29 43 1c 5b 5d 41 5c c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 <48> 8b 57 20 48 89 fb 48 8d 42 ff 83 e2 01 48 0f 44 c7 48 8b 48
> [    4.919014] RIP: free_page_and_swap_cache+0x6/0xa0 RSP: ffffb9258055bc98
> [    4.921801] CR2: fffffe27c1fdfba0
> [    4.923232] ---[ end trace e79ccb938bf80a4e ]---
> [    4.925166] Kernel panic - not syncing: Fatal exception
> [    4.927390] Kernel Offset: 0x1c000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> 
> Traces were obtained via virtual serial port.  The backtrace varies a bit,
> as does the comm.
> 
> The kernel config and a collection of backtraces are attached.  Our diff on
> top of vanilla 4.14.11 (unchanged from 4.14.10, and containing nothing
> especially relevant):
> 
> https://github.com/coreos/linux/compare/v4.14.11...coreos:v4.14.11-coreos
> 
> I'm happy to try test builds, etc.  For ease of reproduction if needed, an
> affected OS image:
> 
> https://storage.googleapis.com/builds.developer.core-os.net/boards/amd64-usr/1632.0.0%2Bjenkins2-master%2Blocal-999/coreos_production_qemu_image.img.bz2
> 
> and a wrapper script to start it with QEMU:
> 
> https://storage.googleapis.com/builds.developer.core-os.net/boards/amd64-usr/1632.0.0%2Bjenkins2-master%2Blocal-999/coreos_production_qemu.sh
> 
> Get in with "ssh -p 2222 core@localhost".  Corresponding debug symbols:
> 
> https://storage.googleapis.com/builds.developer.core-os.net/boards/amd64-usr/1632.0.0%2Bjenkins2-master%2Blocal-999/pkgs/sys-kernel/coreos-kernel-4.14.11.tbz2
> https://storage.googleapis.com/builds.developer.core-os.net/boards/amd64-usr/1632.0.0%2Bjenkins2-master%2Blocal-999/pkgs/sys-kernel/coreos-modules-4.14.11.tbz2

Ick, not good, any chance you can test 4.15-rc6 to verify that the issue
is also there (or not)?  That might narrow down the issue to being a
backport or a "real" problem here.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03  9:20     ` Greg Kroah-Hartman
@ 2018-01-03 15:48       ` Ingo Molnar
  -1 siblings, 0 replies; 59+ messages in thread
From: Ingo Molnar @ 2018-01-03 15:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Benjamin Gilbert, x86, linux-kernel, linux-mm, stable


* Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> On Wed, Jan 03, 2018 at 12:46:00AM -0800, Benjamin Gilbert wrote:
> > [resending with less web]
> 
> (adding lkml and x86 developers)
> 
> > Hi all,
> > 
> > In our regression tests on kernel 4.14.11, we're occasionally seeing a run
> > of "bad pmd" messages during boot, followed by a "BUG: unable to handle
> > kernel paging request".  This happens on no more than a couple percent of
> > boots, but we've seen it on AWS HVM, GCE, Oracle Cloud VMs, and local QEMU
> > instances.  It always happens immediately after "Loading compiled-in X.509
> > certificates".  I can't reproduce it on 4.14.10, nor, so far, on 4.14.11
> > with pti=off.  Here's a sample backtrace:

A few other things to check:

first please test the latest WIP.x86/pti branch which has a couple of fixes.

In a -stable kernel tree you should be able to do:

  git pull --no-tags git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/pti

in particular this recent fix from a couple of hours ago might make a difference:

  52994c256df3: x86/pti: Make sure the user/kernel PTEs match

Note that this commit:

  694d99d40972: x86/cpu, x86/pti: Do not enable PTI on AMD processors

disables PTI on AMD CPUs, so if you'd like to test it more broadly on all CPUs,
you'll need to add "pti=on" to your boot command line.
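
When comparing test boots, it can help to record which pti= value each one was actually started with.  The following is a minimal sketch, assuming the option appears as a plain "pti=<value>" token and that the last occurrence wins; the pti_setting helper and the example command lines are hypothetical, and it does not handle other spellings such as "nopti".

```python
def pti_setting(cmdline):
    """Return the value of the last pti= token on a kernel command line.

    Returns None when no pti= token is present, in which case the
    kernel's built-in default applies.
    """
    setting = None
    for token in cmdline.split():
        if token.startswith("pti="):
            setting = token[len("pti="):]
    return setting


# Hypothetical command lines, for illustration only.
print(pti_setting("root=/dev/sda1 console=ttyS0 pti=on"))
print(pti_setting("root=/dev/sda1 console=ttyS0"))
```

On a running system the string to pass in would typically be the contents of /proc/cmdline.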

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03 15:48       ` Ingo Molnar
@ 2018-01-03 22:32         ` Benjamin Gilbert
  -1 siblings, 0 replies; 59+ messages in thread
From: Benjamin Gilbert @ 2018-01-03 22:32 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Greg Kroah-Hartman, x86, linux-kernel, linux-mm, stable

On Wed, Jan 03, 2018 at 04:48:33PM +0100, Ingo Molnar wrote:
> first please test the latest WIP.x86/pti branch which has a couple of fixes.

I'm still seeing the problem with that branch (3ffdeb1a02be, plus a couple
of local patches which shouldn't affect the resulting binary).

--Benjamin Gilbert

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03 22:32         ` Benjamin Gilbert
@ 2018-01-03 22:34           ` Thomas Gleixner
  -1 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2018-01-03 22:34 UTC (permalink / raw)
  To: Benjamin Gilbert
  Cc: Ingo Molnar, Greg Kroah-Hartman, x86, linux-kernel, linux-mm, stable

On Wed, 3 Jan 2018, Benjamin Gilbert wrote:

> On Wed, Jan 03, 2018 at 04:48:33PM +0100, Ingo Molnar wrote:
> > first please test the latest WIP.x86/pti branch which has a couple of fixes.
> 
> I'm still seeing the problem with that branch (3ffdeb1a02be, plus a couple
> of local patches which shouldn't affect the resulting binary).

Can you please send me your .config and a full dmesg ?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03 22:34           ` Thomas Gleixner
  (?)
@ 2018-01-03 22:49           ` Benjamin Gilbert
  2018-01-03 22:57               ` Thomas Gleixner
  -1 siblings, 1 reply; 59+ messages in thread
From: Benjamin Gilbert @ 2018-01-03 22:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Greg Kroah-Hartman, x86, linux-kernel, linux-mm, stable

[-- Attachment #1: Type: text/plain, Size: 237 bytes --]

On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
> Can you please send me your .config and a full dmesg ?

I've attached a serial log from a local QEMU.  I can rerun with a higher
loglevel if need be.

--Benjamin Gilbert

[-- Attachment #2: config-4.14.11.gz --]
[-- Type: application/gzip, Size: 29195 bytes --]

[-- Attachment #3: console.txt.gz --]
[-- Type: application/gzip, Size: 7113 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03 22:49           ` Benjamin Gilbert
@ 2018-01-03 22:57               ` Thomas Gleixner
  0 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2018-01-03 22:57 UTC (permalink / raw)
  To: Benjamin Gilbert
  Cc: Ingo Molnar, Greg Kroah-Hartman, x86, LKML, linux-mm, stable,
	Andy Lutomirski, Dave Hansen, Peter Zijlstra

On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
> On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
> > Can you please send me your .config and a full dmesg ?
> 
> I've attached a serial log from a local QEMU.  I can rerun with a higher
> loglevel if need be.

Thanks!

Cc'ing Andy who might have an idea and he's probably more away than I
am. Will have a look tomorrow if Andy does not beat me to it.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03 22:57               ` Thomas Gleixner
@ 2018-01-03 22:58                 ` Thomas Gleixner
  -1 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2018-01-03 22:58 UTC (permalink / raw)
  To: Benjamin Gilbert
  Cc: Ingo Molnar, Greg Kroah-Hartman, x86, LKML, linux-mm, stable,
	Andy Lutomirski, Dave Hansen, Peter Zijlstra



On Wed, 3 Jan 2018, Thomas Gleixner wrote:

> On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
> > On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
> > > Can you please send me your .config and a full dmesg ?
> > 
> > I've attached a serial log from a local QEMU.  I can rerun with a higher
> > loglevel if need be.
> 
> Thanks!
> 
> Cc'ing Andy who might have an idea and he's probably more away than I

s/away/awake/ just to demonstrate the state I'm in ...

> am. Will have a look tomorrow if Andy does not beat me to it.
> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03 22:58                 ` Thomas Gleixner
@ 2018-01-03 23:44                   ` Andy Lutomirski
  -1 siblings, 0 replies; 59+ messages in thread
From: Andy Lutomirski @ 2018-01-03 23:44 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Benjamin Gilbert, Ingo Molnar, Greg Kroah-Hartman, x86, LKML,
	linux-mm, stable, Andy Lutomirski, Dave Hansen, Peter Zijlstra



> On Jan 3, 2018, at 2:58 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> 
> 
>> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
>> 
>>> On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
>>>> On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
>>>> Can you please send me your .config and a full dmesg ?
>>> 
>>> I've attached a serial log from a local QEMU.  I can rerun with a higher
>>> loglevel if need be.
>> 
>> Thanks!
>> 
>> Cc'ing Andy who might have an idea and he's probably more away than I
> 
> s/away/awake/ just to demonstrate the state I'm in ...
> 
>> am. Will have a look tomorrow if Andy does not beat me to it.

Can you forward me more of the thread?

>> 
>> Thanks,
>> 
>>    tglx
>> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03 23:44                   ` Andy Lutomirski
@ 2018-01-03 23:46                     ` Thomas Gleixner
  -1 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2018-01-03 23:46 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Benjamin Gilbert, Ingo Molnar, Greg Kroah-Hartman, x86, LKML,
	linux-mm, stable, Andy Lutomirski, Dave Hansen, Peter Zijlstra

On Wed, 3 Jan 2018, Andy Lutomirski wrote:
> > On Jan 3, 2018, at 2:58 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> >> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
> >> 
> >>> On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
> >>>> On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
> >>>> Can you please send me your .config and a full dmesg ?
> >>> 
> >>> I've attached a serial log from a local QEMU.  I can rerun with a higher
> >>> loglevel if need be.
> >> 
> >> Thanks!
> >> 
> >> Cc'ing Andy who might have an idea and he's probably more away than I
> > 
> > s/away/awake/ just to demonstrate the state I'm in ...
> > 
> >> am. Will have a look tomorrow if Andy does not beat me to it.
> 
> Can you forward me more of the thread?

On the way.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03 22:58                 ` Thomas Gleixner
@ 2018-01-04  0:27                   ` Andy Lutomirski
  -1 siblings, 0 replies; 59+ messages in thread
From: Andy Lutomirski @ 2018-01-04  0:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Benjamin Gilbert, Ingo Molnar, Greg Kroah-Hartman, x86, LKML,
	linux-mm, stable, Andy Lutomirski, Dave Hansen, Peter Zijlstra



> On Jan 3, 2018, at 2:58 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> 
> 
>> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
>> 
>>> On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
>>>> On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
>>>> Can you please send me your .config and a full dmesg ?
>>> 
>>> I've attached a serial log from a local QEMU.  I can rerun with a higher
>>> loglevel if need be.
>> 
>> Thanks!
>> 
>> Cc'ing Andy who might have an idea and he's probably more away than I
> 
> s/away/awake/ just to demonstrate the state I'm in ...
> 
>> am. Will have a look tomorrow if Andy does not beat me to it.

How much memory does the affected system have?  It sounds like something
is mapped in the LDT region and is getting corrupted because the LDT code
expects to own that region.

I got almost exactly this failure in an earlier version of the code when
I typoed the LDT base address macro.

I'll try to reproduce.

>> 
>> Thanks,
>> 
>>    tglx
>> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-03  9:20     ` Greg Kroah-Hartman
@ 2018-01-04  0:33       ` Benjamin Gilbert
  -1 siblings, 0 replies; 59+ messages in thread
From: Benjamin Gilbert @ 2018-01-04  0:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: x86, linux-kernel, linux-mm, stable, Thomas Gleixner,
	Ingo Molnar, Andy Lutomirski, Dave Hansen, Peter Zijlstra

On Wed, Jan 03, 2018 at 10:20:16AM +0100, Greg Kroah-Hartman wrote:
> Ick, not good, any chance you can test 4.15-rc6 to verify that the issue
> is also there (or not)?

I haven't been able to reproduce this on 4.15-rc6.

--Benjamin Gilbert

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  0:33       ` Benjamin Gilbert
@ 2018-01-04  0:37         ` Thomas Gleixner
  -1 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2018-01-04  0:37 UTC (permalink / raw)
  To: Benjamin Gilbert
  Cc: Greg Kroah-Hartman, x86, linux-kernel, linux-mm, stable,
	Ingo Molnar, Andy Lutomirski, Dave Hansen, Peter Zijlstra

On Wed, 3 Jan 2018, Benjamin Gilbert wrote:

> On Wed, Jan 03, 2018 at 10:20:16AM +0100, Greg Kroah-Hartman wrote:
> > Ick, not good, any chance you can test 4.15-rc6 to verify that the issue
> > is also there (or not)?
> 
> I haven't been able to reproduce this on 4.15-rc6.

Hmm. So we need to scrutinize the subtle differences between 4.15-rc6 and 4.14.11....

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  0:33       ` Benjamin Gilbert
@ 2018-01-04  0:37         ` Andy Lutomirski
  -1 siblings, 0 replies; 59+ messages in thread
From: Andy Lutomirski @ 2018-01-04  0:37 UTC (permalink / raw)
  To: Benjamin Gilbert
  Cc: Greg Kroah-Hartman, x86, linux-kernel, linux-mm, stable,
	Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Dave Hansen,
	Peter Zijlstra



> On Jan 3, 2018, at 4:33 PM, Benjamin Gilbert <benjamin.gilbert@coreos.com> wrote:
> 
>> On Wed, Jan 03, 2018 at 10:20:16AM +0100, Greg Kroah-Hartman wrote:
>> Ick, not good, any chance you can test 4.15-rc6 to verify that the issue
>> is also there (or not)?
> 
> I haven't been able to reproduce this on 4.15-rc6.

Ah.  Maybe try rebuilding a bad kernel with free_ldt_pgtables() modified
to do nothing, and then read /sys/kernel/debug/page_tables/current (or
current_kernel, or whatever it's called).  The problem may be obvious.

> 
> --Benjamin Gilbert

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  0:27                   ` Andy Lutomirski
@ 2018-01-04  0:38                     ` Benjamin Gilbert
  -1 siblings, 0 replies; 59+ messages in thread
From: Benjamin Gilbert @ 2018-01-04  0:38 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, Ingo Molnar, Greg Kroah-Hartman, x86, LKML,
	linux-mm, stable, Andy Lutomirski, Dave Hansen, Peter Zijlstra

On Wed, Jan 03, 2018 at 04:27:04PM -0800, Andy Lutomirski wrote:
> How much memory does the affected system have?  It sounds like something
> is mapped in the LDT region and is getting corrupted because the LDT code
> expects to own that region.

We've seen this on systems from 1 to 7 GB.

--Benjamin Gilbert

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  0:33       ` Benjamin Gilbert
@ 2018-01-04  1:37         ` Benjamin Gilbert
  -1 siblings, 0 replies; 59+ messages in thread
From: Benjamin Gilbert @ 2018-01-04  1:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: x86, linux-kernel, linux-mm, stable, Thomas Gleixner,
	Ingo Molnar, Andy Lutomirski, Dave Hansen, Peter Zijlstra

On Wed, Jan 03, 2018 at 04:33:03PM -0800, Benjamin Gilbert wrote:
> I haven't been able to reproduce this on 4.15-rc6.

This is bad data.  I was caught by the fact that 4.14.11 has
PAGE_TABLE_ISOLATION default y but 4.15-rc6 doesn't.  Retesting.

--Benjamin Gilbert

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  0:37         ` Andy Lutomirski
  (?)
@ 2018-01-04  4:35         ` Benjamin Gilbert
  2018-01-04  4:45             ` Andy Lutomirski
  -1 siblings, 1 reply; 59+ messages in thread
From: Benjamin Gilbert @ 2018-01-04  4:35 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Greg Kroah-Hartman, x86, LKML, linux-mm, stable, Thomas Gleixner,
	Ingo Molnar, Andy Lutomirski, Dave Hansen, Peter Zijlstra

[-- Attachment #1: Type: text/plain, Size: 393 bytes --]

On Wed, Jan 03, 2018 at 04:37:53PM -0800, Andy Lutomirski wrote:
> Maybe try rebuilding a bad kernel with free_ldt_pgtables() modified
> to do nothing, and then read /sys/kernel/debug/page_tables/current (or
> current_kernel, or whatever it's called).  The problem may be obvious.

current_kernel attached.  I have not seen any crashes with
free_ldt_pgtables() stubbed out.

--Benjamin Gilbert

[-- Attachment #2: current_kernel.gz --]
[-- Type: application/gzip, Size: 13830 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  1:37         ` Benjamin Gilbert
@ 2018-01-04  4:36           ` Benjamin Gilbert
  -1 siblings, 0 replies; 59+ messages in thread
From: Benjamin Gilbert @ 2018-01-04  4:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: x86, linux-kernel, linux-mm, stable, Thomas Gleixner,
	Ingo Molnar, Andy Lutomirski, Dave Hansen, Peter Zijlstra

On Wed, Jan 03, 2018 at 05:37:42PM -0800, Benjamin Gilbert wrote:
> I was caught by the fact that 4.14.11 has PAGE_TABLE_ISOLATION default y
> but 4.15-rc6 doesn't.  Retesting.

It turns out that 4.15-rc6 has the same problem as 4.14.11.

--Benjamin Gilbert

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  4:35         ` Benjamin Gilbert
@ 2018-01-04  4:45             ` Andy Lutomirski
  0 siblings, 0 replies; 59+ messages in thread
From: Andy Lutomirski @ 2018-01-04  4:45 UTC (permalink / raw)
  To: Benjamin Gilbert
  Cc: Greg Kroah-Hartman, X86 ML, LKML, linux-mm, stable,
	Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Dave Hansen,
	Peter Zijlstra

On Wed, Jan 3, 2018 at 8:35 PM, Benjamin Gilbert
<benjamin.gilbert@coreos.com> wrote:
> On Wed, Jan 03, 2018 at 04:37:53PM -0800, Andy Lutomirski wrote:
>> Maybe try rebuilding a bad kernel with free_ldt_pgtables() modified
>> to do nothing, and then read /sys/kernel/debug/page_tables/current (or
>> current_kernel, or whatever it's called).  The problem may be obvious.
>
> current_kernel attached.  I have not seen any crashes with
> free_ldt_pgtables() stubbed out.

I haven't reproduced it, but I think I see what's wrong.  KASLR sets
vaddr_end to a totally bogus value.  It should be no larger than
LDT_BASE_ADDR.  I suspect that your vmemmap is getting randomized into
the LDT range.  If it weren't for that, it could just as easily land
in the cpu_entry_area range.  This will need fixing in all versions
that aren't still called KAISER.

Our memory map code is utter shite.  This kind of bug should not be
possible without a giant warning at boot that something is screwed up.
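
A minimal userspace sketch of the invariant described above: every
randomized region has to end at or below the upper bound handed to the
randomizer, and that bound must not reach into the LDT range.  All names
and addresses here (EX_LDT_BASE_ADDR, EX_BOGUS_END, struct region,
count_overflows) are hypothetical and picked for illustration only; they
are not copied from any kernel release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative constants only (assumptions, not real kernel values). */
#define EX_LDT_BASE_ADDR 0xfffffe8000000000ULL /* start of the reserved LDT range */
#define EX_BOGUS_END     0xffffff0000000000ULL /* an upper bound that is too high */

/* One randomized mapping, e.g. the direct map, vmalloc or vmemmap. */
struct region {
	const char *name;
	uint64_t start;
	uint64_t size;
};

/* True if [start, start + size) lies entirely below 'limit'.
 * Written without computing start + size, so it cannot wrap around. */
static bool fits_below(const struct region *r, uint64_t limit)
{
	return r->start <= limit && r->size <= limit - r->start;
}

/* Count the regions that cross 'vaddr_end' and warn about each one.
 * This is the kind of loud boot-time check the message asks for. */
static int count_overflows(const struct region *regions, int n,
			   uint64_t vaddr_end)
{
	int bad = 0;

	assert(regions != NULL && n >= 0);
	for (int i = 0; i < n; i++) {
		if (!fits_below(&regions[i], vaddr_end)) {
			fprintf(stderr,
				"WARNING: %s at %#llx (size %#llx) crosses %#llx\n",
				regions[i].name,
				(unsigned long long)regions[i].start,
				(unsigned long long)regions[i].size,
				(unsigned long long)vaddr_end);
			bad++;
		}
	}
	return bad;
}
```

With a 1 TB region placed at 0xfffffe0000000000, checking against
EX_BOGUS_END reports nothing, while checking against EX_LDT_BASE_ADDR
flags the region.  That is the failure mode suspected here: a bound that
is too generous lets a mapping land in a range the LDT code believes it
owns.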

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  0:37         ` Thomas Gleixner
@ 2018-01-04  7:14           ` Ingo Molnar
  -1 siblings, 0 replies; 59+ messages in thread
From: Ingo Molnar @ 2018-01-04  7:14 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Benjamin Gilbert, Greg Kroah-Hartman, x86, linux-kernel,
	linux-mm, stable, Andy Lutomirski, Dave Hansen, Peter Zijlstra


* Thomas Gleixner <tglx@linutronix.de> wrote:

> On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
> 
> > On Wed, Jan 03, 2018 at 10:20:16AM +0100, Greg Kroah-Hartman wrote:
> > > Ick, not good, any chance you can test 4.15-rc6 to verify that the issue
> > > is also there (or not)?
> > 
> > I haven't been able to reproduce this on 4.15-rc6.
> 
> Hmm. So we need to scrutinize the subtle differences between 4.15-rc6 and 4.14.11....

So here's a list of candidate 'missing commits':

triton:~/tip> git log --oneline --no-merges WIP.x86/pti..linus arch/x86 | grep -viE 'apic|irq|vector|probe|kvm|timer|rdt|crypto|platform|tsc|insn|xen|mpx|umip|efi|build|parav|SEV|kmemch|power|stacktrace|unwind|kmmio|dma|boot|PCI|resource|init|virt|kexec|unused|perf|5-level'
10a7e9d84915: Do not hash userspace addresses in fault handlers
f5b5fab1780c: x86/decoder: Fix and update the opcodes map
88edb57d1e0b: x86/vdso: Change time() prototype to match __vdso_time()
d553d03f7057: x86: Fix Sparse warnings about non-static functions
f4e9b7af0cd5: x86/microcode/AMD: Add support for fam17h microcode loading
e3811a3f74bd: x86/cpufeatures: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD
328b4ed93b69: x86: don't hash faulting address in oops printout
b562c171cf01: locking/refcounts: Do not force refcount_t usage as GPL-only export
1501899a898d: mm: fix device-dax pud write-faults triggered by get_user_pages()
55d2d0ad2fb4: x86/idt: Load idt early in start_secondary
9d0b62328d34: x86/tlb: Disable interrupts when changing CR4
0c3292ca8025: x86/tlb: Refactor CR4 setting and shadow write
12a78d43de76: x86/decoder: Add new TEST instruction pattern
30bb9811856f: x86/topology: Avoid wasting 128k for package id array
252714155f04: x86/acpi: Handle SCI interrupts above legacy space gracefully
be62a3204406: x86/mm: Limit mmap() of /dev/mem to valid physical addresses
1e0f25dbf246: x86/mm: Prevent non-MAP_FIXED mapping across DEFAULT_MAP_WINDOW border
fcdaf842bd8f: mm, sparse: do not swamp log with huge vmemmap allocation failures
353b1e7b5859: x86/mm: set fields in deferred pages
7d5905dc14a8: x86 / CPU: Always show current CPU frequency in /proc/cpuinfo
4d2dc2cc766c: fcntl: don't cap l_start and l_end values for F_GETLK64 in compat syscall
b29c6ef7bb12: x86 / CPU: Avoid unnecessary IPIs in arch_freq_get_on_cpu()
450cbdd0125c: locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE
9f08890ab906: x86/pvclock: add setter for pvclock_pvti_cpu0_va
c5e260890d5f: x86/mm: Remove unnecessary TLB flush for SME in-place encryption
4a75aeacda3c: ACPI / APEI: Remove arch_apei_flush_tlb_one()
e4dca7b7aa08: treewide: Fix function prototypes for module_param_call()
7ed4325a44ea: Drivers: hv: vmbus: Make panic reporting to be more useful
6aa7de059173: locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
506458efaf15: locking/barriers: Convert users of lockless_dereference() to READ_ONCE()
0cfe5b5fc027: x86: Use ARRAY_SIZE
c1bd743e54cd: arch/x86: remove redundant null checks before kmem_cache_destroy
a4c1887d4c14: locking/arch: Remove dummy arch_{read,spin,write}_lock_flags() implementations
0160fb177d48: locking/arch: Remove dummy arch_{read,spin,write}_relax() implementations
19c60923010b: locking/arch, x86: Add __down_read_killable()
39208aa7ecb7: locking/refcounts, x86/asm: Enable CONFIG_ARCH_HAS_REFCOUNT
564c9cc84e2a: locking/refcounts, x86/asm: Use unique .text section for refcount exceptions
30c23f29d2d5: locking/x86: Use named operands in rwsem.h

Note the exclusion regex pattern which might be overly aggressive.

Taking out the commits that should have no real effect leads to this list:

f4e9b7af0cd5: x86/microcode/AMD: Add support for fam17h microcode loading
e3811a3f74bd: x86/cpufeatures: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD
1501899a898d: mm: fix device-dax pud write-faults triggered by get_user_pages()
55d2d0ad2fb4: x86/idt: Load idt early in start_secondary
9d0b62328d34: x86/tlb: Disable interrupts when changing CR4
0c3292ca8025: x86/tlb: Refactor CR4 setting and shadow write
252714155f04: x86/acpi: Handle SCI interrupts above legacy space gracefully
be62a3204406: x86/mm: Limit mmap() of /dev/mem to valid physical addresses
1e0f25dbf246: x86/mm: Prevent non-MAP_FIXED mapping across DEFAULT_MAP_WINDOW border
fcdaf842bd8f: mm, sparse: do not swamp log with huge vmemmap allocation failures
353b1e7b5859: x86/mm: set fields in deferred pages
7d5905dc14a8: x86 / CPU: Always show current CPU frequency in /proc/cpuinfo
4d2dc2cc766c: fcntl: don't cap l_start and l_end values for F_GETLK64 in compat syscall
b29c6ef7bb12: x86 / CPU: Avoid unnecessary IPIs in arch_freq_get_on_cpu()
450cbdd0125c: locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE
6aa7de059173: locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
506458efaf15: locking/barriers: Convert users of lockless_dereference() to READ_ONCE()
a4c1887d4c14: locking/arch: Remove dummy arch_{read,spin,write}_lock_flags() implementations
0160fb177d48: locking/arch: Remove dummy arch_{read,spin,write}_relax() implementations
19c60923010b: locking/arch, x86: Add __down_read_killable()
39208aa7ecb7: locking/refcounts, x86/asm: Enable CONFIG_ARCH_HAS_REFCOUNT
564c9cc84e2a: locking/refcounts, x86/asm: Use unique .text section for refcount exceptions
30c23f29d2d5: locking/x86: Use named operands in rwsem.h

And taking out the locking commits which should have no effect on x86 ordering 
gives this (possibly overly aggressively trimmed) list:

f4e9b7af0cd5: x86/microcode/AMD: Add support for fam17h microcode loading
e3811a3f74bd: x86/cpufeatures: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD
1501899a898d: mm: fix device-dax pud write-faults triggered by get_user_pages()
55d2d0ad2fb4: x86/idt: Load idt early in start_secondary
9d0b62328d34: x86/tlb: Disable interrupts when changing CR4
0c3292ca8025: x86/tlb: Refactor CR4 setting and shadow write
252714155f04: x86/acpi: Handle SCI interrupts above legacy space gracefully
be62a3204406: x86/mm: Limit mmap() of /dev/mem to valid physical addresses
1e0f25dbf246: x86/mm: Prevent non-MAP_FIXED mapping across DEFAULT_MAP_WINDOW border
fcdaf842bd8f: mm, sparse: do not swamp log with huge vmemmap allocation failures
353b1e7b5859: x86/mm: set fields in deferred pages
7d5905dc14a8: x86 / CPU: Always show current CPU frequency in /proc/cpuinfo
4d2dc2cc766c: fcntl: don't cap l_start and l_end values for F_GETLK64 in compat syscall
b29c6ef7bb12: x86 / CPU: Avoid unnecessary IPIs in arch_freq_get_on_cpu()
450cbdd0125c: locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE

I think microcode and DAX changes are probably innocent, the IDT loading should 
only affect SMP bootstrap, and the ACPI irq, deferred-pages, cpufreq-info and 
sparsemem fixes are probably unrelated as well. This leaves:

9d0b62328d34: x86/tlb: Disable interrupts when changing CR4
0c3292ca8025: x86/tlb: Refactor CR4 setting and shadow write
be62a3204406: x86/mm: Limit mmap() of /dev/mem to valid physical addresses
1e0f25dbf246: x86/mm: Prevent non-MAP_FIXED mapping across DEFAULT_MAP_WINDOW border
4d2dc2cc766c: fcntl: don't cap l_start and l_end values for F_GETLK64 in compat syscall
450cbdd0125c: locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE

These will cherry-pick cleanly, so it would be nice to test them on top of the 
-stable kernel that fails:

  for N in 450cbdd0125c 4d2dc2cc766c 1e0f25dbf246 be62a3204406 0c3292ca8025 9d0b62328d34; do git cherry-pick $N; done

if this brute-force approach resolves the problem then we have a shorter list of 
fixes to look at.

If it doesn't fix the problem then the problem is either:

 - fixed by one of the other commits
 - or is fixed by one of the non-x86 upstream commits (of which there are over 10,000)
 - or the problem is non-deterministic,
 - or the problem is build layout dependent,
 - (or it's something I missed to consider)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  7:14           ` Ingo Molnar
@ 2018-01-04  7:18             ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 59+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-04  7:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Benjamin Gilbert, x86, linux-kernel, linux-mm,
	stable, Andy Lutomirski, Dave Hansen, Peter Zijlstra

On Thu, Jan 04, 2018 at 08:14:21AM +0100, Ingo Molnar wrote:
>  - (or it's something I missed to consider)

It was an operator error, the issue is also on 4.15-rc6, see another
email in this thread :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  7:18             ` Greg Kroah-Hartman
@ 2018-01-04  7:20               ` Ingo Molnar
  -1 siblings, 0 replies; 59+ messages in thread
From: Ingo Molnar @ 2018-01-04  7:20 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Thomas Gleixner, Benjamin Gilbert, x86, linux-kernel, linux-mm,
	stable, Andy Lutomirski, Dave Hansen, Peter Zijlstra


* Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> On Thu, Jan 04, 2018 at 08:14:21AM +0100, Ingo Molnar wrote:
> >  - (or it's something I missed to consider)
> 
> It was an operator error, the issue is also on 4.15-rc6, see another
> email in this thread :)

ah, ok :-)

Nevertheless it made sense to go through all the backport candidate commits again, 
nothing stuck out as a must-have for -stable! ;-)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  7:14           ` Ingo Molnar
@ 2018-01-04  7:22             ` Ingo Molnar
  -1 siblings, 0 replies; 59+ messages in thread
From: Ingo Molnar @ 2018-01-04  7:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Benjamin Gilbert, Greg Kroah-Hartman, x86, linux-kernel,
	linux-mm, stable, Andy Lutomirski, Dave Hansen, Peter Zijlstra


* Ingo Molnar <mingo@kernel.org> wrote:

> These will cherry-pick cleanly, so it would be nice to test them on top of the 
> -stable kernel that fails:
> 
>   for N in 450cbdd0125c 4d2dc2cc766c 1e0f25dbf246 be62a3204406 0c3292ca8025 9d0b62328d34; do git cherry-pick $N; done
> 
> if this brute-force approach resolves the problem then we have a shorter list of 
> fixes to look at.

As per Greg's followup this should not matter - but nevertheless for completeness 
these commits also need f54bb2ec02c83 as a dependency, so the full list is:

   for N in 450cbdd0125c 4d2dc2cc766c 1e0f25dbf246 be62a3204406 0c3292ca8025 9d0b62328d34 f54bb2ec02c83; do git cherry-pick $N; done

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  7:20               ` Ingo Molnar
@ 2018-01-04  8:03                 ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 59+ messages in thread
From: Greg Kroah-Hartman @ 2018-01-04  8:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Benjamin Gilbert, x86, linux-kernel, linux-mm,
	stable, Andy Lutomirski, Dave Hansen, Peter Zijlstra

On Thu, Jan 04, 2018 at 08:20:31AM +0100, Ingo Molnar wrote:
> 
> * Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> 
> > On Thu, Jan 04, 2018 at 08:14:21AM +0100, Ingo Molnar wrote:
> > >  - (or it's something I missed to consider)
> > 
> > It was an operator error, the issue is also on 4.15-rc6, see another
> > email in this thread :)
> 
> ah, ok :-)
> 
> Nevertheless it made sense to go through all the backport candidate commits again, 
> nothing stuck out as a must-have for -stable! ;-)

Yes, thanks for doing that, much appreciated, there's been too many
patches flying around and I am always worried I have missed something.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04  4:45             ` Andy Lutomirski
@ 2018-01-04 12:28               ` Thomas Gleixner
  -1 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2018-01-04 12:28 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Benjamin Gilbert, Greg Kroah-Hartman, X86 ML, LKML, linux-mm,
	stable, Ingo Molnar, Dave Hansen, Peter Zijlstra, Thomas Garnier,
	Alexander Kuleshov

On Wed, 3 Jan 2018, Andy Lutomirski wrote:
> On Wed, Jan 3, 2018 at 8:35 PM, Benjamin Gilbert
> <benjamin.gilbert@coreos.com> wrote:
> > On Wed, Jan 03, 2018 at 04:37:53PM -0800, Andy Lutomirski wrote:
> >> Maybe try rebuilding a bad kernel with free_ldt_pgtables() modified
> >> to do nothing, and then read /sys/kernel/debug/page_tables/current (or
> >> current_kernel, or whatever it's called).  The problem may be obvious.
> >
> > current_kernel attached.  I have not seen any crashes with
> > free_ldt_pgtables() stubbed out.
> 
> I haven't reproduced it, but I think I see what's wrong.  KASLR sets
> vaddr_end to a totally bogus value.  It should be no larger than
> LDT_BASE_ADDR.  I suspect that your vmemmap is getting randomized into
> the LDT range.  If it weren't for that, it could just as easily land
> in the cpu_entry_area range.  This will need fixing in all versions
> that aren't still called KAISER.
> 
> Our memory map code is utter shite.  This kind of bug should not be
> possible without a giant warning at boot that something is screwed up.

You're right, it's utter shite, and the KASLR folks who added this insanity
of making vaddr_end depend on a gazillion config options, without
documenting it in mm.txt or anywhere else obvious, should really sit back
and think hard about their half-baked 'security' features.

Just look at the insanity of the comment above the vaddr_end ifdef maze.

Benjamin, can you test the patch below please?

Thanks,

	tglx

8<--------------
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -12,8 +12,9 @@ ffffea0000000000 - ffffeaffffffffff (=40
 ... unused hole ...
 ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
 ... unused hole ...
-fffffe0000000000 - fffffe7fffffffff (=39 bits) LDT remap for PTI
-fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
+				    vaddr_end for KASLR 
+fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
+fffffe8000000000 - fffffeffffffffff (=39 bits) LDT remap for PTI
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
 ... unused hole ...
 ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
@@ -37,7 +38,9 @@ ffd4000000000000 - ffd5ffffffffffff (=49
 ... unused hole ...
 ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
 ... unused hole ...
-fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
+				    vaddr_end for KASLR 
+fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
+... unused hole ...
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
 ... unused hole ...
 ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -88,7 +88,7 @@ typedef struct { pteval_t pte; } pte_t;
 # define VMALLOC_SIZE_TB	_AC(32, UL)
 # define __VMALLOC_BASE		_AC(0xffffc90000000000, UL)
 # define __VMEMMAP_BASE		_AC(0xffffea0000000000, UL)
-# define LDT_PGD_ENTRY		_AC(-4, UL)
+# define LDT_PGD_ENTRY		_AC(-3, UL)
 # define LDT_BASE_ADDR		(LDT_PGD_ENTRY << PGDIR_SHIFT)
 #endif
 
@@ -110,7 +110,7 @@ typedef struct { pteval_t pte; } pte_t;
 #define ESPFIX_PGD_ENTRY	_AC(-2, UL)
 #define ESPFIX_BASE_ADDR	(ESPFIX_PGD_ENTRY << P4D_SHIFT)
 
-#define CPU_ENTRY_AREA_PGD	_AC(-3, UL)
+#define CPU_ENTRY_AREA_PGD	_AC(-4, UL)
 #define CPU_ENTRY_AREA_BASE	(CPU_ENTRY_AREA_PGD << P4D_SHIFT)
 
 #define EFI_VA_START		( -4 * (_AC(1, UL) << 30))
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -34,25 +34,14 @@
 #define TB_SHIFT 40
 
 /*
- * Virtual address start and end range for randomization. The end changes base
- * on configuration to have the highest amount of space for randomization.
- * It increases the possible random position for each randomized region.
+ * Virtual address start and end range for randomization.
  *
- * You need to add an if/def entry if you introduce a new memory region
- * compatible with KASLR. Your entry must be in logical order with memory
- * layout. For example, ESPFIX is before EFI because its virtual address is
- * before. You also need to add a BUILD_BUG_ON() in kernel_randomize_memory() to
- * ensure that this order is correct and won't be changed.
+ * The end address could depend on more configuration options to make the
+ * highest amount of space for randomization available, but that's too hard
+ * to keep straight.
  */
 static const unsigned long vaddr_start = __PAGE_OFFSET_BASE;
-
-#if defined(CONFIG_X86_ESPFIX64)
-static const unsigned long vaddr_end = ESPFIX_BASE_ADDR;
-#elif defined(CONFIG_EFI)
-static const unsigned long vaddr_end = EFI_VA_END;
-#else
-static const unsigned long vaddr_end = __START_KERNEL_map;
-#endif
+static const unsigned long vaddr_end = CPU_ENTRY_AREA_BASE;
 
 /* Default values */
 unsigned long page_offset_base = __PAGE_OFFSET_BASE;
@@ -101,15 +90,11 @@ void __init kernel_randomize_memory(void
 	unsigned long remain_entropy;
 
 	/*
-	 * All these BUILD_BUG_ON checks ensures the memory layout is
-	 * consistent with the vaddr_start/vaddr_end variables.
+	 * These BUILD_BUG_ON checks ensure the memory layout is consistent
+	 * with the vaddr_start/vaddr_end variables. These checks are
+	 * limited....
 	 */
 	BUILD_BUG_ON(vaddr_start >= vaddr_end);
-	BUILD_BUG_ON(IS_ENABLED(CONFIG_X86_ESPFIX64) &&
-		     vaddr_end >= EFI_VA_END);
-	BUILD_BUG_ON((IS_ENABLED(CONFIG_X86_ESPFIX64) ||
-		      IS_ENABLED(CONFIG_EFI)) &&
-		     vaddr_end >= __START_KERNEL_map);
 	BUILD_BUG_ON(vaddr_end > __START_KERNEL_map);
 
 	if (!kaslr_memory_enabled())
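
As a rough cross-check of the layout this patch produces, the sketch below
recomputes the three fixed bases from the new slot numbers and verifies at
compile time that none of them lies below the end of the randomization
window. It is an illustration only, assuming 4-level paging (PGDIR_SHIFT
of 39, with P4D folded into PGD); SLOT_BASE(), VADDR_END and
region_is_randomizable() are hypothetical names for this example, and the
extra _Static_assert() lines are not part of the patch, which keeps only
the two remaining BUILD_BUG_ON() checks.

```c
#include <assert.h>
#include <stdint.h>

#define PGDIR_SHIFT 39	/* assumption: 4-level paging, P4D folded into PGD */

/* Hypothetical macro for this sketch: base of a PGD slot counted from the top. */
#define SLOT_BASE(slot)		((uint64_t)(slot) << PGDIR_SHIFT)

/* Slot numbers as they stand after the patch above. */
#define CPU_ENTRY_AREA_BASE	SLOT_BASE(-4)
#define LDT_BASE_ADDR		SLOT_BASE(-3)
#define ESPFIX_BASE_ADDR	SLOT_BASE(-2)

/* The patch sets vaddr_end to CPU_ENTRY_AREA_BASE unconditionally. */
#define VADDR_END		CPU_ENTRY_AREA_BASE

/* Compile-time checks in the spirit of BUILD_BUG_ON(); illustrative only. */
_Static_assert(VADDR_END <= CPU_ENTRY_AREA_BASE, "cpu_entry_area inside KASLR window");
_Static_assert(VADDR_END <= LDT_BASE_ADDR, "LDT remap inside KASLR window");
_Static_assert(VADDR_END <= ESPFIX_BASE_ADDR, "espfix inside KASLR window");

/* Returns 1 if a kernel-half address is below VADDR_END, else 0. */
int region_is_randomizable(uint64_t addr)
{
	assert(addr >= 0xffff800000000000ULL);	/* kernel half only */
	return addr < VADDR_END;
}
```

With the slots swapped, cpu_entry_area becomes the lowest of the fixed
regions, so a single upper bound covers all of them regardless of which
config options are enabled.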

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04 12:28               ` Thomas Gleixner
@ 2018-01-04 16:17                 ` Andy Lutomirski
  -1 siblings, 0 replies; 59+ messages in thread
From: Andy Lutomirski @ 2018-01-04 16:17 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andy Lutomirski, Benjamin Gilbert, Greg Kroah-Hartman, X86 ML,
	LKML, linux-mm, stable, Ingo Molnar, Dave Hansen, Peter Zijlstra,
	Thomas Garnier, Alexander Kuleshov

On Thu, Jan 4, 2018 at 4:28 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 3 Jan 2018, Andy Lutomirski wrote:
>> On Wed, Jan 3, 2018 at 8:35 PM, Benjamin Gilbert
>> <benjamin.gilbert@coreos.com> wrote:
>> > On Wed, Jan 03, 2018 at 04:37:53PM -0800, Andy Lutomirski wrote:
>> >> Maybe try rebuilding a bad kernel with free_ldt_pgtables() modified
>> >> to do nothing, and the read /sys/kernel/debug/page_tables/current (or
>> >> current_kernel, or whatever it's called).  The problem may be obvious.
>> >
>> > current_kernel attached.  I have not seen any crashes with
>> > free_ldt_pgtables() stubbed out.
>>
>> I haven't reproduced it, but I think I see what's wrong.  KASLR sets
>> vaddr_end to a totally bogus value.  It should be no larger than
>> LDT_BASE_ADDR.  I suspect that your vmemmap is getting randomized into
>> the LDT range.  If it weren't for that, it could just as easily land
>> in the cpu_entry_area range.  This will need fixing in all versions
>> that aren't still called KAISER.
>>
>> Our memory map code is utter shite.  This kind of bug should not be
>> possible without a giant warning at boot that something is screwed up.
>
> You're right it's utter shite and the KASLR folks who added this insanity
> of making vaddr_end depend on a gazillion of config options and not
> documenting it in mm.txt or elsewhere where it's obvious to find should
> really sit back and think hard about their half baken 'security' features.
>
> Just look at the insanity of comment above the vaddr_end ifdef maze.
>
> Benjamin, can you test the patch below please?
>
> Thanks,
>
>         tglx
>
> 8<--------------
> --- a/Documentation/x86/x86_64/mm.txt
> +++ b/Documentation/x86/x86_64/mm.txt
> @@ -12,8 +12,9 @@ ffffea0000000000 - ffffeaffffffffff (=40
>  ... unused hole ...
>  ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
>  ... unused hole ...
> -fffffe0000000000 - fffffe7fffffffff (=39 bits) LDT remap for PTI
> -fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
> +                                   vaddr_end for KASLR
> +fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
> +fffffe8000000000 - fffffeffffffffff (=39 bits) LDT remap for PTI
>  ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
>  ... unused hole ...
>  ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
> @@ -37,7 +38,9 @@ ffd4000000000000 - ffd5ffffffffffff (=49
>  ... unused hole ...
>  ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
>  ... unused hole ...
> -fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
> +                                   vaddr_end for KASLR
> +fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
> +... unused hole ...
>  ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
>  ... unused hole ...
>  ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
> --- a/arch/x86/include/asm/pgtable_64_types.h
> +++ b/arch/x86/include/asm/pgtable_64_types.h
> @@ -88,7 +88,7 @@ typedef struct { pteval_t pte; } pte_t;
>  # define VMALLOC_SIZE_TB       _AC(32, UL)
>  # define __VMALLOC_BASE                _AC(0xffffc90000000000, UL)
>  # define __VMEMMAP_BASE                _AC(0xffffea0000000000, UL)
> -# define LDT_PGD_ENTRY         _AC(-4, UL)
> +# define LDT_PGD_ENTRY         _AC(-3, UL)
>  # define LDT_BASE_ADDR         (LDT_PGD_ENTRY << PGDIR_SHIFT)
>  #endif

If you actually change the memory map order, you need to change the
shadow copy in mm/dump_pagetables.c, too.  I have a draft patch to
just sort the damn list, but that's not ready yet.

^ permalink raw reply	[flat|nested] 59+ messages in thread
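
The ordering constraint mentioned above can be illustrated with a minimal userspace sketch. It assumes a simplified, hypothetical marker table (struct marker and markers_sorted() are illustrative names, not the kernel's) and uses the 4-level addresses from the mm.txt hunk in the patch. It only shows why a table that mirrors the memory map goes stale when two regions swap places.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a page-table dump marker. */
struct marker {
	uint64_t start;
	const char *name;
};

/* Return 1 if the start addresses are strictly ascending, else 0. */
int markers_sorted(const struct marker *m, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (m[i - 1].start >= m[i].start)
			return 0;
	return 1;
}

/* Table listed in the order of the swapped layout (cpu_entry_area first). */
const struct marker new_order[] = {
	{ 0xfffffe0000000000ULL, "cpu_entry_area" },
	{ 0xfffffe8000000000ULL, "LDT remap" },
	{ 0xffffff0000000000ULL, "ESPfix" },
};

/* Same addresses, but still listed in the pre-swap order (LDT first). */
const struct marker stale_order[] = {
	{ 0xfffffe8000000000ULL, "LDT remap" },
	{ 0xfffffe0000000000ULL, "cpu_entry_area" },
	{ 0xffffff0000000000ULL, "ESPfix" },
};
```

Calling markers_sorted(new_order, 3) succeeds, while the stale table fails the check, which is the kind of mismatch a sorted list would rule out.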

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04 16:17                 ` Andy Lutomirski
@ 2018-01-04 16:34                   ` Thomas Gleixner
  -1 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2018-01-04 16:34 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Benjamin Gilbert, Greg Kroah-Hartman, X86 ML, LKML, linux-mm,
	stable, Ingo Molnar, Dave Hansen, Peter Zijlstra, Thomas Garnier,
	Alexander Kuleshov

On Thu, 4 Jan 2018, Andy Lutomirski wrote:
> On Thu, Jan 4, 2018 at 4:28 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > --- a/arch/x86/include/asm/pgtable_64_types.h
> > +++ b/arch/x86/include/asm/pgtable_64_types.h
> > @@ -88,7 +88,7 @@ typedef struct { pteval_t pte; } pte_t;
> >  # define VMALLOC_SIZE_TB       _AC(32, UL)
> >  # define __VMALLOC_BASE                _AC(0xffffc90000000000, UL)
> >  # define __VMEMMAP_BASE                _AC(0xffffea0000000000, UL)
> > -# define LDT_PGD_ENTRY         _AC(-4, UL)
> > +# define LDT_PGD_ENTRY         _AC(-3, UL)
> >  # define LDT_BASE_ADDR         (LDT_PGD_ENTRY << PGDIR_SHIFT)
> >  #endif
> 
> If you actually change the memory map order, you need to change the
> shadow copy in mm/dump_pagetables.c, too.  I have a draft patch to
> just sort the damn list, but that's not ready yet.

Yes, I forgot that in the first attempt. Noticed myself when dumping it,
but that should be irrelevant to figure out whether it fixes the problem at
hand.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
  2018-01-04 12:28               ` Thomas Gleixner
@ 2018-01-04 19:38                 ` Benjamin Gilbert
  -1 siblings, 0 replies; 59+ messages in thread
From: Benjamin Gilbert @ 2018-01-04 19:38 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andy Lutomirski, Greg Kroah-Hartman, X86 ML, LKML, linux-mm,
	stable, Ingo Molnar, Dave Hansen, Peter Zijlstra, Thomas Garnier,
	Alexander Kuleshov

On Thu, Jan 04, 2018 at 01:28:59PM +0100, Thomas Gleixner wrote:
> On Wed, 3 Jan 2018, Andy Lutomirski wrote:
> > Our memory map code is utter shite.  This kind of bug should not be
> > possible without a giant warning at boot that something is screwed up.
> 
> You're right it's utter shite and the KASLR folks who added this insanity
> of making vaddr_end depend on a gazillion of config options and not
> documenting it in mm.txt or elsewhere where it's obvious to find should
> really sit back and think hard about their half baken 'security' features.
> 
> Just look at the insanity of comment above the vaddr_end ifdef maze.
> 
> Benjamin, can you test the patch below please?

Seems to work!

Thanks,
--Benjamin Gilbert

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [tip:x86/pti] x86/mm: Map cpu_entry_area at the same place on 4/5 level
  2018-01-04 12:28               ` Thomas Gleixner
                                 ` (2 preceding siblings ...)
  (?)
@ 2018-01-04 22:10               ` tip-bot for Thomas Gleixner
  -1 siblings, 0 replies; 59+ messages in thread
From: tip-bot for Thomas Gleixner @ 2018-01-04 22:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, dave.hansen, luto, kuleshovmail, benjamin.gilbert, stable,
	hpa, gregkh, tglx, linux-kernel, peterz

Commit-ID:  f2078904810373211fb15f91888fba14c01a4acc
Gitweb:     https://git.kernel.org/tip/f2078904810373211fb15f91888fba14c01a4acc
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Thu, 4 Jan 2018 13:01:40 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 4 Jan 2018 23:04:57 +0100

x86/mm: Map cpu_entry_area at the same place on 4/5 level

There is no reason for 4 and 5 level pagetables to have a different
layout. It just makes determining vaddr_end for KASLR harder than
necessary.

Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>,
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos

---
 Documentation/x86/x86_64/mm.txt         | 7 ++++---
 arch/x86/include/asm/pgtable_64_types.h | 4 ++--
 arch/x86/mm/dump_pagetables.c           | 2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index ddd5ffd..f7dabe1 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -12,8 +12,8 @@ ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
 ... unused hole ...
 ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
 ... unused hole ...
-fffffe0000000000 - fffffe7fffffffff (=39 bits) LDT remap for PTI
-fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
+fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
+fffffe8000000000 - fffffeffffffffff (=39 bits) LDT remap for PTI
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
 ... unused hole ...
 ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
@@ -37,7 +37,8 @@ ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
 ... unused hole ...
 ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
 ... unused hole ...
-fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
+fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
+... unused hole ...
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
 ... unused hole ...
 ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 6233e55..61b4b60 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -88,7 +88,7 @@ typedef struct { pteval_t pte; } pte_t;
 # define VMALLOC_SIZE_TB	_AC(32, UL)
 # define __VMALLOC_BASE		_AC(0xffffc90000000000, UL)
 # define __VMEMMAP_BASE		_AC(0xffffea0000000000, UL)
-# define LDT_PGD_ENTRY		_AC(-4, UL)
+# define LDT_PGD_ENTRY		_AC(-3, UL)
 # define LDT_BASE_ADDR		(LDT_PGD_ENTRY << PGDIR_SHIFT)
 #endif
 
@@ -110,7 +110,7 @@ typedef struct { pteval_t pte; } pte_t;
 #define ESPFIX_PGD_ENTRY	_AC(-2, UL)
 #define ESPFIX_BASE_ADDR	(ESPFIX_PGD_ENTRY << P4D_SHIFT)
 
-#define CPU_ENTRY_AREA_PGD	_AC(-3, UL)
+#define CPU_ENTRY_AREA_PGD	_AC(-4, UL)
 #define CPU_ENTRY_AREA_BASE	(CPU_ENTRY_AREA_PGD << P4D_SHIFT)
 
 #define EFI_VA_START		( -4 * (_AC(1, UL) << 30))
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index f56902c..2a4849e 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -61,10 +61,10 @@ enum address_markers_idx {
 	KASAN_SHADOW_START_NR,
 	KASAN_SHADOW_END_NR,
 #endif
+	CPU_ENTRY_AREA_NR,
 #if defined(CONFIG_MODIFY_LDT_SYSCALL) && !defined(CONFIG_X86_5LEVEL)
 	LDT_NR,
 #endif
-	CPU_ENTRY_AREA_NR,
 #ifdef CONFIG_X86_ESPFIX64
 	ESPFIX_START_NR,
 #endif

^ permalink raw reply	[flat|nested] 59+ messages in thread
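
As a worked example of the arithmetic behind the swapped PGD entries, the following sketch computes the base address of a slot counted down from the top of the address space. It assumes a shift of 39, matching the "(=39 bits)" ranges in mm.txt; pgd_slot_base() and pgd_slot_last() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shift: one slot spans 2^39 bytes (512 GB), as in mm.txt. */
#define SLOT_SHIFT 39

/* Base address of a slot given as a negative index from the top. */
uint64_t pgd_slot_base(int64_t slot)
{
	/* Convert to unsigned before shifting to avoid shifting a negative value. */
	return (uint64_t)slot << SLOT_SHIFT;
}

/* Last byte covered by the slot. */
uint64_t pgd_slot_last(int64_t slot)
{
	return pgd_slot_base(slot) + ((uint64_t)1 << SLOT_SHIFT) - 1;
}
```

With these helpers, slot -4 starts at fffffe0000000000 (cpu_entry_area after the patch), slot -3 at fffffe8000000000 (LDT remap after the patch) and slot -2 at ffffff0000000000 (%esp fixup stacks), which agrees with the updated mm.txt entries.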

* [tip:x86/pti] x86/kaslr: Fix the vaddr_end mess
  2018-01-04 12:28               ` Thomas Gleixner
                                 ` (3 preceding siblings ...)
  (?)
@ 2018-01-04 22:10               ` tip-bot for Thomas Gleixner
  2018-01-04 23:29                 ` Benjamin Gilbert
  -1 siblings, 1 reply; 59+ messages in thread
From: tip-bot for Thomas Gleixner @ 2018-01-04 22:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: benjamin.gilbert, kuleshovmail, hpa, tglx, linux-kernel, mingo,
	gregkh, luto, dave.hansen, stable, peterz

Commit-ID:  1b3ef54207f068dae9c36d891ff69dd4d37c5c2f
Gitweb:     https://git.kernel.org/tip/1b3ef54207f068dae9c36d891ff69dd4d37c5c2f
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Thu, 4 Jan 2018 12:32:03 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 4 Jan 2018 23:04:57 +0100

x86/kaslr: Fix the vaddr_end mess

vaddr_end for KASLR is only documented in the KASLR code itself and is
adjusted depending on config options. So it's not surprising that a change
of the memory layout causes KASLR to have the wrong vaddr_end. This can map
arbitrary stuff into other areas causing hard to understand problems.

Remove the whole ifdef magic and define the start of the cpu_entry_area to
be the end of the KASLR vaddr range.

Add documentation to that effect.

Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap")
Reported-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>,
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
---
 Documentation/x86/x86_64/mm.txt         |  6 ++++++
 arch/x86/include/asm/pgtable_64_types.h |  8 +++++++-
 arch/x86/mm/kaslr.c                     | 34 ++++++++++-----------------------
 3 files changed, 23 insertions(+), 25 deletions(-)

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index f7dabe1..ea91cb6 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -12,6 +12,7 @@ ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
 ... unused hole ...
 ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
 ... unused hole ...
+				    vaddr_end for KASLR
 fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
 fffffe8000000000 - fffffeffffffffff (=39 bits) LDT remap for PTI
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
@@ -37,6 +38,7 @@ ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
 ... unused hole ...
 ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
 ... unused hole ...
+				    vaddr_end for KASLR
 fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
 ... unused hole ...
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
@@ -71,3 +73,7 @@ during EFI runtime calls.
 Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
 physical memory, vmalloc/ioremap space and virtual memory map are randomized.
 Their order is preserved but their base will be offset early at boot time.
+
+Be very careful vs. KASLR when changing anything here. The KASLR address
+range must not overlap with anything except the KASAN shadow area, which is
+correct as KASAN disables KASLR.
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 61b4b60..6b8f73d 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -75,7 +75,13 @@ typedef struct { pteval_t pte; } pte_t;
 #define PGDIR_SIZE	(_AC(1, UL) << PGDIR_SHIFT)
 #define PGDIR_MASK	(~(PGDIR_SIZE - 1))
 
-/* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
+/*
+ * See Documentation/x86/x86_64/mm.txt for a description of the memory map.
+ *
+ * Be very careful vs. KASLR when changing anything here. The KASLR address
+ * range must not overlap with anything except the KASAN shadow area, which
+ * is correct as KASAN disables KASLR.
+ */
 #define MAXMEM			_AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
 
 #ifdef CONFIG_X86_5LEVEL
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 879ef93..b805a61 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -34,25 +34,14 @@
 #define TB_SHIFT 40
 
 /*
- * Virtual address start and end range for randomization. The end changes base
- * on configuration to have the highest amount of space for randomization.
- * It increases the possible random position for each randomized region.
+ * Virtual address start and end range for randomization.
  *
- * You need to add an if/def entry if you introduce a new memory region
- * compatible with KASLR. Your entry must be in logical order with memory
- * layout. For example, ESPFIX is before EFI because its virtual address is
- * before. You also need to add a BUILD_BUG_ON() in kernel_randomize_memory() to
- * ensure that this order is correct and won't be changed.
+ * The end address could depend on more configuration options to make the
+ * highest amount of space for randomization available, but that's too hard
+ * to keep straight and caused issues already.
  */
 static const unsigned long vaddr_start = __PAGE_OFFSET_BASE;
-
-#if defined(CONFIG_X86_ESPFIX64)
-static const unsigned long vaddr_end = ESPFIX_BASE_ADDR;
-#elif defined(CONFIG_EFI)
-static const unsigned long vaddr_end = EFI_VA_END;
-#else
-static const unsigned long vaddr_end = __START_KERNEL_map;
-#endif
+static const unsigned long vaddr_end = CPU_ENTRY_AREA_BASE;
 
 /* Default values */
 unsigned long page_offset_base = __PAGE_OFFSET_BASE;
@@ -101,16 +90,13 @@ void __init kernel_randomize_memory(void)
 	unsigned long remain_entropy;
 
 	/*
-	 * All these BUILD_BUG_ON checks ensures the memory layout is
-	 * consistent with the vaddr_start/vaddr_end variables.
+	 * These BUILD_BUG_ON checks ensure the memory layout is consistent
+	 * with the vaddr_start/vaddr_end variables. These checks are very
+	 * limited....
 	 */
 	BUILD_BUG_ON(vaddr_start >= vaddr_end);
-	BUILD_BUG_ON(IS_ENABLED(CONFIG_X86_ESPFIX64) &&
-		     vaddr_end >= EFI_VA_END);
-	BUILD_BUG_ON((IS_ENABLED(CONFIG_X86_ESPFIX64) ||
-		      IS_ENABLED(CONFIG_EFI)) &&
-		     vaddr_end >= __START_KERNEL_map);
-	BUILD_BUG_ON(vaddr_end > __START_KERNEL_map);
+	BUILD_BUG_ON)(vaddr_end != CPU_ENTRY_AREA_BASE);
+-	BUILD_BUG_ON(vaddr_end > __START_KERNEL_map);
 
 	if (!kaslr_memory_enabled())
 		return;


* Re: [tip:x86/pti] x86/kaslr: Fix the vaddr_end mess
  2018-01-04 22:10               ` [tip:x86/pti] x86/kaslr: Fix the vaddr_end mess tip-bot for Thomas Gleixner
@ 2018-01-04 23:29                 ` Benjamin Gilbert
  2018-01-04 23:32                   ` Thomas Gleixner
  0 siblings, 1 reply; 59+ messages in thread
From: Benjamin Gilbert @ 2018-01-04 23:29 UTC (permalink / raw)
  To: luto, gregkh, stable, peterz, dave.hansen, kuleshovmail, tglx,
	hpa, linux-kernel, mingo

On Thu, Jan 04, 2018 at 02:10:44PM -0800, tip-bot for Thomas Gleixner wrote:
> +	BUILD_BUG_ON)(vaddr_end != CPU_ENTRY_AREA_BASE);
                    ^^

Note typo.

--Benjamin Gilbert


* Re: [tip:x86/pti] x86/kaslr: Fix the vaddr_end mess
  2018-01-04 23:29                 ` Benjamin Gilbert
@ 2018-01-04 23:32                   ` Thomas Gleixner
  0 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2018-01-04 23:32 UTC (permalink / raw)
  To: Benjamin Gilbert
  Cc: luto, gregkh, stable, peterz, dave.hansen, kuleshovmail, hpa,
	linux-kernel, mingo

On Thu, 4 Jan 2018, Benjamin Gilbert wrote:

> On Thu, Jan 04, 2018 at 02:10:44PM -0800, tip-bot for Thomas Gleixner wrote:
> > +	BUILD_BUG_ON)(vaddr_end != CPU_ENTRY_AREA_BASE);
>                     ^^
Darn...


* [tip:x86/pti] x86/kaslr: Fix the vaddr_end mess
  2018-01-04 12:28               ` Thomas Gleixner
                                 ` (4 preceding siblings ...)
  (?)
@ 2018-01-04 23:48               ` tip-bot for Thomas Gleixner
  -1 siblings, 0 replies; 59+ messages in thread
From: tip-bot for Thomas Gleixner @ 2018-01-04 23:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, tglx, dave.hansen, kuleshovmail, mingo, gregkh,
	benjamin.gilbert, hpa, peterz, luto, stable

Commit-ID:  1dddd25125112ba49706518ac9077a1026a18f37
Gitweb:     https://git.kernel.org/tip/1dddd25125112ba49706518ac9077a1026a18f37
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Thu, 4 Jan 2018 12:32:03 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 5 Jan 2018 00:39:57 +0100

x86/kaslr: Fix the vaddr_end mess

vaddr_end for KASLR is only documented in the KASLR code itself and is
adjusted depending on config options. So it's not surprising that a change
of the memory layout causes KASLR to have the wrong vaddr_end. This can map
arbitrary stuff into other areas causing hard to understand problems.

Remove the whole ifdef magic and define the start of the cpu_entry_area to
be the end of the KASLR vaddr range.

Add documentation to that effect.

Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap")
Reported-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>,
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
---
 Documentation/x86/x86_64/mm.txt         |  6 ++++++
 arch/x86/include/asm/pgtable_64_types.h |  8 +++++++-
 arch/x86/mm/kaslr.c                     | 32 +++++++++-----------------------
 3 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index f7dabe1..ea91cb6 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -12,6 +12,7 @@ ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
 ... unused hole ...
 ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
 ... unused hole ...
+				    vaddr_end for KASLR
 fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
 fffffe8000000000 - fffffeffffffffff (=39 bits) LDT remap for PTI
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
@@ -37,6 +38,7 @@ ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
 ... unused hole ...
 ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
 ... unused hole ...
+				    vaddr_end for KASLR
 fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
 ... unused hole ...
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
@@ -71,3 +73,7 @@ during EFI runtime calls.
 Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
 physical memory, vmalloc/ioremap space and virtual memory map are randomized.
 Their order is preserved but their base will be offset early at boot time.
+
+Be very careful vs. KASLR when changing anything here. The KASLR address
+range must not overlap with anything except the KASAN shadow area, which is
+correct as KASAN disables KASLR.
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 61b4b60..6b8f73d 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -75,7 +75,13 @@ typedef struct { pteval_t pte; } pte_t;
 #define PGDIR_SIZE	(_AC(1, UL) << PGDIR_SHIFT)
 #define PGDIR_MASK	(~(PGDIR_SIZE - 1))
 
-/* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
+/*
+ * See Documentation/x86/x86_64/mm.txt for a description of the memory map.
+ *
+ * Be very careful vs. KASLR when changing anything here. The KASLR address
+ * range must not overlap with anything except the KASAN shadow area, which
+ * is correct as KASAN disables KASLR.
+ */
 #define MAXMEM			_AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
 
 #ifdef CONFIG_X86_5LEVEL
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 879ef93..aedebd2 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -34,25 +34,14 @@
 #define TB_SHIFT 40
 
 /*
- * Virtual address start and end range for randomization. The end changes base
- * on configuration to have the highest amount of space for randomization.
- * It increases the possible random position for each randomized region.
+ * Virtual address start and end range for randomization.
  *
- * You need to add an if/def entry if you introduce a new memory region
- * compatible with KASLR. Your entry must be in logical order with memory
- * layout. For example, ESPFIX is before EFI because its virtual address is
- * before. You also need to add a BUILD_BUG_ON() in kernel_randomize_memory() to
- * ensure that this order is correct and won't be changed.
+ * The end address could depend on more configuration options to make the
+ * highest amount of space for randomization available, but that's too hard
+ * to keep straight and caused issues already.
  */
 static const unsigned long vaddr_start = __PAGE_OFFSET_BASE;
-
-#if defined(CONFIG_X86_ESPFIX64)
-static const unsigned long vaddr_end = ESPFIX_BASE_ADDR;
-#elif defined(CONFIG_EFI)
-static const unsigned long vaddr_end = EFI_VA_END;
-#else
-static const unsigned long vaddr_end = __START_KERNEL_map;
-#endif
+static const unsigned long vaddr_end = CPU_ENTRY_AREA_BASE;
 
 /* Default values */
 unsigned long page_offset_base = __PAGE_OFFSET_BASE;
@@ -101,15 +90,12 @@ void __init kernel_randomize_memory(void)
 	unsigned long remain_entropy;
 
 	/*
-	 * All these BUILD_BUG_ON checks ensures the memory layout is
-	 * consistent with the vaddr_start/vaddr_end variables.
+	 * These BUILD_BUG_ON checks ensure the memory layout is consistent
+	 * with the vaddr_start/vaddr_end variables. These checks are very
+	 * limited....
 	 */
 	BUILD_BUG_ON(vaddr_start >= vaddr_end);
-	BUILD_BUG_ON(IS_ENABLED(CONFIG_X86_ESPFIX64) &&
-		     vaddr_end >= EFI_VA_END);
-	BUILD_BUG_ON((IS_ENABLED(CONFIG_X86_ESPFIX64) ||
-		      IS_ENABLED(CONFIG_EFI)) &&
-		     vaddr_end >= __START_KERNEL_map);
+	BUILD_BUG_ON(vaddr_end != CPU_ENTRY_AREA_BASE);
 	BUILD_BUG_ON(vaddr_end > __START_KERNEL_map);
 
 	if (!kaslr_memory_enabled())

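
Note on the patch above: the commit message says a layout change left KASLR with the wrong vaddr_end. The following is a minimal userspace sketch of that overlap, not kernel code. The cpu_entry_area and %esp fixup addresses are taken from the 4-level table in the mm.txt hunk; the direct-mapping base and __START_KERNEL_map values are assumed from the 4.15-era headers; the macro names with the _4L suffix and the helper kaslr_range_contains() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * 4-level paging addresses. CPU_ENTRY_AREA_BASE_4L and ESPFIX_BASE_ADDR_4L
 * come from the mm.txt hunk above; the other two are assumptions based on
 * the 4.15-era x86_64 headers. All names here are illustrative only.
 */
#define PAGE_OFFSET_BASE_4L	UINT64_C(0xffff880000000000)
#define CPU_ENTRY_AREA_BASE_4L	UINT64_C(0xfffffe0000000000)
#define ESPFIX_BASE_ADDR_4L	UINT64_C(0xffffff0000000000)
#define START_KERNEL_MAP_4L	UINT64_C(0xffffffff80000000)

/* Hypothetical helper: true if addr lies inside the range [start, end). */
bool kaslr_range_contains(uint64_t start, uint64_t end, uint64_t addr)
{
	return addr >= start && addr < end;
}

/*
 * Compile-time stand-ins for the BUILD_BUG_ON() checks kept by the patch:
 * the range must be non-empty and must end at or below the kernel text
 * mapping.
 */
_Static_assert(PAGE_OFFSET_BASE_4L < CPU_ENTRY_AREA_BASE_4L,
	       "vaddr_start must be below vaddr_end");
_Static_assert(CPU_ENTRY_AREA_BASE_4L <= START_KERNEL_MAP_4L,
	       "vaddr_end must not exceed __START_KERNEL_map");
```

With the old CONFIG_X86_ESPFIX64 choice (vaddr_end = ESPFIX_BASE_ADDR), kaslr_range_contains(PAGE_OFFSET_BASE_4L, ESPFIX_BASE_ADDR_4L, CPU_ENTRY_AREA_BASE_4L) is true, i.e. the cpu_entry_area base sits inside the range KASLR may use. With vaddr_end = CPU_ENTRY_AREA_BASE the same call returns false, since the range now stops exactly where the cpu_entry_area begins.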

end of thread, other threads:[~2018-01-04 23:54 UTC | newest]

Thread overview: 59+ messages
2018-01-03  8:36 "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs Benjamin Gilbert
2018-01-03  8:46 ` Benjamin Gilbert
2018-01-03  9:20   ` Greg Kroah-Hartman
2018-01-03  9:20     ` Greg Kroah-Hartman
2018-01-03 15:48     ` Ingo Molnar
2018-01-03 15:48       ` Ingo Molnar
2018-01-03 22:32       ` Benjamin Gilbert
2018-01-03 22:32         ` Benjamin Gilbert
2018-01-03 22:34         ` Thomas Gleixner
2018-01-03 22:34           ` Thomas Gleixner
2018-01-03 22:49           ` Benjamin Gilbert
2018-01-03 22:57             ` Thomas Gleixner
2018-01-03 22:57               ` Thomas Gleixner
2018-01-03 22:58               ` Thomas Gleixner
2018-01-03 22:58                 ` Thomas Gleixner
2018-01-03 23:44                 ` Andy Lutomirski
2018-01-03 23:44                   ` Andy Lutomirski
2018-01-03 23:46                   ` Thomas Gleixner
2018-01-03 23:46                     ` Thomas Gleixner
2018-01-04  0:27                 ` Andy Lutomirski
2018-01-04  0:27                   ` Andy Lutomirski
2018-01-04  0:38                   ` Benjamin Gilbert
2018-01-04  0:38                     ` Benjamin Gilbert
2018-01-04  0:33     ` Benjamin Gilbert
2018-01-04  0:33       ` Benjamin Gilbert
2018-01-04  0:37       ` Thomas Gleixner
2018-01-04  0:37         ` Thomas Gleixner
2018-01-04  7:14         ` Ingo Molnar
2018-01-04  7:14           ` Ingo Molnar
2018-01-04  7:18           ` Greg Kroah-Hartman
2018-01-04  7:18             ` Greg Kroah-Hartman
2018-01-04  7:20             ` Ingo Molnar
2018-01-04  7:20               ` Ingo Molnar
2018-01-04  8:03               ` Greg Kroah-Hartman
2018-01-04  8:03                 ` Greg Kroah-Hartman
2018-01-04  7:22           ` Ingo Molnar
2018-01-04  7:22             ` Ingo Molnar
2018-01-04  0:37       ` Andy Lutomirski
2018-01-04  0:37         ` Andy Lutomirski
2018-01-04  4:35         ` Benjamin Gilbert
2018-01-04  4:45           ` Andy Lutomirski
2018-01-04  4:45             ` Andy Lutomirski
2018-01-04 12:28             ` Thomas Gleixner
2018-01-04 12:28               ` Thomas Gleixner
2018-01-04 16:17               ` Andy Lutomirski
2018-01-04 16:17                 ` Andy Lutomirski
2018-01-04 16:34                 ` Thomas Gleixner
2018-01-04 16:34                   ` Thomas Gleixner
2018-01-04 19:38               ` Benjamin Gilbert
2018-01-04 19:38                 ` Benjamin Gilbert
2018-01-04 22:10               ` [tip:x86/pti] x86/mm: Map cpu_entry_area at the same place on 4/5 level tip-bot for Thomas Gleixner
2018-01-04 22:10               ` [tip:x86/pti] x86/kaslr: Fix the vaddr_end mess tip-bot for Thomas Gleixner
2018-01-04 23:29                 ` Benjamin Gilbert
2018-01-04 23:32                   ` Thomas Gleixner
2018-01-04 23:48               ` tip-bot for Thomas Gleixner
2018-01-04  1:37       ` "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs Benjamin Gilbert
2018-01-04  1:37         ` Benjamin Gilbert
2018-01-04  4:36         ` Benjamin Gilbert
2018-01-04  4:36           ` Benjamin Gilbert
