* [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v3
@ 2018-04-17 16:11 Borislav Petkov
  2018-04-17 16:11 ` [PATCH 1/9] x86/dumpstack: Remove code_bytes Borislav Petkov
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Borislav Petkov @ 2018-04-17 16:11 UTC (permalink / raw)
  To: X86 ML
  Cc: Andy Lutomirski, Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, LKML

From: Borislav Petkov <bp@suse.de>

Hi,

here's v3 now that the merge window is done, with hopefully all review
feedback (thanks Josh et al!) incorporated.

Thx.

Borislav Petkov (9):
  x86/dumpstack: Remove code_bytes
  x86/dumpstack: Unexport oops_begin()
  x86/dumpstack: Carve out Code: dumping into a function
  x86/dumpstack: Improve opcodes dumping in the Code: section
  x86/dumpstack: Add loglevel argument to show_opcodes()
  x86/fault: Dump user opcode bytes on fatal faults
  x86/dumpstack: Add a show_ip() function
  x86/dumpstack: Save first regs set for the executive summary
  x86/dumpstack: Explain the reasoning for the prologue and buffer size

 Documentation/admin-guide/kernel-parameters.txt |   5 -
 arch/x86/include/asm/stacktrace.h               |   2 +
 arch/x86/kernel/dumpstack.c                     | 144 ++++++++++++------------
 arch/x86/kernel/process_32.c                    |   8 +-
 arch/x86/mm/fault.c                             |   7 +-
 5 files changed, 80 insertions(+), 86 deletions(-)

-- 
2.13.0

Changelog:

v2:

here's v2 with the dumpstack cleanups. This one gets rid of code_bytes=
as it was discussed last time. As a result, the code got even leaner and
simpler. I like that. :)

Thx.

Borislav Petkov (9):
  x86/dumpstack: Remove code_bytes
  x86/dumpstack: Unexport oops_begin()
  x86/dumpstack: Carve out Code: dumping into a function
  x86/dumpstack: Improve opcodes dumping in the Code: section
  x86/dumpstack: Add loglevel argument to show_opcodes()
  x86/fault: Dump user opcode bytes on fatal faults
  x86/dumpstack: Add a show_ip() function
  x86/dumpstack: Save first regs set for the executive summary
  x86/dumpstack: Explain the reasoning for the prologue and buffer size

 Documentation/admin-guide/kernel-parameters.txt |   5 -
 arch/x86/include/asm/stacktrace.h               |   2 +
 arch/x86/kernel/dumpstack.c                     | 138 ++++++++++++------------
 arch/x86/kernel/process_32.c                    |   4 +-
 arch/x86/mm/fault.c                             |   7 +-
 5 files changed, 78 insertions(+), 78 deletions(-)


v1:

Hi,

here's v2 of the dumpstack cleanups.

I've split them into more fine-grained pieces to show each change. The
relevant parts are saving the executive-summary registers the first time
we oops and dumping them at the end, plus opcode bytes for user faults.
I've tested splats on an 80x25 screen and the registers, RIP and opcode
bytes all fit.

I'm adding example dumps from 32-bit and 64-bit at the end of this mail.

I still have on my TODO list to experiment with console log levels and
see whether we can do a best-of-both-worlds thing there.

v0:

Hi,

so I've been thinking about doing this for a while now: be able to dump
the opcode bytes around the user rIP just like we do for kernel faults.

Why?

See patch 5's commit message. That's why I've marked it RFC.

The rest is cleanups: we're copying the opcodes byte-by-byte and that's
just wasteful.

Also, we're using probe_kernel_read() underneath and it does
__copy_from_user_inatomic() which makes copying user opcode bytes
trivial.

With that, it looks like this:

[  696.837457] strsep[1733]: segfault at 40066b ip 00007fad558fccf8 sp 00007ffc5e662520 error 7 in libc-2.26.so[7fad55876000+1ad000]
[  696.837538] Code: 1b 48 89 fd 48 89 df e8 77 99 f9 ff 48 01 d8 80 38 00 75 17 48 c7 45 00 00 00 00 00 48 83 c4 08 48 89 d8 5b 5d c3 0f 1f 44 00 00 <c6> 00 00 48 83 c0 01 48 89 45 00 48 83 c4 08 48 89 d8 5b 5d c3

and the code matches, as expected:

0000000000086cc0 <__strsep_g@@GLIBC_2.2.5>:
   86cc0:       55                      push   %rbp
   86cc1:       53                      push   %rbx
   86cc2:       48 83 ec 08             sub    $0x8,%rsp
   86cc6:       48 8b 1f                mov    (%rdi),%rbx
   86cc9:       48 85 db                test   %rbx,%rbx
   86ccc:       74 1b                   je     86ce9 <__strsep_g@@GLIBC_2.2.5+0x29>
   86cce:       48 89 fd                mov    %rdi,%rbp
   86cd1:       48 89 df                mov    %rbx,%rdi
   86cd4:       e8 77 99 f9 ff          callq  20650 <*ABS*+0x854e0@plt>
   86cd9:       48 01 d8                add    %rbx,%rax
   86cdc:       80 38 00                cmpb   $0x0,(%rax)
   86cdf:       75 17                   jne    86cf8 <__strsep_g@@GLIBC_2.2.5+0x38>
   86ce1:       48 c7 45 00 00 00 00    movq   $0x0,0x0(%rbp)
   86ce8:       00 
   86ce9:       48 83 c4 08             add    $0x8,%rsp
   86ced:       48 89 d8                mov    %rbx,%rax
   86cf0:       5b                      pop    %rbx
   86cf1:       5d                      pop    %rbp
   86cf2:       c3                      retq   
   86cf3:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
   86cf8:       c6 00 00                movb   $0x0,(%rax)
   86cfb:       48 83 c0 01             add    $0x1,%rax
   86cff:       48 89 45 00             mov    %rax,0x0(%rbp)
   86d03:       48 83 c4 08             add    $0x8,%rsp
   86d07:       48 89 d8                mov    %rbx,%rax
   86d0a:       5b                      pop    %rbx
   86d0b:       5d                      pop    %rbp
   86d0c:       c3                      retq
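
The match between the segfault line and the listing can be checked with
plain address arithmetic. The following is a userspace sketch, not kernel
code; the helper names are made up for this illustration. It assumes the
ip and mapping base are taken from the segfault line and that the
prologue length is the number of bytes printed before the <c6> marker
(43 in the Code: line above).

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the faulting instruction inside the mapped object. */
static uint64_t object_offset(uint64_t ip, uint64_t map_base)
{
	assert(ip >= map_base);
	return ip - map_base;
}

/* Object offset of the first byte shown in the Code: line. */
static uint64_t code_window_start(uint64_t fault_off, unsigned int prologue)
{
	assert(fault_off >= prologue);
	return fault_off - prologue;
}
```

With ip 0x7fad558fccf8 and base 0x7fad55876000, object_offset() yields
0x86cf8, the movb line in the listing. Subtracting the 43 prologue bytes
gives 0x86ccd, the 1b operand byte of the je at 86ccc, which is the
first byte of the Code: line.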

Comments and suggestions are welcome!

Thx.

Example dumps

v3:

64-bit:

[   34.688928] sysrq: SysRq : Trigger a crash
[   34.690799] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[   34.692653] PGD 7aac2067 P4D 7aac2067 PUD 7aac3067 PMD 0 
[   34.692653] Oops: 0002 [#1] PREEMPT SMP
[   34.692653] CPU: 0 PID: 3695 Comm: bash Not tainted 4.16.0+ #14
[   34.692653] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   34.692653] RIP: 0010:sysrq_handle_crash+0x17/0x20
[   34.692653] Code: d1 e8 9d f1 b6 ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 e8 46 0c bd ff c7 05 74 0e 19 01 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 c3 0f 1f 44 00 00 e8 46 0a c2 ff fb e9 30 
[   34.692653] RSP: 0018:ffffc90001b57df0 EFLAGS: 00010246
[   34.692653] RAX: 0000000000000000 RBX: 0000000000000063 RCX: 0000000000000000
[   34.692653] RDX: 0000000000000000 RSI: ffffffff81101f2a RDI: 0000000000000063
[   34.692653] RBP: ffffffff8226fec0 R08: 0000000000000183 R09: 00000000000a8320
[   34.692653] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000a
[   34.692653] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   34.692653] FS:  00007ffff7fdb700(0000) GS:ffff88007ec00000(0000) knlGS:0000000000000000
[   34.692653] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   34.692653] CR2: 0000000000000000 CR3: 0000000079462000 CR4: 00000000000406f0
[   34.692653] Call Trace:
[   34.692653]  __handle_sysrq+0x9e/0x160
[   34.692653]  write_sysrq_trigger+0x2b/0x30
[   34.692653]  proc_reg_write+0x38/0x70
[   34.692653]  __vfs_write+0x36/0x160
[   34.692653]  ? __fd_install+0x69/0x110
[   34.692653]  ? preempt_count_add+0x74/0xb0
[   34.692653]  ? _raw_spin_lock+0x13/0x30
[   34.692653]  ? set_close_on_exec+0x41/0x80
[   34.692653]  ? preempt_count_sub+0xa8/0x100
[   34.692653]  vfs_write+0xc0/0x190
[   34.692653]  ksys_write+0x64/0xe0
[   34.692653]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[   34.692653]  do_syscall_64+0x70/0x130
[   34.692653]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
[   34.692653] RIP: 0033:0x7ffff74b9620
[   34.692653] Code: 73 01 c3 48 8b 0d 68 98 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d bd f1 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ce 8f 01 00 48 89 04 24 
[   34.692653] RSP: 002b:00007fffffffe6f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   34.692653] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007ffff74b9620
[   34.692653] RDX: 0000000000000002 RSI: 0000000000705408 RDI: 0000000000000001
[   34.692653] RBP: 0000000000705408 R08: 000000000000000a R09: 00007ffff7fdb700
[   34.692653] R10: 00007ffff77826a0 R11: 0000000000000246 R12: 00007ffff77842a0
[   34.692653] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
[   34.692653] Modules linked in:
[   34.692653] CR2: 0000000000000000
[   34.728373] ---[ end trace 84a5f329ce73ad83 ]---
[   34.730511] RIP: 0010:sysrq_handle_crash+0x17/0x20
[   34.732585] Code: d1 e8 9d f1 b6 ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 e8 46 0c bd ff c7 05 74 0e 19 01 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 c3 0f 1f 44 00 00 e8 46 0a c2 ff fb e9 30 
[   34.739863] RSP: 0018:ffffc90001b57df0 EFLAGS: 00010246
[   34.740612] RAX: 0000000000000000 RBX: 0000000000000063 RCX: 0000000000000000
[   34.741653] RDX: 0000000000000000 RSI: ffffffff81101f2a RDI: 0000000000000063
[   34.742585] RBP: ffffffff8226fec0 R08: 0000000000000183 R09: 00000000000a8320
[   34.743517] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000a
[   34.744500] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   34.745626] FS:  00007ffff7fdb700(0000) GS:ffff88007ec00000(0000) knlGS:0000000000000000
[   34.746691] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   34.747422] CR2: 0000000000000000 CR3: 0000000079462000 CR4: 00000000000406f0
[   34.748382] Kernel panic - not syncing: Fatal exception
[   34.749531] Kernel Offset: disabled
[   34.750005] ---[ end Kernel panic - not syncing: Fatal exception ]---

32-bit:

[  103.959732] sysrq: SysRq : Trigger a crash
[  103.964190] BUG: unable to handle kernel NULL pointer dereference at 00000000
[  103.968108] *pde = 00000000 
[  103.968108] Oops: 0002 [#1] PREEMPT SMP
[  103.968108] CPU: 5 PID: 2117 Comm: bash Not tainted 4.16.0+ #15
[  103.968108] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[  103.968108] EIP: sysrq_handle_crash+0x1d/0x30
[  103.968108] Code: ff eb d6 e8 a5 f4 b9 ff 90 8d 74 26 00 0f 1f 44 00 00 55 89 e5 e8 03 f0 bf ff c7 05 34 b2 c1 c1 01 00 00 00 0f ae f8 0f 1f 00 <c6> 05 00 00 00 00 01 5d c3 8d 76 00 8d bc 27 00 00 00 00 0f 1f 44 
[  103.968108] EAX: 00000000 EBX: 0000000a ECX: 00000000 EDX: c1505ad0
[  103.968108] ESI: 00000063 EDI: 00000000 EBP: f374fe80 ESP: f374fe80
[  103.968108] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010246
[  103.968108] CR0: 80050033 CR2: 00000000 CR3: 33044000 CR4: 000406d0
[  103.968108] Call Trace:
[  103.968108]  __handle_sysrq+0x93/0x130
[  103.968108]  ? sysrq_filter+0x3c0/0x3c0
[  103.968108]  write_sysrq_trigger+0x27/0x40
[  103.968108]  proc_reg_write+0x4d/0x80
[  103.968108]  ? proc_reg_poll+0x70/0x70
[  103.968108]  __vfs_write+0x38/0x160
[  103.968108]  ? preempt_count_sub+0xa0/0x110
[  103.968108]  ? set_close_on_exec+0x4b/0x60
[  103.968108]  ? preempt_count_sub+0xa0/0x110
[  103.968108]  ? __fd_install+0x51/0xd0
[  103.968108]  ? __sb_start_write+0x4c/0xc0
[  103.968108]  ? preempt_count_sub+0xa0/0x110
[  103.968108]  vfs_write+0x98/0x180
[  103.968108]  ksys_write+0x51/0xb0
[  103.968108]  SyS_write+0x16/0x20
[  103.968108]  do_fast_syscall_32+0x99/0x200
[  103.968108]  entry_SYSENTER_32+0x53/0x86
[  103.968108] EIP: 0xb7f71b35
[  103.968108] Code: 89 e5 8b 55 08 8b 80 64 cd ff ff 85 d2 74 02 89 02 5d c3 8b 04 24 c3 8b 0c 24 c3 8b 1c 24 c3 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76 
[  103.968108] EAX: ffffffda EBX: 00000001 ECX: 09b11a08 EDX: 00000002
[  103.968108] ESI: 00000002 EDI: b7f3cd80 EBP: 09b11a08 ESP: bfeeb390
[  103.968108] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
[  103.968108] Modules linked in:
[  103.968108] CR2: 0000000000000000
[  104.023961] ---[ end trace 705add298921f2dd ]---
[  104.025249] EIP: sysrq_handle_crash+0x1d/0x30
[  104.026323] Code: ff eb d6 e8 a5 f4 b9 ff 90 8d 74 26 00 0f 1f 44 00 00 55 89 e5 e8 03 f0 bf ff c7 05 34 b2 c1 c1 01 00 00 00 0f ae f8 0f 1f 00 <c6> 05 00 00 00 00 01 5d c3 8d 76 00 8d bc 27 00 00 00 00 0f 1f 44 
[  104.034894] EAX: 00000000 EBX: 0000000a ECX: 00000000 EDX: c1505ad0
[  104.036643] ESI: 00000063 EDI: 00000000 EBP: f374fe80 ESP: c1c1187c
[  104.038432] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010246
[  104.040185] CR0: 80050033 CR2: 00000000 CR3: 33044000 CR4: 000406d0
[  104.041826] Kernel panic - not syncing: Fatal exception
[  104.043607] Kernel Offset: disabled
[  104.044170] ---[ end Kernel panic - not syncing: Fatal exception ]---




v2:

64-bit:

[   53.534957] sysrq: SysRq : Trigger a crash
[   53.536939] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[   53.539982] PGD 79149067 P4D 79149067 PUD 793a5067 PMD 0 
[   53.540897] Oops: 0002 [#1] PREEMPT SMP
[   53.540897] CPU: 6 PID: 3700 Comm: bash Not tainted 4.16.0-rc5+ #11
[   53.540897] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   53.540897] RIP: 0010:sysrq_handle_crash+0x17/0x20
[   53.540897] Code: d1 e8 6d 08 b7 ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 e8 76 1f bd ff c7 05 a4 12 19 01 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 c3 0f 1f 44 00 00 e8 c6 1b c2 ff fb e9 80 
[   53.540897] RSP: 0018:ffffc9000053bdf0 EFLAGS: 00010246
[   53.540897] RAX: 0000000000000000 RBX: 0000000000000063 RCX: 0000000000000000
[   53.540897] RDX: 0000000000000000 RSI: ffffffff81101e0a RDI: 0000000000000063
[   53.540897] RBP: ffffffff822714c0 R08: 0000000000000185 R09: 00000000000829ad
[   53.540897] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000a
[   53.540897] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   53.540897] FS:  00007ffff7fdb700(0000) GS:ffff88007ed80000(0000) knlGS:0000000000000000
[   53.540897] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   53.540897] CR2: 0000000000000000 CR3: 0000000079107000 CR4: 00000000000406e0
[   53.540897] Call Trace:
[   53.540897]  __handle_sysrq+0x9e/0x160
[   53.540897]  write_sysrq_trigger+0x2b/0x30
[   53.540897]  proc_reg_write+0x38/0x70
[   53.540897]  __vfs_write+0x36/0x160
[   53.540897]  ? __fd_install+0x69/0x110
[   53.540897]  ? preempt_count_add+0x74/0xb0
[   53.540897]  ? _raw_spin_lock+0x13/0x30
[   53.540897]  ? set_close_on_exec+0x41/0x80
[   53.540897]  ? preempt_count_sub+0xa8/0x100
[   53.540897]  vfs_write+0xc0/0x190
[   53.540897]  SyS_write+0x64/0xe0
[   53.540897]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[   53.540897]  do_syscall_64+0x70/0x130
[   53.540897]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
[   53.540897] RIP: 0033:0x7ffff74b9620
[   53.540897] Code: 73 01 c3 48 8b 0d 68 98 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d bd f1 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ce 8f 01 00 48 89 04 24 
[   53.540897] RSP: 002b:00007fffffffe6f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   53.540897] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007ffff74b9620
[   53.540897] RDX: 0000000000000002 RSI: 0000000000705408 RDI: 0000000000000001
[   53.540897] RBP: 0000000000705408 R08: 000000000000000a R09: 00007ffff7fdb700
[   53.540897] R10: 00007ffff77826a0 R11: 0000000000000246 R12: 00007ffff77842a0
[   53.540897] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
[   53.540897] Modules linked in:
[   53.540897] CR2: 0000000000000000
[   53.576029] ---[ end trace 9b6fe8eba592293d ]---
[   53.578109] RIP: 0010:sysrq_handle_crash+0x17/0x20
[   53.580191] Code: d1 e8 6d 08 b7 ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 e8 76 1f bd ff c7 05 a4 12 19 01 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 c3 0f 1f 44 00 00 e8 c6 1b c2 ff fb e9 80 
[   53.587244] RSP: 0018:ffffc9000053bdf0 EFLAGS: 00010246
[   53.587928] RAX: 0000000000000000 RBX: 0000000000000063 RCX: 0000000000000000
[   53.588929] RDX: 0000000000000000 RSI: ffffffff81101e0a RDI: 0000000000000063
[   53.589956] RBP: ffffffff822714c0 R08: 0000000000000185 R09: 00000000000829ad
[   53.590886] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000a
[   53.591812] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   53.592781] Kernel panic - not syncing: Fatal exception
[   53.594100] Kernel Offset: disabled
[   53.594571] ---[ end Kernel panic - not syncing: Fatal exception ]---

[   22.737752] strsep[3728]: segfault at 40066b ip 00007ffff7abe22b sp 00007fffffffea60 error 7 in libc-2.19.so[7ffff7a33000+19f000]
[   22.742487] Code: 48 89 fd 53 48 83 ec 08 48 8b 1f 48 85 db 74 67 0f b6 06 84 c0 74 33 80 7e 01 00 74 22 48 89 df e8 5a 8a ff ff 48 85 c0 74 20 <c6> 00 00 48 83 c0 01 48 89 45 00 48 89 d8 48 83 c4 08 5b 5d c3 0f


32-bit:

[  151.053373] sysrq: SysRq : Trigger a crash
[  151.056586] BUG: unable to handle kernel NULL pointer dereference at 00000000
[  151.060237] *pde = 00000000 
[  151.060484] Oops: 0002 [#1] PREEMPT SMP
[  151.060484] CPU: 1 PID: 2070 Comm: bash Not tainted 4.16.0-rc5+ #12
[  151.060484] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[  151.060484] EIP: sysrq_handle_crash+0x1d/0x30
[  151.060484] Code: ff eb d6 e8 75 0f ba ff 90 8d 74 26 00 0f 1f 44 00 00 55 89 e5 e8 03 07 c0 ff c7 05 34 72 c1 c1 01 00 00 00 0f ae f8 0f 1f 00 <c6> 05 00 00 00 00 01 5d c3 8d 76 00 8d bc 27 00 00 00 00 0f 1f 44 
[  151.060484] EAX: 00000000 EBX: 0000000a ECX: 00000000 EDX: c1503f70
[  151.060484] ESI: 00000063 EDI: 00000000 EBP: f36d7e8c ESP: f36d7e8c
[  151.060484]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[  151.060484] CR0: 80050033 CR2: 00000000 CR3: 33d64000 CR4: 000406d0
[  151.060484] Call Trace:
[  151.060484]  __handle_sysrq+0x93/0x130
[  151.060484]  ? sysrq_filter+0x3c0/0x3c0
[  151.060484]  write_sysrq_trigger+0x27/0x40
[  151.060484]  proc_reg_write+0x4d/0x80
[  151.060484]  ? proc_reg_poll+0x70/0x70
[  151.060484]  __vfs_write+0x38/0x160
[  151.060484]  ? preempt_count_sub+0xa0/0x110
[  151.060484]  ? __fd_install+0x51/0xd0
[  151.060484]  ? __sb_start_write+0x4c/0xc0
[  151.060484]  ? preempt_count_sub+0xa0/0x110
[  151.060484]  vfs_write+0x98/0x180
[  151.060484]  SyS_write+0x4f/0xb0
[  151.060484]  do_fast_syscall_32+0x99/0x200
[  151.060484]  entry_SYSENTER_32+0x53/0x86
[  151.060484] EIP: 0xb7f25b35
[  151.060484] Code: 89 e5 8b 55 08 8b 80 64 cd ff ff 85 d2 74 02 89 02 5d c3 8b 04 24 c3 8b 0c 24 c3 8b 1c 24 c3 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76 
[  151.060484] EAX: ffffffda EBX: 00000001 ECX: 08b14a08 EDX: 00000002
[  151.060484] ESI: 00000002 EDI: b7ef0d80 EBP: 08b14a08 ESP: bfc53830
[  151.060484]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[  151.060484] Modules linked in:
[  151.060484] CR2: 0000000000000000
[  151.128925] ---[ end trace 822f779813ab57e1 ]---
[  151.136624] EIP: sysrq_handle_crash+0x1d/0x30
[  151.136625] Code: ff eb d6 e8 75 0f ba ff 90 8d 74 26 00 0f 1f 44 00 00 55 89 e5 e8 03 07 c0 ff c7 05 34 72 c1 c1 01 00 00 00 0f ae f8 0f 1f 00 <c6> 05 00 00 00 00 01 5d c3 8d 76 00 8d bc 27 00 00 00 00 0f 1f 44 
[  151.136658] EAX: 00000000 EBX: 0000000a ECX: 00000000 EDX: c1503f70
[  151.136659] ESI: 00000063 EDI: 00000000 EBP: f36d7e8c ESP: c1c0d87c
[  151.136661]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[  151.136662] Kernel panic - not syncing: Fatal exception
[  151.137001] Kernel Offset: disabled
[  151.140587] ---[ end Kernel panic - not syncing: Fatal exception ]---

[  103.241026] strsep32[2125]: segfault at 4336a7 ip b7df6758 sp bfc73fd0 error 7 in libc-2.26.so[b7d76000+1cd000]
[  103.252505] Code: 1d 83 ec 08 ff 74 24 1c 56 e8 14 d6 ff ff 01 f0 83 c4 10 80 38 00 75 12 c7 03 00 00 00 00 83 c4 04 89 f0 5b 5e c3 8d 74 26 00 <c6> 00 00 83 c0 01 89 03 83 c4 04 89 f0 5b 5e c3 66 90 66 90 66 90
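
When comparing such dumps against a disassembly, it helps to turn a
Code: line back into bytes and find the marked one. The following is a
minimal userspace sketch under that assumption; parse_code_line() is a
hypothetical helper written for this illustration and is not part of the
series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the hex bytes of a "Code:" line into out[] (at most cap bytes).
 * Returns the number of bytes stored and sets *fault_idx to the index of
 * the byte wrapped in <>, or to -1 if no byte is marked.
 */
static size_t parse_code_line(const char *line, uint8_t *out, size_t cap,
			      int *fault_idx)
{
	const char *p = strstr(line, "Code:");
	size_t n = 0;

	*fault_idx = -1;
	/* Skip the timestamp and the "Code:" tag if they are present. */
	p = p ? p + strlen("Code:") : line;

	while (*p && n < cap) {
		char *end;
		unsigned long v;
		int marked = 0;

		while (*p == ' ')
			p++;
		if (*p == '<') {
			marked = 1;
			p++;
		}
		v = strtoul(p, &end, 16);
		if (end == p || v > 0xff)
			break;		/* not a hex byte: stop parsing */
		if (marked)
			*fault_idx = (int)n;
		out[n++] = (uint8_t)v;
		p = end;
		if (*p == '>')
			p++;
	}
	return n;
}
```

Fed the strsep32 Code: line above, the marked byte comes out at index
42, i.e. 64 * 2 / 3 with integer division, which is the prologue length
used from patch 4 on.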

-- 
2.13.0


* [PATCH 1/9] x86/dumpstack: Remove code_bytes
  2018-04-17 16:11 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v3 Borislav Petkov
@ 2018-04-17 16:11 ` Borislav Petkov
  2018-04-26 14:18   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
  2018-04-17 16:11 ` [PATCH 2/9] x86/dumpstack: Unexport oops_begin() Borislav Petkov
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2018-04-17 16:11 UTC (permalink / raw)
  To: X86 ML
  Cc: Andy Lutomirski, Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, LKML

From: Borislav Petkov <bp@suse.de>

This was added by

  86c418374223 ("[PATCH] i386: add option to show more code in oops reports")

a long time ago, but experience shows that 64 instruction bytes are plenty
when deciphering an oops. So get rid of it.

Removing it will simplify further enhancements to the opcodes dumping
machinery coming in the following patches.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 Documentation/admin-guide/kernel-parameters.txt |  5 -----
 arch/x86/kernel/dumpstack.c                     | 27 ++++---------------------
 2 files changed, 4 insertions(+), 28 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 11fc28ecdb6d..47aa554e41b7 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -587,11 +587,6 @@
 			Sets the size of memory pool for coherent, atomic dma
 			allocations, by default set to 256K.
 
-	code_bytes	[X86] How many bytes of object code to print
-			in an oops report.
-			Range: 0 - 8192
-			Default: 64
-
 	com20020=	[HW,NET] ARCnet - COM20020 chipset
 			Format:
 			<io>[,<irq>[,<nodeID>[,<backplane>[,<ckp>[,<timeout>]]]]]
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 18fa9d74c182..593db796374d 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -22,9 +22,10 @@
 #include <asm/stacktrace.h>
 #include <asm/unwind.h>
 
+#define OPCODE_BUFSIZE 64
+
 int panic_on_unrecovered_nmi;
 int panic_on_io_nmi;
-static unsigned int code_bytes = 64;
 static int die_counter;
 
 bool in_task_stack(unsigned long *stack, struct task_struct *task,
@@ -356,26 +357,6 @@ void die(const char *str, struct pt_regs *regs, long err)
 	oops_end(flags, regs, sig);
 }
 
-static int __init code_bytes_setup(char *s)
-{
-	ssize_t ret;
-	unsigned long val;
-
-	if (!s)
-		return -EINVAL;
-
-	ret = kstrtoul(s, 0, &val);
-	if (ret)
-		return ret;
-
-	code_bytes = val;
-	if (code_bytes > 8192)
-		code_bytes = 8192;
-
-	return 1;
-}
-__setup("code_bytes=", code_bytes_setup);
-
 void show_regs(struct pt_regs *regs)
 {
 	bool all = true;
@@ -393,8 +374,8 @@ void show_regs(struct pt_regs *regs)
 	 * time of the fault..
 	 */
 	if (!user_mode(regs)) {
-		unsigned int code_prologue = code_bytes * 43 / 64;
-		unsigned int code_len = code_bytes;
+		unsigned int code_prologue = OPCODE_BUFSIZE * 43 / 64;
+		unsigned int code_len = OPCODE_BUFSIZE;
 		unsigned char c;
 		u8 *ip;
 
-- 
2.13.0


* [PATCH 2/9] x86/dumpstack: Unexport oops_begin()
  2018-04-17 16:11 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v3 Borislav Petkov
  2018-04-17 16:11 ` [PATCH 1/9] x86/dumpstack: Remove code_bytes Borislav Petkov
@ 2018-04-17 16:11 ` Borislav Petkov
  2018-04-26 14:19   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
  2018-04-17 16:11 ` [PATCH 3/9] x86/dumpstack: Carve out Code: dumping into a function Borislav Petkov
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2018-04-17 16:11 UTC (permalink / raw)
  To: X86 ML
  Cc: Andy Lutomirski, Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, LKML

From: Borislav Petkov <bp@suse.de>

The only user outside of arch/ is not a module since

  86cd47334b00 ("ACPI, APEI, GHES, Prevent GHES to be built as module")

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/dumpstack.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 593db796374d..579455c2b91e 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -268,7 +268,6 @@ unsigned long oops_begin(void)
 	bust_spinlocks(1);
 	return flags;
 }
-EXPORT_SYMBOL_GPL(oops_begin);
 NOKPROBE_SYMBOL(oops_begin);
 
 void __noreturn rewind_stack_do_exit(int signr);
-- 
2.13.0


* [PATCH 3/9] x86/dumpstack: Carve out Code: dumping into a function
  2018-04-17 16:11 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v3 Borislav Petkov
  2018-04-17 16:11 ` [PATCH 1/9] x86/dumpstack: Remove code_bytes Borislav Petkov
  2018-04-17 16:11 ` [PATCH 2/9] x86/dumpstack: Unexport oops_begin() Borislav Petkov
@ 2018-04-17 16:11 ` Borislav Petkov
  2018-04-26 14:19   ` [tip:x86/cleanups] x86/dumpstack: Carve out code-dumping " tip-bot for Borislav Petkov
  2018-04-17 16:11 ` [PATCH 4/9] x86/dumpstack: Improve opcodes dumping in the Code: section Borislav Petkov
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2018-04-17 16:11 UTC (permalink / raw)
  To: X86 ML
  Cc: Andy Lutomirski, Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, LKML

From: Borislav Petkov <bp@suse.de>

No functionality change, carve it out into a separate function for later
changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/dumpstack.c | 57 ++++++++++++++++++++++++---------------------
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 579455c2b91e..eb9d6c00a52f 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -70,6 +70,35 @@ static void printk_stack_address(unsigned long address, int reliable,
 	printk("%s %s%pB\n", log_lvl, reliable ? "" : "? ", (void *)address);
 }
 
+static void show_opcodes(u8 *rip)
+{
+	unsigned int code_prologue = OPCODE_BUFSIZE * 43 / 64;
+	unsigned int code_len = OPCODE_BUFSIZE;
+	unsigned char c;
+	u8 *ip;
+	int i;
+
+	printk(KERN_DEFAULT "Code: ");
+
+	ip = (u8 *)rip - code_prologue;
+	if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) {
+		/* try starting at IP */
+		ip = (u8 *)rip;
+		code_len = code_len - code_prologue + 1;
+	}
+	for (i = 0; i < code_len; i++, ip++) {
+		if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) {
+			pr_cont(" Bad RIP value.");
+			break;
+		}
+		if (ip == (u8 *)rip)
+			pr_cont("<%02x> ", c);
+		else
+			pr_cont("%02x ", c);
+	}
+	pr_cont("\n");
+}
+
 void show_iret_regs(struct pt_regs *regs)
 {
 	printk(KERN_DEFAULT "RIP: %04x:%pS\n", (int)regs->cs, (void *)regs->ip);
@@ -359,7 +388,6 @@ void die(const char *str, struct pt_regs *regs, long err)
 void show_regs(struct pt_regs *regs)
 {
 	bool all = true;
-	int i;
 
 	show_regs_print_info(KERN_DEFAULT);
 
@@ -373,32 +401,7 @@ void show_regs(struct pt_regs *regs)
 	 * time of the fault..
 	 */
 	if (!user_mode(regs)) {
-		unsigned int code_prologue = OPCODE_BUFSIZE * 43 / 64;
-		unsigned int code_len = OPCODE_BUFSIZE;
-		unsigned char c;
-		u8 *ip;
-
 		show_trace_log_lvl(current, regs, NULL, KERN_DEFAULT);
-
-		printk(KERN_DEFAULT "Code: ");
-
-		ip = (u8 *)regs->ip - code_prologue;
-		if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) {
-			/* try starting at IP */
-			ip = (u8 *)regs->ip;
-			code_len = code_len - code_prologue + 1;
-		}
-		for (i = 0; i < code_len; i++, ip++) {
-			if (ip < (u8 *)PAGE_OFFSET ||
-					probe_kernel_address(ip, c)) {
-				pr_cont(" Bad RIP value.");
-				break;
-			}
-			if (ip == (u8 *)regs->ip)
-				pr_cont("<%02x> ", c);
-			else
-				pr_cont("%02x ", c);
-		}
+		show_opcodes((u8 *)regs->ip);
 	}
-	pr_cont("\n");
 }
-- 
2.13.0
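
The window selection that show_opcodes() inherits in this patch can be
modelled in userspace. The following is a sketch only: the readable()
callback is a hypothetical stand-in for the PAGE_OFFSET comparison plus
probe_kernel_address(), and struct code_window and the two example
predicates exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct code_window {
	uintptr_t start;	/* first address to dump */
	unsigned int len;	/* number of bytes to dump */
};

/*
 * Start code_prologue bytes before rip. If that address is not readable,
 * start at rip itself and shorten the dump accordingly.
 */
static struct code_window pick_window(uintptr_t rip, unsigned int bufsize,
				      bool (*readable)(uintptr_t addr))
{
	unsigned int prologue = bufsize * 43 / 64;
	struct code_window w = { rip - prologue, bufsize };

	if (!readable(w.start)) {
		/* try starting at IP */
		w.start = rip;
		w.len = bufsize - prologue + 1;
	}
	return w;
}

/* Example predicates for trying the helper out. */
static bool all_readable(uintptr_t addr)
{
	(void)addr;
	return true;
}

static bool readable_from_0x1000(uintptr_t addr)
{
	return addr >= 0x1000;
}
```

With a 64-byte buffer the prologue is 43 bytes, so the fallback case
dumps 64 - 43 + 1 = 22 bytes starting at rip.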


* [PATCH 4/9] x86/dumpstack: Improve opcodes dumping in the Code: section
  2018-04-17 16:11 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v3 Borislav Petkov
                   ` (2 preceding siblings ...)
  2018-04-17 16:11 ` [PATCH 3/9] x86/dumpstack: Carve out Code: dumping into a function Borislav Petkov
@ 2018-04-17 16:11 ` Borislav Petkov
  2018-04-26 14:20   ` [tip:x86/cleanups] x86/dumpstack: Improve opcodes dumping in the code section tip-bot for Borislav Petkov
  2018-04-17 16:11 ` [PATCH 5/9] x86/dumpstack: Add loglevel argument to show_opcodes() Borislav Petkov
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2018-04-17 16:11 UTC (permalink / raw)
  To: X86 ML
  Cc: Andy Lutomirski, Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, LKML

From: Borislav Petkov <bp@suse.de>

The code used to iterate byte-by-byte over the bytes around RIP and that
is expensive: disabling pagefaults around it, copy_from_user, etc...

Make it read the whole buffer of OPCODE_BUFSIZE size in one go. Use a
fixed-size 64-byte buffer on the stack so that concurrent show_opcodes()
calls do not interleave in the output, even though in the majority of
cases we sync on die_lock. The #PF path is the exception: it doesn't.

Also, do the PAGE_OFFSET check outside of the function because the
latter will be reused in other contexts.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/dumpstack.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index eb9d6c00a52f..1d6698b54527 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -72,29 +72,24 @@ static void printk_stack_address(unsigned long address, int reliable,
 
 static void show_opcodes(u8 *rip)
 {
-	unsigned int code_prologue = OPCODE_BUFSIZE * 43 / 64;
-	unsigned int code_len = OPCODE_BUFSIZE;
-	unsigned char c;
+	unsigned int code_prologue = OPCODE_BUFSIZE * 2 / 3;
+	u8 opcodes[OPCODE_BUFSIZE];
 	u8 *ip;
 	int i;
 
 	printk(KERN_DEFAULT "Code: ");
 
 	ip = (u8 *)rip - code_prologue;
-	if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) {
-		/* try starting at IP */
-		ip = (u8 *)rip;
-		code_len = code_len - code_prologue + 1;
+	if (probe_kernel_read(opcodes, ip, OPCODE_BUFSIZE)) {
+		pr_cont("Bad RIP value.\n");
+		return;
 	}
-	for (i = 0; i < code_len; i++, ip++) {
-		if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) {
-			pr_cont(" Bad RIP value.");
-			break;
-		}
-		if (ip == (u8 *)rip)
-			pr_cont("<%02x> ", c);
+
+	for (i = 0; i < OPCODE_BUFSIZE; i++, ip++) {
+		if (ip == rip)
+			pr_cont("<%02x> ", opcodes[i]);
 		else
-			pr_cont("%02x ", c);
+			pr_cont("%02x ", opcodes[i]);
 	}
 	pr_cont("\n");
 }
@@ -402,6 +397,10 @@ void show_regs(struct pt_regs *regs)
 	 */
 	if (!user_mode(regs)) {
 		show_trace_log_lvl(current, regs, NULL, KERN_DEFAULT);
-		show_opcodes((u8 *)regs->ip);
+
+		if (regs->ip < PAGE_OFFSET)
+			printk(KERN_DEFAULT "Code: Bad RIP value.\n");
+		else
+			show_opcodes((u8 *)regs->ip);
 	}
 }
-- 
2.13.0
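
Once the bytes have been copied in one go, formatting is a simple loop
over the local buffer. The following userspace sketch mirrors that loop
under stated assumptions: format_opcodes() and CODE_PROLOGUE are names
invented for this illustration, and snprintf() into a caller-supplied
string stands in for the pr_cont() calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define OPCODE_BUFSIZE	64
/* Integer division: 64 * 2 / 3 == 42 bytes shown before the marked byte. */
#define CODE_PROLOGUE	(OPCODE_BUFSIZE * 2 / 3)

/*
 * Format an already-copied opcode buffer the way the Code: line looks:
 * every byte as two hex digits followed by a space, with the byte at
 * index code_prologue wrapped in <>.
 * Returns the number of characters written, or -1 if out is too small.
 */
static int format_opcodes(const uint8_t *opcodes, size_t len,
			  size_t code_prologue, char *out, size_t outsz)
{
	size_t i, pos = 0;

	for (i = 0; i < len; i++) {
		int n = snprintf(out + pos, outsz - pos,
				 i == code_prologue ? "<%02x> " : "%02x ",
				 opcodes[i]);

		if (n < 0 || (size_t)n >= outsz - pos)
			return -1;
		pos += (size_t)n;
	}
	return (int)pos;
}
```

Because the whole line is assembled from one private buffer, there is a
single read of the code bytes instead of one probe per byte.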


* [PATCH 5/9] x86/dumpstack: Add loglevel argument to show_opcodes()
  2018-04-17 16:11 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v3 Borislav Petkov
                   ` (3 preceding siblings ...)
  2018-04-17 16:11 ` [PATCH 4/9] x86/dumpstack: Improve opcodes dumping in the Code: section Borislav Petkov
@ 2018-04-17 16:11 ` Borislav Petkov
  2018-04-26 14:20   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
  2018-04-17 16:11 ` [PATCH 6/9] x86/fault: Dump user opcode bytes on fatal faults Borislav Petkov
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2018-04-17 16:11 UTC (permalink / raw)
  To: X86 ML
  Cc: Andy Lutomirski, Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, LKML

From: Borislav Petkov <bp@suse.de>

Will be used in the next patch.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/stacktrace.h | 1 +
 arch/x86/kernel/dumpstack.c       | 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 133d9425fced..0630eeb18bbc 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -111,4 +111,5 @@ static inline unsigned long caller_frame_pointer(void)
 	return (unsigned long)frame;
 }
 
+void show_opcodes(u8 *rip, const char *loglvl);
 #endif /* _ASM_X86_STACKTRACE_H */
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 1d6698b54527..1592d0c3ebb5 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -70,14 +70,14 @@ static void printk_stack_address(unsigned long address, int reliable,
 	printk("%s %s%pB\n", log_lvl, reliable ? "" : "? ", (void *)address);
 }
 
-static void show_opcodes(u8 *rip)
+void show_opcodes(u8 *rip, const char *loglvl)
 {
 	unsigned int code_prologue = OPCODE_BUFSIZE * 2 / 3;
 	u8 opcodes[OPCODE_BUFSIZE];
 	u8 *ip;
 	int i;
 
-	printk(KERN_DEFAULT "Code: ");
+	printk("%sCode: ", loglvl);
 
 	ip = (u8 *)rip - code_prologue;
 	if (probe_kernel_read(opcodes, ip, OPCODE_BUFSIZE)) {
@@ -401,6 +401,6 @@ void show_regs(struct pt_regs *regs)
 		if (regs->ip < PAGE_OFFSET)
 			printk(KERN_DEFAULT "Code: Bad RIP value.\n");
 		else
-			show_opcodes((u8 *)regs->ip);
+			show_opcodes((u8 *)regs->ip, KERN_DEFAULT);
 	}
 }
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 6/9] x86/fault: Dump user opcode bytes on fatal faults
  2018-04-17 16:11 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v3 Borislav Petkov
                   ` (4 preceding siblings ...)
  2018-04-17 16:11 ` [PATCH 5/9] x86/dumpstack: Add loglevel argument to show_opcodes() Borislav Petkov
@ 2018-04-17 16:11 ` Borislav Petkov
  2018-04-26 14:21   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
  2018-04-17 16:11 ` [PATCH 7/9] x86/dumpstack: Add a show_ip() function Borislav Petkov
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2018-04-17 16:11 UTC (permalink / raw)
  To: X86 ML
  Cc: Andy Lutomirski, Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, LKML

From: Borislav Petkov <bp@suse.de>

Sometimes it is useful to see which user opcode bytes RIP points to
when a fault happens: be it to rule out RIP corruption, or to dump info
early during boot, when doing core dumps is impossible due to not having
a writable filesystem yet.

It is also useful when debugging an issue without access to the
executable which caused the fault, and thus without the means to
disassemble it.

That last aspect might have some security implications so
show_unhandled_signals could be revisited for that or a new config
option added.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/mm/fault.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 73bd8c95ac71..a3fd94eff04d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -828,6 +828,8 @@ static inline void
 show_signal_msg(struct pt_regs *regs, unsigned long error_code,
 		unsigned long address, struct task_struct *tsk)
 {
+	const char *loglvl = task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG;
+
 	if (!unhandled_signal(tsk, SIGSEGV))
 		return;
 
@@ -835,13 +837,14 @@ show_signal_msg(struct pt_regs *regs, unsigned long error_code,
 		return;
 
 	printk("%s%s[%d]: segfault at %lx ip %px sp %px error %lx",
-		task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
-		tsk->comm, task_pid_nr(tsk), address,
+		loglvl, tsk->comm, task_pid_nr(tsk), address,
 		(void *)regs->ip, (void *)regs->sp, error_code);
 
 	print_vma_addr(KERN_CONT " in ", regs->ip);
 
 	printk(KERN_CONT "\n");
+
+	show_opcodes((u8 *)regs->ip, loglvl);
 }
 
 static void
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread
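
The loglevel selection in the patch above can be sketched in userspace as
follows. This is only an illustration: SKETCH_INFO, SKETCH_EMERG,
segv_loglvl() and format_segv() are hypothetical names standing in for
KERN_INFO, KERN_EMERG and the printk() call in show_signal_msg(). The point
is that the level is computed once and then reused for both the segfault
line and the following Code: line.

```c
#include <stdio.h>

/* Stand-ins for the kernel's KERN_INFO / KERN_EMERG prefixes (assumed). */
#define SKETCH_INFO  "<6>"
#define SKETCH_EMERG "<0>"

/* PID 1 gets the emergency level, every other task the info level. */
static const char *segv_loglvl(int pid)
{
	return pid > 1 ? SKETCH_INFO : SKETCH_EMERG;
}

/* Hypothetical helper: build the segfault header line into buf. */
static int format_segv(char *buf, size_t len, const char *comm, int pid,
		       unsigned long address)
{
	return snprintf(buf, len, "%s%s[%d]: segfault at %lx",
			segv_loglvl(pid), comm, pid, address);
}
```

A caller would pass the same segv_loglvl(pid) result to whatever prints the
opcode bytes, so that both lines end up at the same level.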

* [PATCH 7/9] x86/dumpstack: Add a show_ip() function
  2018-04-17 16:11 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v3 Borislav Petkov
                   ` (5 preceding siblings ...)
  2018-04-17 16:11 ` [PATCH 6/9] x86/fault: Dump user opcode bytes on fatal faults Borislav Petkov
@ 2018-04-17 16:11 ` Borislav Petkov
  2018-04-26 14:21   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
  2018-04-17 16:11 ` [PATCH 8/9] x86/dumpstack: Save first regs set for the executive summary Borislav Petkov
  2018-04-17 16:11 ` [PATCH 9/9] x86/dumpstack: Explain the reasoning for the prologue and buffer size Borislav Petkov
  8 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2018-04-17 16:11 UTC (permalink / raw)
  To: X86 ML
  Cc: Andy Lutomirski, Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, LKML

From: Borislav Petkov <bp@suse.de>

... which shows the Instruction Pointer along with the insn bytes around
it. Use it whenever we print rIP. Drop the rIP < PAGE_OFFSET check since
our probe_kernel_read() can handle any address properly.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/stacktrace.h |  1 +
 arch/x86/kernel/dumpstack.c       | 23 +++++++++++++----------
 arch/x86/kernel/process_32.c      |  8 +++-----
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 0630eeb18bbc..b6dc698f992a 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -112,4 +112,5 @@ static inline unsigned long caller_frame_pointer(void)
 }
 
 void show_opcodes(u8 *rip, const char *loglvl);
+void show_ip(struct pt_regs *regs, const char *loglvl);
 #endif /* _ASM_X86_STACKTRACE_H */
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 1592d0c3ebb5..82da808b5c36 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -94,9 +94,19 @@ void show_opcodes(u8 *rip, const char *loglvl)
 	pr_cont("\n");
 }
 
+void show_ip(struct pt_regs *regs, const char *loglvl)
+{
+#ifdef CONFIG_X86_32
+	printk("%sEIP: %pS\n", loglvl, (void *)regs->ip);
+#else
+	printk("%sRIP: %04x:%pS\n", loglvl, (int)regs->cs, (void *)regs->ip);
+#endif
+	show_opcodes((u8 *)regs->ip, loglvl);
+}
+
 void show_iret_regs(struct pt_regs *regs)
 {
-	printk(KERN_DEFAULT "RIP: %04x:%pS\n", (int)regs->cs, (void *)regs->ip);
+	show_ip(regs, KERN_DEFAULT);
 	printk(KERN_DEFAULT "RSP: %04x:%016lx EFLAGS: %08lx", (int)regs->ss,
 		regs->sp, regs->flags);
 }
@@ -392,15 +402,8 @@ void show_regs(struct pt_regs *regs)
 	__show_regs(regs, all);
 
 	/*
-	 * When in-kernel, we also print out the stack and code at the
-	 * time of the fault..
+	 * When in-kernel, we also print out the stack at the time of the fault..
 	 */
-	if (!user_mode(regs)) {
+	if (!user_mode(regs))
 		show_trace_log_lvl(current, regs, NULL, KERN_DEFAULT);
-
-		if (regs->ip < PAGE_OFFSET)
-			printk(KERN_DEFAULT "Code: Bad RIP value.\n");
-		else
-			show_opcodes((u8 *)regs->ip, KERN_DEFAULT);
-	}
 }
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 5224c6099184..0ae659de21eb 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -76,16 +76,14 @@ void __show_regs(struct pt_regs *regs, int all)
 		savesegment(gs, gs);
 	}
 
-	printk(KERN_DEFAULT "EIP: %pS\n", (void *)regs->ip);
-	printk(KERN_DEFAULT "EFLAGS: %08lx CPU: %d\n", regs->flags,
-		raw_smp_processor_id());
+	show_ip(regs, KERN_DEFAULT);
 
 	printk(KERN_DEFAULT "EAX: %08lx EBX: %08lx ECX: %08lx EDX: %08lx\n",
 		regs->ax, regs->bx, regs->cx, regs->dx);
 	printk(KERN_DEFAULT "ESI: %08lx EDI: %08lx EBP: %08lx ESP: %08lx\n",
 		regs->si, regs->di, regs->bp, sp);
-	printk(KERN_DEFAULT " DS: %04x ES: %04x FS: %04x GS: %04x SS: %04x\n",
-	       (u16)regs->ds, (u16)regs->es, (u16)regs->fs, gs, ss);
+	printk(KERN_DEFAULT "DS: %04x ES: %04x FS: %04x GS: %04x SS: %04x EFLAGS: %08lx\n",
+	       (u16)regs->ds, (u16)regs->es, (u16)regs->fs, gs, ss, regs->flags);
 
 	if (!all)
 		return;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 8/9] x86/dumpstack: Save first regs set for the executive summary
  2018-04-17 16:11 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v3 Borislav Petkov
                   ` (6 preceding siblings ...)
  2018-04-17 16:11 ` [PATCH 7/9] x86/dumpstack: Add a show_ip() function Borislav Petkov
@ 2018-04-17 16:11 ` Borislav Petkov
  2018-04-26 14:22   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
  2018-04-17 16:11 ` [PATCH 9/9] x86/dumpstack: Explain the reasoning for the prologue and buffer size Borislav Petkov
  8 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2018-04-17 16:11 UTC (permalink / raw)
  To: X86 ML
  Cc: Andy Lutomirski, Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, LKML

From: Borislav Petkov <bp@suse.de>

Save the regs set when we call __die() for the first time and print it
in oops_end().

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/dumpstack.c | 32 ++++++++++++--------------------
 1 file changed, 12 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 82da808b5c36..ee344030fd0a 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -28,6 +28,8 @@ int panic_on_unrecovered_nmi;
 int panic_on_io_nmi;
 static int die_counter;
 
+static struct pt_regs exec_summary_regs;
+
 bool in_task_stack(unsigned long *stack, struct task_struct *task,
 		   struct stack_info *info)
 {
@@ -321,6 +323,9 @@ void oops_end(unsigned long flags, struct pt_regs *regs, int signr)
 	raw_local_irq_restore(flags);
 	oops_exit();
 
+	/* Executive summary in case the oops scrolled away */
+	__show_regs(&exec_summary_regs, true);
+
 	if (!signr)
 		return;
 	if (in_interrupt())
@@ -339,10 +344,10 @@ NOKPROBE_SYMBOL(oops_end);
 
 int __die(const char *str, struct pt_regs *regs, long err)
 {
-#ifdef CONFIG_X86_32
-	unsigned short ss;
-	unsigned long sp;
-#endif
+	/* Save the regs of the first oops for the executive summary later. */
+	if (!die_counter)
+		exec_summary_regs = *regs;
+
 	printk(KERN_DEFAULT
 	       "%s: %04lx [#%d]%s%s%s%s%s\n", str, err & 0xffff, ++die_counter,
 	       IS_ENABLED(CONFIG_PREEMPT) ? " PREEMPT"         : "",
@@ -352,26 +357,13 @@ int __die(const char *str, struct pt_regs *regs, long err)
 	       IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION) ?
 	       (boot_cpu_has(X86_FEATURE_PTI) ? " PTI" : " NOPTI") : "");
 
+	show_regs(regs);
+	print_modules();
+
 	if (notify_die(DIE_OOPS, str, regs, err,
 			current->thread.trap_nr, SIGSEGV) == NOTIFY_STOP)
 		return 1;
 
-	print_modules();
-	show_regs(regs);
-#ifdef CONFIG_X86_32
-	if (user_mode(regs)) {
-		sp = regs->sp;
-		ss = regs->ss;
-	} else {
-		sp = kernel_stack_pointer(regs);
-		savesegment(ss, ss);
-	}
-	printk(KERN_EMERG "EIP: %pS SS:ESP: %04x:%08lx\n",
-	       (void *)regs->ip, ss, sp);
-#else
-	/* Executive summary in case the oops scrolled away */
-	printk(KERN_ALERT "RIP: %pS RSP: %016lx\n", (void *)regs->ip, regs->sp);
-#endif
 	return 0;
 }
 NOKPROBE_SYMBOL(__die);
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread
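
The "save the first regs set" idea from the patch above, reduced to a
userspace sketch. struct regs_snapshot, record_die() and summary_regs() are
made-up names for illustration only; in the patch the roles are played by
struct pt_regs, __die() and oops_end(). The sketch assumes the callers are
serialized and does no locking of its own.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct pt_regs. */
struct regs_snapshot {
	unsigned long ip;
	unsigned long sp;
};

static struct regs_snapshot first_regs;
static int die_count;

/* Remember only the registers of the first failure; later ones just count. */
static void record_die(const struct regs_snapshot *regs)
{
	if (!die_count)
		first_regs = *regs;

	die_count++;
}

/* Registers to show in the summary, or NULL if nothing has died yet. */
static const struct regs_snapshot *summary_regs(void)
{
	return die_count ? &first_regs : NULL;
}
```

Copying the structure by value matters here: the original register set may
be gone by the time the summary is printed.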

* [PATCH 9/9] x86/dumpstack: Explain the reasoning for the prologue and buffer size
  2018-04-17 16:11 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v3 Borislav Petkov
                   ` (7 preceding siblings ...)
  2018-04-17 16:11 ` [PATCH 8/9] x86/dumpstack: Save first regs set for the executive summary Borislav Petkov
@ 2018-04-17 16:11 ` Borislav Petkov
  2018-04-26 14:22   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
  8 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2018-04-17 16:11 UTC (permalink / raw)
  To: X86 ML
  Cc: Andy Lutomirski, Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, LKML

From: Borislav Petkov <bp@suse.de>

The reasoning behind the number of opcode bytes dumped and the prologue
length isn't very clear, so let's write down some of the reasons why it
is done the way it is.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/dumpstack.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index ee344030fd0a..666a284116ac 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -72,6 +72,25 @@ static void printk_stack_address(unsigned long address, int reliable,
 	printk("%s %s%pB\n", log_lvl, reliable ? "" : "? ", (void *)address);
 }
 
+/*
+ * There are a couple of reasons for the 2/3rd prologue, courtesy of Linus:
+ *
+ * In cases where we don't have the exact kernel image (if we did, we could
+ * simply disassemble it and navigate to the RIP), the purpose of the bigger
+ * prologue is to have more context and to be able to correlate the code from
+ * the different toolchains better.
+ *
+ * In addition, it helps in recreating the register allocation of the failing
+ * kernel and thus in making sense of the register dump.
+ *
+ * What is more, the additional complication of a variable-length insn arch like
+ * x86 warrants having a longer byte sequence before rIP so that the
+ * disassembler can "sync" up properly and find instruction boundaries when
+ * decoding the opcode bytes.
+ *
+ * Thus, the 2/3rds prologue and 64-byte OPCODE_BUFSIZE are just a random
+ * guesstimate in an attempt to achieve all of the above.
+ */
 void show_opcodes(u8 *rip, const char *loglvl)
 {
 	unsigned int code_prologue = OPCODE_BUFSIZE * 2 / 3;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread
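
As a worked example of the 2/3 split described in the comment above, the
following sketch computes how the 64 bytes are divided around rIP using the
same integer arithmetic as show_opcodes(). struct opcode_window and
opcode_window() are hypothetical names used only here.

```c
#define OPCODE_BUFSIZE 64

/* Hypothetical helper type: how a dump buffer is split around rIP. */
struct opcode_window {
	unsigned int prologue;	/* bytes dumped before rIP */
	unsigned int tail;	/* bytes dumped from rIP onwards, rIP included */
};

static struct opcode_window opcode_window(unsigned int bufsize)
{
	struct opcode_window w;

	/* Integer division truncates: 64 * 2 / 3 = 128 / 3 = 42. */
	w.prologue = bufsize * 2 / 3;
	w.tail = bufsize - w.prologue;

	return w;
}
```

For OPCODE_BUFSIZE = 64 this gives 42 bytes before rIP and 22 bytes starting
at rIP, so the marked byte sits at index 42 of the buffer. The earlier
43 / 64 factor would have put it at index 43.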

* [tip:x86/cleanups] x86/dumpstack: Remove code_bytes
  2018-04-17 16:11 ` [PATCH 1/9] x86/dumpstack: Remove code_bytes Borislav Petkov
@ 2018-04-26 14:18   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Borislav Petkov @ 2018-04-26 14:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, jpoimboe, mingo, peterz, luto, hpa, torvalds, bp, tglx

Commit-ID:  5d12f0424edf01ccd8abbcba1c7d45fe0b23c779
Gitweb:     https://git.kernel.org/tip/5d12f0424edf01ccd8abbcba1c7d45fe0b23c779
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 17 Apr 2018 18:11:16 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 26 Apr 2018 16:15:25 +0200

x86/dumpstack: Remove code_bytes

This was added by

  86c418374223 ("[PATCH] i386: add option to show more code in oops reports")

a long time ago but experience shows that 64 instruction bytes are plenty
when deciphering an oops. So get rid of it.

Removing it will simplify further enhancements to the opcodes dumping
machinery coming in the following patches.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/20180417161124.5294-2-bp@alien8.de

---
 Documentation/admin-guide/kernel-parameters.txt |  5 -----
 arch/x86/kernel/dumpstack.c                     | 27 ++++---------------------
 2 files changed, 4 insertions(+), 28 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 11fc28ecdb6d..47aa554e41b7 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -587,11 +587,6 @@
 			Sets the size of memory pool for coherent, atomic dma
 			allocations, by default set to 256K.
 
-	code_bytes	[X86] How many bytes of object code to print
-			in an oops report.
-			Range: 0 - 8192
-			Default: 64
-
 	com20020=	[HW,NET] ARCnet - COM20020 chipset
 			Format:
 			<io>[,<irq>[,<nodeID>[,<backplane>[,<ckp>[,<timeout>]]]]]
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 18fa9d74c182..593db796374d 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -22,9 +22,10 @@
 #include <asm/stacktrace.h>
 #include <asm/unwind.h>
 
+#define OPCODE_BUFSIZE 64
+
 int panic_on_unrecovered_nmi;
 int panic_on_io_nmi;
-static unsigned int code_bytes = 64;
 static int die_counter;
 
 bool in_task_stack(unsigned long *stack, struct task_struct *task,
@@ -356,26 +357,6 @@ void die(const char *str, struct pt_regs *regs, long err)
 	oops_end(flags, regs, sig);
 }
 
-static int __init code_bytes_setup(char *s)
-{
-	ssize_t ret;
-	unsigned long val;
-
-	if (!s)
-		return -EINVAL;
-
-	ret = kstrtoul(s, 0, &val);
-	if (ret)
-		return ret;
-
-	code_bytes = val;
-	if (code_bytes > 8192)
-		code_bytes = 8192;
-
-	return 1;
-}
-__setup("code_bytes=", code_bytes_setup);
-
 void show_regs(struct pt_regs *regs)
 {
 	bool all = true;
@@ -393,8 +374,8 @@ void show_regs(struct pt_regs *regs)
 	 * time of the fault..
 	 */
 	if (!user_mode(regs)) {
-		unsigned int code_prologue = code_bytes * 43 / 64;
-		unsigned int code_len = code_bytes;
+		unsigned int code_prologue = OPCODE_BUFSIZE * 43 / 64;
+		unsigned int code_len = OPCODE_BUFSIZE;
 		unsigned char c;
 		u8 *ip;
 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/cleanups] x86/dumpstack: Unexport oops_begin()
  2018-04-17 16:11 ` [PATCH 2/9] x86/dumpstack: Unexport oops_begin() Borislav Petkov
@ 2018-04-26 14:19   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Borislav Petkov @ 2018-04-26 14:19 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, mingo, tglx, luto, peterz, jpoimboe, hpa, torvalds, bp

Commit-ID:  5df61707f0bdf8dce714a14806740e6abf2114c7
Gitweb:     https://git.kernel.org/tip/5df61707f0bdf8dce714a14806740e6abf2114c7
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 17 Apr 2018 18:11:17 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 26 Apr 2018 16:15:26 +0200

x86/dumpstack: Unexport oops_begin()

The only user outside of arch/ is not a module since

  86cd47334b00 ("ACPI, APEI, GHES, Prevent GHES to be built as module")

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/20180417161124.5294-3-bp@alien8.de

---
 arch/x86/kernel/dumpstack.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 593db796374d..579455c2b91e 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -268,7 +268,6 @@ unsigned long oops_begin(void)
 	bust_spinlocks(1);
 	return flags;
 }
-EXPORT_SYMBOL_GPL(oops_begin);
 NOKPROBE_SYMBOL(oops_begin);
 
 void __noreturn rewind_stack_do_exit(int signr);

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/cleanups] x86/dumpstack: Carve out code-dumping into a function
  2018-04-17 16:11 ` [PATCH 3/9] x86/dumpstack: Carve out Code: dumping into a function Borislav Petkov
@ 2018-04-26 14:19   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Borislav Petkov @ 2018-04-26 14:19 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, bp, hpa, linux-kernel, jpoimboe, mingo, peterz, luto, torvalds

Commit-ID:  f0a1d7c11c3ebe2f601b448d13e7fbc3a0364a03
Gitweb:     https://git.kernel.org/tip/f0a1d7c11c3ebe2f601b448d13e7fbc3a0364a03
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 17 Apr 2018 18:11:18 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 26 Apr 2018 16:15:26 +0200

x86/dumpstack: Carve out code-dumping into a function

No functional change: carve the code dumping out into a separate function
for later changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/20180417161124.5294-4-bp@alien8.de

---
 arch/x86/kernel/dumpstack.c | 57 ++++++++++++++++++++++++---------------------
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 579455c2b91e..eb9d6c00a52f 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -70,6 +70,35 @@ static void printk_stack_address(unsigned long address, int reliable,
 	printk("%s %s%pB\n", log_lvl, reliable ? "" : "? ", (void *)address);
 }
 
+static void show_opcodes(u8 *rip)
+{
+	unsigned int code_prologue = OPCODE_BUFSIZE * 43 / 64;
+	unsigned int code_len = OPCODE_BUFSIZE;
+	unsigned char c;
+	u8 *ip;
+	int i;
+
+	printk(KERN_DEFAULT "Code: ");
+
+	ip = (u8 *)rip - code_prologue;
+	if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) {
+		/* try starting at IP */
+		ip = (u8 *)rip;
+		code_len = code_len - code_prologue + 1;
+	}
+	for (i = 0; i < code_len; i++, ip++) {
+		if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) {
+			pr_cont(" Bad RIP value.");
+			break;
+		}
+		if (ip == (u8 *)rip)
+			pr_cont("<%02x> ", c);
+		else
+			pr_cont("%02x ", c);
+	}
+	pr_cont("\n");
+}
+
 void show_iret_regs(struct pt_regs *regs)
 {
 	printk(KERN_DEFAULT "RIP: %04x:%pS\n", (int)regs->cs, (void *)regs->ip);
@@ -359,7 +388,6 @@ void die(const char *str, struct pt_regs *regs, long err)
 void show_regs(struct pt_regs *regs)
 {
 	bool all = true;
-	int i;
 
 	show_regs_print_info(KERN_DEFAULT);
 
@@ -373,32 +401,7 @@ void show_regs(struct pt_regs *regs)
 	 * time of the fault..
 	 */
 	if (!user_mode(regs)) {
-		unsigned int code_prologue = OPCODE_BUFSIZE * 43 / 64;
-		unsigned int code_len = OPCODE_BUFSIZE;
-		unsigned char c;
-		u8 *ip;
-
 		show_trace_log_lvl(current, regs, NULL, KERN_DEFAULT);
-
-		printk(KERN_DEFAULT "Code: ");
-
-		ip = (u8 *)regs->ip - code_prologue;
-		if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) {
-			/* try starting at IP */
-			ip = (u8 *)regs->ip;
-			code_len = code_len - code_prologue + 1;
-		}
-		for (i = 0; i < code_len; i++, ip++) {
-			if (ip < (u8 *)PAGE_OFFSET ||
-					probe_kernel_address(ip, c)) {
-				pr_cont(" Bad RIP value.");
-				break;
-			}
-			if (ip == (u8 *)regs->ip)
-				pr_cont("<%02x> ", c);
-			else
-				pr_cont("%02x ", c);
-		}
+		show_opcodes((u8 *)regs->ip);
 	}
-	pr_cont("\n");
 }

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/cleanups] x86/dumpstack: Improve opcodes dumping in the code section
  2018-04-17 16:11 ` [PATCH 4/9] x86/dumpstack: Improve opcodes dumping in the Code: section Borislav Petkov
@ 2018-04-26 14:20   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Borislav Petkov @ 2018-04-26 14:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, linux-kernel, luto, hpa, jpoimboe, peterz, bp, tglx, torvalds

Commit-ID:  9e4a90fd34445df64a13d136676a31a4dd22aea3
Gitweb:     https://git.kernel.org/tip/9e4a90fd34445df64a13d136676a31a4dd22aea3
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 17 Apr 2018 18:11:19 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 26 Apr 2018 16:15:26 +0200

x86/dumpstack: Improve opcodes dumping in the code section

The code used to iterate byte-by-byte over the bytes around RIP and that
is expensive: disabling pagefaults around it, copy_from_user, etc...

Make it read the whole buffer of OPCODE_BUFSIZE size in one go. Use a
fixed-size 64-byte on-stack buffer so that concurrent show_opcodes()
calls do not interleave in the output, even though in the majority of
cases they are serialized via die_lock. The exception is the #PF path,
which is not serialized that way.

Also, do the PAGE_OFFSET check outside of the function because the latter
will be reused in other contexts.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/20180417161124.5294-5-bp@alien8.de

---
 arch/x86/kernel/dumpstack.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index eb9d6c00a52f..1d6698b54527 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -72,29 +72,24 @@ static void printk_stack_address(unsigned long address, int reliable,
 
 static void show_opcodes(u8 *rip)
 {
-	unsigned int code_prologue = OPCODE_BUFSIZE * 43 / 64;
-	unsigned int code_len = OPCODE_BUFSIZE;
-	unsigned char c;
+	unsigned int code_prologue = OPCODE_BUFSIZE * 2 / 3;
+	u8 opcodes[OPCODE_BUFSIZE];
 	u8 *ip;
 	int i;
 
 	printk(KERN_DEFAULT "Code: ");
 
 	ip = (u8 *)rip - code_prologue;
-	if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) {
-		/* try starting at IP */
-		ip = (u8 *)rip;
-		code_len = code_len - code_prologue + 1;
+	if (probe_kernel_read(opcodes, ip, OPCODE_BUFSIZE)) {
+		pr_cont("Bad RIP value.\n");
+		return;
 	}
-	for (i = 0; i < code_len; i++, ip++) {
-		if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) {
-			pr_cont(" Bad RIP value.");
-			break;
-		}
-		if (ip == (u8 *)rip)
-			pr_cont("<%02x> ", c);
+
+	for (i = 0; i < OPCODE_BUFSIZE; i++, ip++) {
+		if (ip == rip)
+			pr_cont("<%02x> ", opcodes[i]);
 		else
-			pr_cont("%02x ", c);
+			pr_cont("%02x ", opcodes[i]);
 	}
 	pr_cont("\n");
 }
@@ -402,6 +397,10 @@ void show_regs(struct pt_regs *regs)
 	 */
 	if (!user_mode(regs)) {
 		show_trace_log_lvl(current, regs, NULL, KERN_DEFAULT);
-		show_opcodes((u8 *)regs->ip);
+
+		if (regs->ip < PAGE_OFFSET)
+			printk(KERN_DEFAULT "Code: Bad RIP value.\n");
+		else
+			show_opcodes((u8 *)regs->ip);
 	}
 }

^ permalink raw reply related	[flat|nested] 20+ messages in thread
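
The formatting half of the reworked show_opcodes() above can be tried out in
userspace with a sketch like the one below. It assumes the bytes have already
been copied into a local buffer in one go (the job probe_kernel_read() does in
the patch) and only builds the output line. format_opcodes() and its
parameters are hypothetical and not part of the kernel.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format n opcode bytes into out, marking the byte at
 * index rip_off with <..>. Returns the number of characters written (not
 * counting the terminating NUL), or -1 if out is too small.
 */
static int format_opcodes(char *out, size_t out_len,
			  const unsigned char *opcodes, size_t n,
			  size_t rip_off)
{
	size_t pos = 0;
	size_t i;

	if (out_len)
		out[0] = '\0';

	for (i = 0; i < n; i++) {
		int w = snprintf(out + pos, out_len - pos,
				 i == rip_off ? "<%02x> " : "%02x ",
				 opcodes[i]);

		/* snprintf() reports the untruncated length; detect overflow. */
		if (w < 0 || (size_t)w >= out_len - pos)
			return -1;

		pos += (size_t)w;
	}

	return (int)pos;
}
```

Because the whole line is assembled from a private copy of the bytes, a
caller can emit it with a single print call instead of one call per byte.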

* [tip:x86/cleanups] x86/dumpstack: Add loglevel argument to show_opcodes()
  2018-04-17 16:11 ` [PATCH 5/9] x86/dumpstack: Add loglevel argument to show_opcodes() Borislav Petkov
@ 2018-04-26 14:20   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Borislav Petkov @ 2018-04-26 14:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, hpa, torvalds, peterz, bp, mingo, linux-kernel, luto, jpoimboe

Commit-ID:  e8b6f984516b1fcb0ccf4469ca42777c9c2dc76d
Gitweb:     https://git.kernel.org/tip/e8b6f984516b1fcb0ccf4469ca42777c9c2dc76d
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 17 Apr 2018 18:11:20 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 26 Apr 2018 16:15:26 +0200

x86/dumpstack: Add loglevel argument to show_opcodes()

Will be used in the next patch.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/20180417161124.5294-6-bp@alien8.de

---
 arch/x86/include/asm/stacktrace.h | 1 +
 arch/x86/kernel/dumpstack.c       | 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 133d9425fced..0630eeb18bbc 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -111,4 +111,5 @@ static inline unsigned long caller_frame_pointer(void)
 	return (unsigned long)frame;
 }
 
+void show_opcodes(u8 *rip, const char *loglvl);
 #endif /* _ASM_X86_STACKTRACE_H */
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 1d6698b54527..1592d0c3ebb5 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -70,14 +70,14 @@ static void printk_stack_address(unsigned long address, int reliable,
 	printk("%s %s%pB\n", log_lvl, reliable ? "" : "? ", (void *)address);
 }
 
-static void show_opcodes(u8 *rip)
+void show_opcodes(u8 *rip, const char *loglvl)
 {
 	unsigned int code_prologue = OPCODE_BUFSIZE * 2 / 3;
 	u8 opcodes[OPCODE_BUFSIZE];
 	u8 *ip;
 	int i;
 
-	printk(KERN_DEFAULT "Code: ");
+	printk("%sCode: ", loglvl);
 
 	ip = (u8 *)rip - code_prologue;
 	if (probe_kernel_read(opcodes, ip, OPCODE_BUFSIZE)) {
@@ -401,6 +401,6 @@ void show_regs(struct pt_regs *regs)
 		if (regs->ip < PAGE_OFFSET)
 			printk(KERN_DEFAULT "Code: Bad RIP value.\n");
 		else
-			show_opcodes((u8 *)regs->ip);
+			show_opcodes((u8 *)regs->ip, KERN_DEFAULT);
 	}
 }

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/cleanups] x86/fault: Dump user opcode bytes on fatal faults
  2018-04-17 16:11 ` [PATCH 6/9] x86/fault: Dump user opcode bytes on fatal faults Borislav Petkov
@ 2018-04-26 14:21   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Borislav Petkov @ 2018-04-26 14:21 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jpoimboe, bp, torvalds, mingo, tglx, linux-kernel, luto, peterz, hpa

Commit-ID:  ba54d856a9d8a9c56b87e20c88602b7e3cb568fb
Gitweb:     https://git.kernel.org/tip/ba54d856a9d8a9c56b87e20c88602b7e3cb568fb
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 17 Apr 2018 18:11:21 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 26 Apr 2018 16:15:27 +0200

x86/fault: Dump user opcode bytes on fatal faults

Sometimes it is useful to see which user opcode bytes RIP points to
when a fault happens: be it to rule out RIP corruption, to dump info
early during boot, when doing core dumps is impossible due to not having
a writable filesystem yet.

It is also useful when debugging an issue without access to the
executable which caused the fault, and thus without the means to
disassemble it.

That last aspect might have some security implications so
show_unhandled_signals could be revisited for that or a new config option
added.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/20180417161124.5294-7-bp@alien8.de

---
 arch/x86/mm/fault.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 73bd8c95ac71..a3fd94eff04d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -828,6 +828,8 @@ static inline void
 show_signal_msg(struct pt_regs *regs, unsigned long error_code,
 		unsigned long address, struct task_struct *tsk)
 {
+	const char *loglvl = task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG;
+
 	if (!unhandled_signal(tsk, SIGSEGV))
 		return;
 
@@ -835,13 +837,14 @@ show_signal_msg(struct pt_regs *regs, unsigned long error_code,
 		return;
 
 	printk("%s%s[%d]: segfault at %lx ip %px sp %px error %lx",
-		task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
-		tsk->comm, task_pid_nr(tsk), address,
+		loglvl, tsk->comm, task_pid_nr(tsk), address,
 		(void *)regs->ip, (void *)regs->sp, error_code);
 
 	print_vma_addr(KERN_CONT " in ", regs->ip);
 
 	printk(KERN_CONT "\n");
+
+	show_opcodes((u8 *)regs->ip, loglvl);
 }
 
 static void

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/cleanups] x86/dumpstack: Add a show_ip() function
  2018-04-17 16:11 ` [PATCH 7/9] x86/dumpstack: Add a show_ip() function Borislav Petkov
@ 2018-04-26 14:21   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Borislav Petkov @ 2018-04-26 14:21 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, peterz, tglx, bp, hpa, jpoimboe, luto, torvalds, linux-kernel

Commit-ID:  7cccf0725cf7402514e09c52b089430005798b7f
Gitweb:     https://git.kernel.org/tip/7cccf0725cf7402514e09c52b089430005798b7f
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 17 Apr 2018 18:11:22 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 26 Apr 2018 16:15:27 +0200

x86/dumpstack: Add a show_ip() function

... which shows the Instruction Pointer along with the insn bytes around
it. Use it whenever rIP is printed. Drop the rIP < PAGE_OFFSET check since
probe_kernel_read() can handle any address properly.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/20180417161124.5294-8-bp@alien8.de

---
 arch/x86/include/asm/stacktrace.h |  1 +
 arch/x86/kernel/dumpstack.c       | 23 +++++++++++++----------
 arch/x86/kernel/process_32.c      |  8 +++-----
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 0630eeb18bbc..b6dc698f992a 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -112,4 +112,5 @@ static inline unsigned long caller_frame_pointer(void)
 }
 
 void show_opcodes(u8 *rip, const char *loglvl);
+void show_ip(struct pt_regs *regs, const char *loglvl);
 #endif /* _ASM_X86_STACKTRACE_H */
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 1592d0c3ebb5..82da808b5c36 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -94,9 +94,19 @@ void show_opcodes(u8 *rip, const char *loglvl)
 	pr_cont("\n");
 }
 
+void show_ip(struct pt_regs *regs, const char *loglvl)
+{
+#ifdef CONFIG_X86_32
+	printk("%sEIP: %pS\n", loglvl, (void *)regs->ip);
+#else
+	printk("%sRIP: %04x:%pS\n", loglvl, (int)regs->cs, (void *)regs->ip);
+#endif
+	show_opcodes((u8 *)regs->ip, loglvl);
+}
+
 void show_iret_regs(struct pt_regs *regs)
 {
-	printk(KERN_DEFAULT "RIP: %04x:%pS\n", (int)regs->cs, (void *)regs->ip);
+	show_ip(regs, KERN_DEFAULT);
 	printk(KERN_DEFAULT "RSP: %04x:%016lx EFLAGS: %08lx", (int)regs->ss,
 		regs->sp, regs->flags);
 }
@@ -392,15 +402,8 @@ void show_regs(struct pt_regs *regs)
 	__show_regs(regs, all);
 
 	/*
-	 * When in-kernel, we also print out the stack and code at the
-	 * time of the fault..
+	 * When in-kernel, we also print out the stack at the time of the fault..
 	 */
-	if (!user_mode(regs)) {
+	if (!user_mode(regs))
 		show_trace_log_lvl(current, regs, NULL, KERN_DEFAULT);
-
-		if (regs->ip < PAGE_OFFSET)
-			printk(KERN_DEFAULT "Code: Bad RIP value.\n");
-		else
-			show_opcodes((u8 *)regs->ip, KERN_DEFAULT);
-	}
 }
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 5224c6099184..0ae659de21eb 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -76,16 +76,14 @@ void __show_regs(struct pt_regs *regs, int all)
 		savesegment(gs, gs);
 	}
 
-	printk(KERN_DEFAULT "EIP: %pS\n", (void *)regs->ip);
-	printk(KERN_DEFAULT "EFLAGS: %08lx CPU: %d\n", regs->flags,
-		raw_smp_processor_id());
+	show_ip(regs, KERN_DEFAULT);
 
 	printk(KERN_DEFAULT "EAX: %08lx EBX: %08lx ECX: %08lx EDX: %08lx\n",
 		regs->ax, regs->bx, regs->cx, regs->dx);
 	printk(KERN_DEFAULT "ESI: %08lx EDI: %08lx EBP: %08lx ESP: %08lx\n",
 		regs->si, regs->di, regs->bp, sp);
-	printk(KERN_DEFAULT " DS: %04x ES: %04x FS: %04x GS: %04x SS: %04x\n",
-	       (u16)regs->ds, (u16)regs->es, (u16)regs->fs, gs, ss);
+	printk(KERN_DEFAULT "DS: %04x ES: %04x FS: %04x GS: %04x SS: %04x EFLAGS: %08lx\n",
+	       (u16)regs->ds, (u16)regs->es, (u16)regs->fs, gs, ss, regs->flags);
 
 	if (!all)
 		return;
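
A small user-space sketch of the header line show_ip() emits, for
illustration only. format_ip_line() and its is_32bit flag are hypothetical;
the kernel selects the variant at build time via CONFIG_X86_32 and resolves
the address to a symbol with %pS, which plain snprintf() cannot do, so a
hex address stands in for it here.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format the instruction pointer line the way show_ip() lays it out:
 * "EIP: <ip>" on 32-bit, "RIP: <cs>:<ip>" on 64-bit. In the kernel this
 * line is followed by the opcode bytes from show_opcodes().
 */
static int format_ip_line(char *buf, size_t len, int is_32bit,
			  unsigned int cs, unsigned long ip)
{
	if (is_32bit)
		return snprintf(buf, len, "EIP: %#lx\n", ip);

	return snprintf(buf, len, "RIP: %04x:%#lx\n", cs, ip);
}
```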

* [tip:x86/cleanups] x86/dumpstack: Save first regs set for the executive summary
  2018-04-17 16:11 ` [PATCH 8/9] x86/dumpstack: Save first regs set for the executive summary Borislav Petkov
@ 2018-04-26 14:22   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Borislav Petkov @ 2018-04-26 14:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, luto, bp, hpa, mingo, jpoimboe, torvalds, peterz, linux-kernel

Commit-ID:  602bd705da334f214fc03db328dc37d2f1f33307
Gitweb:     https://git.kernel.org/tip/602bd705da334f214fc03db328dc37d2f1f33307
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 17 Apr 2018 18:11:23 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 26 Apr 2018 16:15:28 +0200

x86/dumpstack: Save first regs set for the executive summary

Save the regs set when __die() is invoked for the first time and print it
in oops_end().

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/20180417161124.5294-9-bp@alien8.de

---
 arch/x86/kernel/dumpstack.c | 32 ++++++++++++--------------------
 1 file changed, 12 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 82da808b5c36..ee344030fd0a 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -28,6 +28,8 @@ int panic_on_unrecovered_nmi;
 int panic_on_io_nmi;
 static int die_counter;
 
+static struct pt_regs exec_summary_regs;
+
 bool in_task_stack(unsigned long *stack, struct task_struct *task,
 		   struct stack_info *info)
 {
@@ -321,6 +323,9 @@ void oops_end(unsigned long flags, struct pt_regs *regs, int signr)
 	raw_local_irq_restore(flags);
 	oops_exit();
 
+	/* Executive summary in case the oops scrolled away */
+	__show_regs(&exec_summary_regs, true);
+
 	if (!signr)
 		return;
 	if (in_interrupt())
@@ -339,10 +344,10 @@ NOKPROBE_SYMBOL(oops_end);
 
 int __die(const char *str, struct pt_regs *regs, long err)
 {
-#ifdef CONFIG_X86_32
-	unsigned short ss;
-	unsigned long sp;
-#endif
+	/* Save the regs of the first oops for the executive summary later. */
+	if (!die_counter)
+		exec_summary_regs = *regs;
+
 	printk(KERN_DEFAULT
 	       "%s: %04lx [#%d]%s%s%s%s%s\n", str, err & 0xffff, ++die_counter,
 	       IS_ENABLED(CONFIG_PREEMPT) ? " PREEMPT"         : "",
@@ -352,26 +357,13 @@ int __die(const char *str, struct pt_regs *regs, long err)
 	       IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION) ?
 	       (boot_cpu_has(X86_FEATURE_PTI) ? " PTI" : " NOPTI") : "");
 
+	show_regs(regs);
+	print_modules();
+
 	if (notify_die(DIE_OOPS, str, regs, err,
 			current->thread.trap_nr, SIGSEGV) == NOTIFY_STOP)
 		return 1;
 
-	print_modules();
-	show_regs(regs);
-#ifdef CONFIG_X86_32
-	if (user_mode(regs)) {
-		sp = regs->sp;
-		ss = regs->ss;
-	} else {
-		sp = kernel_stack_pointer(regs);
-		savesegment(ss, ss);
-	}
-	printk(KERN_EMERG "EIP: %pS SS:ESP: %04x:%08lx\n",
-	       (void *)regs->ip, ss, sp);
-#else
-	/* Executive summary in case the oops scrolled away */
-	printk(KERN_ALERT "RIP: %pS RSP: %016lx\n", (void *)regs->ip, regs->sp);
-#endif
 	return 0;
 }
 NOKPROBE_SYMBOL(__die);
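
The "save only the first register set" idea can be sketched in user space
as below. This is a simplified model, not the kernel code: struct
regs_sketch and die_sketch() are hypothetical stand-ins for struct pt_regs
and __die(), and the counter is bumped in a separate statement rather than
inside the printk() argument list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct pt_regs. */
struct regs_sketch {
	unsigned long ip;
	unsigned long sp;
};

static int die_counter;
static struct regs_sketch exec_summary_regs;

/*
 * Copy the registers only while die_counter is still zero, so that
 * nested or follow-up oopses cannot overwrite the first snapshot.
 */
static void die_sketch(const struct regs_sketch *regs)
{
	assert(regs != NULL);

	if (!die_counter)
		exec_summary_regs = *regs;

	++die_counter;
}
```

Because the copy is taken by value, the summary printed later stays valid
even if the original register frame is long gone by then.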

* [tip:x86/cleanups] x86/dumpstack: Explain the reasoning for the prologue and buffer size
  2018-04-17 16:11 ` [PATCH 9/9] x86/dumpstack: Explain the reasoning for the prologue and buffer size Borislav Petkov
@ 2018-04-26 14:22   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Borislav Petkov @ 2018-04-26 14:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jpoimboe, torvalds, peterz, tglx, hpa, mingo, luto, linux-kernel, bp

Commit-ID:  4dba072cd097f35fa8f77c49d909ada2b079a4c4
Gitweb:     https://git.kernel.org/tip/4dba072cd097f35fa8f77c49d909ada2b079a4c4
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 17 Apr 2018 18:11:24 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 26 Apr 2018 16:15:28 +0200

x86/dumpstack: Explain the reasoning for the prologue and buffer size

The whole reasoning behind the number of opcode bytes dumped and the
prologue length isn't very clear, so write down some of the reasons why it
is done the way it is.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/20180417161124.5294-10-bp@alien8.de

---
 arch/x86/kernel/dumpstack.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index ee344030fd0a..666a284116ac 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -72,6 +72,25 @@ static void printk_stack_address(unsigned long address, int reliable,
 	printk("%s %s%pB\n", log_lvl, reliable ? "" : "? ", (void *)address);
 }
 
+/*
+ * There are a couple of reasons for the 2/3rd prologue, courtesy of Linus:
+ *
+ * In case where we don't have the exact kernel image (which, if we did, we can
+ * simply disassemble and navigate to the RIP), the purpose of the bigger
+ * prologue is to have more context and to be able to correlate the code from
+ * the different toolchains better.
+ *
+ * In addition, it helps in recreating the register allocation of the failing
+ * kernel and thus make sense of the register dump.
+ *
+ * What is more, the additional complication of a variable length insn arch like
+ * x86 warrants having longer byte sequence before rIP so that the disassembler
+ * can "sync" up properly and find instruction boundaries when decoding the
+ * opcode bytes.
+ *
+ * Thus, the 2/3rds prologue and 64 byte OPCODE_BUFSIZE is just a random
+ * guesstimate in attempt to achieve all of the above.
+ */
 void show_opcodes(u8 *rip, const char *loglvl)
 {
 	unsigned int code_prologue = OPCODE_BUFSIZE * 2 / 3;
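
To make the 2/3 split concrete, here is the arithmetic as a tiny
stand-alone sketch. OPCODE_BUFSIZE is 64 as stated in the comment above;
the two helper names are invented for this illustration.

```c
#define OPCODE_BUFSIZE 64

/* Bytes dumped before rIP: 64 * 2 / 3 = 128 / 3 = 42 (integer division). */
static unsigned int code_prologue_len(void)
{
	return OPCODE_BUFSIZE * 2 / 3;
}

/* Bytes dumped from rIP onwards, including the byte at rIP: 64 - 42 = 22. */
static unsigned int code_remainder_len(void)
{
	return OPCODE_BUFSIZE - code_prologue_len();
}
```

So the dump starts 42 bytes before rIP and covers the 22 bytes beginning
at rIP itself.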

* [PATCH 6/9] x86/fault: Dump user opcode bytes on fatal faults
  2018-03-15 15:44 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v2 Borislav Petkov
@ 2018-03-15 15:44 ` Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: Borislav Petkov @ 2018-03-15 15:44 UTC (permalink / raw)
  To: X86 ML
  Cc: Andy Lutomirski, Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, LKML

From: Borislav Petkov <bp@suse.de>

Sometimes it is useful to see which user opcode bytes RIP points to
when a fault happens: be it to rule out RIP corruption or to dump info
early during boot, when core dumps are impossible because there is no
writable filesystem yet.

It is also useful when debugging an issue without access to the
executable which caused the fault, so it cannot simply be disassembled.

That last aspect might have some security implications so
show_unhandled_signals could be revisited for that or a new config
option added.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/mm/fault.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 26865147a507..b3c19f734442 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -850,6 +850,8 @@ static inline void
 show_signal_msg(struct pt_regs *regs, unsigned long error_code,
 		unsigned long address, struct task_struct *tsk)
 {
+	const char *loglvl = task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG;
+
 	if (!unhandled_signal(tsk, SIGSEGV))
 		return;
 
@@ -857,13 +859,14 @@ show_signal_msg(struct pt_regs *regs, unsigned long error_code,
 		return;
 
 	printk("%s%s[%d]: segfault at %lx ip %px sp %px error %lx",
-		task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
-		tsk->comm, task_pid_nr(tsk), address,
+		loglvl, tsk->comm, task_pid_nr(tsk), address,
 		(void *)regs->ip, (void *)regs->sp, error_code);
 
 	print_vma_addr(KERN_CONT " in ", regs->ip);
 
 	printk(KERN_CONT "\n");
+
+	show_opcodes((u8 *)regs->ip, loglvl);
 }
 
 static void
-- 
2.13.0

end of thread, other threads:[~2018-04-26 14:23 UTC | newest]

Thread overview: 20+ messages
2018-04-17 16:11 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v3 Borislav Petkov
2018-04-17 16:11 ` [PATCH 1/9] x86/dumpstack: Remove code_bytes Borislav Petkov
2018-04-26 14:18   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
2018-04-17 16:11 ` [PATCH 2/9] x86/dumpstack: Unexport oops_begin() Borislav Petkov
2018-04-26 14:19   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
2018-04-17 16:11 ` [PATCH 3/9] x86/dumpstack: Carve out Code: dumping into a function Borislav Petkov
2018-04-26 14:19   ` [tip:x86/cleanups] x86/dumpstack: Carve out code-dumping " tip-bot for Borislav Petkov
2018-04-17 16:11 ` [PATCH 4/9] x86/dumpstack: Improve opcodes dumping in the Code: section Borislav Petkov
2018-04-26 14:20   ` [tip:x86/cleanups] x86/dumpstack: Improve opcodes dumping in the code section tip-bot for Borislav Petkov
2018-04-17 16:11 ` [PATCH 5/9] x86/dumpstack: Add loglevel argument to show_opcodes() Borislav Petkov
2018-04-26 14:20   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
2018-04-17 16:11 ` [PATCH 6/9] x86/fault: Dump user opcode bytes on fatal faults Borislav Petkov
2018-04-26 14:21   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
2018-04-17 16:11 ` [PATCH 7/9] x86/dumpstack: Add a show_ip() function Borislav Petkov
2018-04-26 14:21   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
2018-04-17 16:11 ` [PATCH 8/9] x86/dumpstack: Save first regs set for the executive summary Borislav Petkov
2018-04-26 14:22   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
2018-04-17 16:11 ` [PATCH 9/9] x86/dumpstack: Explain the reasoning for the prologue and buffer size Borislav Petkov
2018-04-26 14:22   ` [tip:x86/cleanups] " tip-bot for Borislav Petkov
  -- strict thread matches above, loose matches on Subject: below --
2018-03-15 15:44 [PATCH 0/9] x86/dumpstack: Cleanups and user opcode bytes Code: section, v2 Borislav Petkov
2018-03-15 15:44 ` [PATCH 6/9] x86/fault: Dump user opcode bytes on fatal faults Borislav Petkov
