All of lore.kernel.org
 help / color / mirror / Atom feed
* [BUG] kernel NULL pointer dereference in kprobe_int3_handler
@ 2022-07-27 21:01 Daniel Müller
  2022-07-28  2:22 ` Chen Zhongjin
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Daniel Müller @ 2022-07-27 21:01 UTC (permalink / raw)
  To: bpf; +Cc: mhiramat, kernel-team

Hi,

I've seen a NULL pointer dereference in kprobe_int3_handler, in code that seems
to have gotten added with 6256e668b7af9 ("x86/kprobes: Use int3 instead of debug
trap for single-step").
Specifically, our CI has reported the following (running test_progs-no_alu32):

  [ 1033.068258] test_progs-no_a[1177] is installing a program with bpf_probe_write_user helper that may corrupt user memory!
  [ 1040.264691] BUG: kernel NULL pointer dereference, address: 0000000000000058
  [ 1040.264856] #PF: supervisor read access in kernel mode
  [ 1040.264890] #PF: error_code(0x0000) - not-present page
  [ 1040.264961] PGD 0 P4D 0 
  [ 1040.265183] Oops: 0000 [#1] PREEMPT SMP NOPTI
  [ 1040.265183] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W  OE     5.19.0-rc7-g4129b786299d #1
  [ 1040.265183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
  [ 1040.265183] RIP: 0010:kprobe_int3_handler+0xd4/0x1a0
  [ 1040.265183] Code: 49 8b 06 48 83 e8 02 48 a9 fd ff ff ff 75 d0 48 c7 c7 32 cc 2b 82 e8 eb d5 9a 00 48 8b 95 80 00 00 00 65 48 8b 3d 74 62 fc 7e <48> 8b 47 58 48 39 d0 73 ac 48 8d 48 0f 48 39 ca 73 a3 48 8b 4f 28
  [ 1040.265183] RSP: 0018:ffffb4140009bd40 EFLAGS: 00000092
  [ 1040.265183] RAX: 0000000000000001 RBX: ffffffff81a04cb9 RCX: 0000000000000000
  [ 1040.265183] RDX: ffffffff81a04cb9 RSI: ffffffff822bcc32 RDI: 0000000000000000
  [ 1040.265183] RBP: ffffb4140009bd98 R08: 000000000003929b R09: 0000000000000000
  [ 1040.265183] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
  [ 1040.265183] R13: ffffffff81a04cb8 R14: ffff9490b9c5b1e0 R15: 0000000000000000
  [ 1040.265183] FS:  0000000000000000(0000) GS:ffff9490b9c40000(0000) knlGS:0000000000000000
  [ 1040.265183] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 1040.265183] CR2: 0000000000000058 CR3: 0000000028836000 CR4: 00000000000006e0
  [ 1040.265183] Call Trace:
  [ 1040.265183]  <TASK>
  [ 1040.265183]  do_int3+0xf/0x50
  [ 1040.265183]  exc_int3+0x87/0xd0
  [ 1040.265183]  asm_exc_int3+0x35/0x40
  [ 1040.265183] RIP: 0010:__schedule+0x3f9/0xbf0
  [ 1040.265183] Code: 83 5a fe ff ff 65 ff 05 e5 61 61 7e 48 8b 05 3e cb 68 01 48 85 c0 74 16 48 8b 78 08 4c 89 f1 4c 89 ea 44 8b 45 ac 8b 75 b8 e8 <53> 6c 79 ff 65 ff 0d bc 61 61 7e 0f 85 0d fe ff ff e8 a0 cf 5f ff
  [ 1040.265183] RSP: 0018:ffffb4140009be70 EFLAGS: 00000086
  [ 1040.265183] RAX: ffff9490056e0b90 RBX: ffff9490002f39e8 RCX: ffff949008758000
  [ 1040.265183] RDX: ffff9490002f3300 RSI: 0000000000000000 RDI: 0000000000000000
  [ 1040.265183] RBP: ffffb4140009bec8 R08: 0000000000000000 R09: 1dc944f200000000
  [ 1040.265183] R10: 0000000000000001 R11: 0000000000080000 R12: ffff9490b9c6c8c0
  [ 1040.265183] R13: ffff9490002f3300 R14: ffff949008758000 R15: ffff9490b9c6c8d8
  [ 1040.265183]  ? __schedule+0x3f9/0xbf0
  [ 1040.265183]  schedule_idle+0x26/0x40
  [ 1040.265183]  do_idle+0x177/0x250
  [ 1040.265183]  cpu_startup_entry+0x19/0x20
  [ 1040.265183]  start_secondary+0xed/0xf0
  [ 1040.265183]  secondary_startup_64_no_verify+0xe0/0xeb
  [ 1040.265183]  </TASK>
  [ 1040.265183] Modules linked in: bpf_testmod(OE) [last unloaded: bpf_testmod]
  [ 1040.265183] CR2: 0000000000000058
  [ 1040.265183] ---[ end trace 0000000000000000 ]---
  [ 1040.265183] RIP: 0010:kprobe_int3_handler+0xd4/0x1a0
  [ 1040.265183] Code: 49 8b 06 48 83 e8 02 48 a9 fd ff ff ff 75 d0 48 c7 c7 32 cc 2b 82 e8 eb d5 9a 00 48 8b 95 80 00 00 00 65 48 8b 3d 74 62 fc 7e <48> 8b 47 58 48 39 d0 73 ac 48 8d 48 0f 48 39 ca 73 a3 48 8b 4f 28
  [ 1040.265183] RSP: 0018:ffffb4140009bd40 EFLAGS: 00000092
  [ 1040.265183] RAX: 0000000000000001 RBX: ffffffff81a04cb9 RCX: 0000000000000000
  [ 1040.265183] RDX: ffffffff81a04cb9 RSI: ffffffff822bcc32 RDI: 0000000000000000
  [ 1040.265183] RBP: ffffb4140009bd98 R08: 000000000003929b R09: 0000000000000000
  [ 1040.265183] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
  [ 1040.265183] R13: ffffffff81a04cb8 R14: ffff9490b9c5b1e0 R15: 0000000000000000
  [ 1040.265183] FS:  0000000000000000(0000) GS:ffff9490b9c40000(0000) knlGS:0000000000000000
  [ 1040.265183] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 1040.265183] CR2: 0000000000000058 CR3: 0000000028836000 CR4: 00000000000006e0
  [ 1040.265183] Kernel panic - not syncing: Fatal exception in interrupt
  [ 1040.265183] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

(it was sync'ed to somewhere around 40b09653b1977 ("selftests/bpf: Adjust
vmtest.sh to use local kernel configuration"); I can probably piece together the
exact kernel configuration if needed, but the inquiry is of a more general
nature)

I am wondering what is the reason for us not checking whether kprobe_running
returned a non-NULL pointer here (as we do elsewhere):
https://elixir.bootlin.com/linux/v5.18.13/source/arch/x86/kernel/kprobes/core.c#L986
? Is that an oversight or should some kind of invariant be upheld at this point?

kprobe_int3_handler+0xd4/0x1a0 maps to line 987 in the above file. Address
0000000000000058 is exactly the offset that p->ainsn.insn is at, so it seems as
if p is NULL.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [BUG] kernel NULL pointer dereference in kprobe_int3_handler
  2022-07-27 21:01 [BUG] kernel NULL pointer dereference in kprobe_int3_handler Daniel Müller
@ 2022-07-28  2:22 ` Chen Zhongjin
  2022-07-31 16:18 ` Masami Hiramatsu
  2022-08-02 11:03 ` [tip: perf/urgent] x86/kprobes: Update kcb status flag after singlestepping tip-bot2 for Masami Hiramatsu (Google)
  2 siblings, 0 replies; 5+ messages in thread
From: Chen Zhongjin @ 2022-07-28  2:22 UTC (permalink / raw)
  To: Daniel Müller, bpf; +Cc: mhiramat, kernel-team

Hi,

On 2022/7/28 5:01, Daniel Müller wrote:
> Hi,
>
> I've seen a NULL pointer dereference in kprobe_int3_handler, in code that seems
> to have gotten added with 6256e668b7af9 ("x86/kprobes: Use int3 instead of debug
> trap for single-step").
> Specifically, our CI has reported the following (running test_progs-no_alu32):
>
>    [ 1033.068258] test_progs-no_a[1177] is installing a program with bpf_probe_write_user helper that may corrupt user memory!
>    [ 1040.264691] BUG: kernel NULL pointer dereference, address: 0000000000000058
>    [ 1040.264856] #PF: supervisor read access in kernel mode
>    [ 1040.264890] #PF: error_code(0x0000) - not-present page
>    [ 1040.264961] PGD 0 P4D 0
>    [ 1040.265183] Oops: 0000 [#1] PREEMPT SMP NOPTI
>    [ 1040.265183] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W  OE     5.19.0-rc7-g4129b786299d #1
>    [ 1040.265183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
>    [ 1040.265183] RIP: 0010:kprobe_int3_handler+0xd4/0x1a0
>    [ 1040.265183] Code: 49 8b 06 48 83 e8 02 48 a9 fd ff ff ff 75 d0 48 c7 c7 32 cc 2b 82 e8 eb d5 9a 00 48 8b 95 80 00 00 00 65 48 8b 3d 74 62 fc 7e <48> 8b 47 58 48 39 d0 73 ac 48 8d 48 0f 48 39 ca 73 a3 48 8b 4f 28
>    [ 1040.265183] RSP: 0018:ffffb4140009bd40 EFLAGS: 00000092
>    [ 1040.265183] RAX: 0000000000000001 RBX: ffffffff81a04cb9 RCX: 0000000000000000
>    [ 1040.265183] RDX: ffffffff81a04cb9 RSI: ffffffff822bcc32 RDI: 0000000000000000
>    [ 1040.265183] RBP: ffffb4140009bd98 R08: 000000000003929b R09: 0000000000000000
>    [ 1040.265183] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>    [ 1040.265183] R13: ffffffff81a04cb8 R14: ffff9490b9c5b1e0 R15: 0000000000000000
>    [ 1040.265183] FS:  0000000000000000(0000) GS:ffff9490b9c40000(0000) knlGS:0000000000000000
>    [ 1040.265183] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    [ 1040.265183] CR2: 0000000000000058 CR3: 0000000028836000 CR4: 00000000000006e0
>    [ 1040.265183] Call Trace:
>    [ 1040.265183]  <TASK>
>    [ 1040.265183]  do_int3+0xf/0x50
>    [ 1040.265183]  exc_int3+0x87/0xd0
>    [ 1040.265183]  asm_exc_int3+0x35/0x40
>    [ 1040.265183] RIP: 0010:__schedule+0x3f9/0xbf0
>    [ 1040.265183] Code: 83 5a fe ff ff 65 ff 05 e5 61 61 7e 48 8b 05 3e cb 68 01 48 85 c0 74 16 48 8b 78 08 4c 89 f1 4c 89 ea 44 8b 45 ac 8b 75 b8 e8 <53> 6c 79 ff 65 ff 0d bc 61 61 7e 0f 85 0d fe ff ff e8 a0 cf 5f ff
>    [ 1040.265183] RSP: 0018:ffffb4140009be70 EFLAGS: 00000086
>    [ 1040.265183] RAX: ffff9490056e0b90 RBX: ffff9490002f39e8 RCX: ffff949008758000
>    [ 1040.265183] RDX: ffff9490002f3300 RSI: 0000000000000000 RDI: 0000000000000000
>    [ 1040.265183] RBP: ffffb4140009bec8 R08: 0000000000000000 R09: 1dc944f200000000
>    [ 1040.265183] R10: 0000000000000001 R11: 0000000000080000 R12: ffff9490b9c6c8c0
>    [ 1040.265183] R13: ffff9490002f3300 R14: ffff949008758000 R15: ffff9490b9c6c8d8
>    [ 1040.265183]  ? __schedule+0x3f9/0xbf0
>    [ 1040.265183]  schedule_idle+0x26/0x40
>    [ 1040.265183]  do_idle+0x177/0x250
>    [ 1040.265183]  cpu_startup_entry+0x19/0x20
>    [ 1040.265183]  start_secondary+0xed/0xf0
>    [ 1040.265183]  secondary_startup_64_no_verify+0xe0/0xeb
>    [ 1040.265183]  </TASK>
>    [ 1040.265183] Modules linked in: bpf_testmod(OE) [last unloaded: bpf_testmod]
>    [ 1040.265183] CR2: 0000000000000058
>    [ 1040.265183] ---[ end trace 0000000000000000 ]---
>    [ 1040.265183] RIP: 0010:kprobe_int3_handler+0xd4/0x1a0
>    [ 1040.265183] Code: 49 8b 06 48 83 e8 02 48 a9 fd ff ff ff 75 d0 48 c7 c7 32 cc 2b 82 e8 eb d5 9a 00 48 8b 95 80 00 00 00 65 48 8b 3d 74 62 fc 7e <48> 8b 47 58 48 39 d0 73 ac 48 8d 48 0f 48 39 ca 73 a3 48 8b 4f 28
>    [ 1040.265183] RSP: 0018:ffffb4140009bd40 EFLAGS: 00000092
>    [ 1040.265183] RAX: 0000000000000001 RBX: ffffffff81a04cb9 RCX: 0000000000000000
>    [ 1040.265183] RDX: ffffffff81a04cb9 RSI: ffffffff822bcc32 RDI: 0000000000000000
>    [ 1040.265183] RBP: ffffb4140009bd98 R08: 000000000003929b R09: 0000000000000000
>    [ 1040.265183] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>    [ 1040.265183] R13: ffffffff81a04cb8 R14: ffff9490b9c5b1e0 R15: 0000000000000000
>    [ 1040.265183] FS:  0000000000000000(0000) GS:ffff9490b9c40000(0000) knlGS:0000000000000000
>    [ 1040.265183] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    [ 1040.265183] CR2: 0000000000000058 CR3: 0000000028836000 CR4: 00000000000006e0
>    [ 1040.265183] Kernel panic - not syncing: Fatal exception in interrupt
>    [ 1040.265183] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>
> (it was sync'ed to somewhere around 40b09653b1977 ("selftests/bpf: Adjust
> vmtest.sh to use local kernel configuration"); I can probably piece together the
> exact kernel configuration if needed, but the inquiry is of a more general
> nature)
>
> I am wondering what is the reason for us not checking whether kprobe_running
> returned a non-NULL pointer here (as we do elsewhere):
> https://elixir.bootlin.com/linux/v5.18.13/source/arch/x86/kernel/kprobes/core.c#L986
> ? Is that an oversight or should some kind of invariant be upheld at this point?

I think it's kprobe_is_ss(kcb) promises this.

KPROBE_HIT_SS and KPROBE_REENTER are both only set in setup_singlestep, 
which set_current_kprobe must be run before that.

And p for set_current_kprobe have been checked not NULL in 
kprobe_int3_handle.

I didn't find any path that current_kprobe can be set NULL here by 
viewing code. Is this bug reproducible? It is not a multi threads

problem I guess it can be reproduced easily.

> kprobe_int3_handler+0xd4/0x1a0 maps to line 987 in the above file. Address
> 0000000000000058 is exactly the offset that p->ainsn.insn is at, so it seems as
> if p is NULL.
>
> Thanks,
> Daniel

Best,

Chen



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [BUG] kernel NULL pointer dereference in kprobe_int3_handler
  2022-07-27 21:01 [BUG] kernel NULL pointer dereference in kprobe_int3_handler Daniel Müller
  2022-07-28  2:22 ` Chen Zhongjin
@ 2022-07-31 16:18 ` Masami Hiramatsu
  2022-08-01 19:42   ` Daniel Müller
  2022-08-02 11:03 ` [tip: perf/urgent] x86/kprobes: Update kcb status flag after singlestepping tip-bot2 for Masami Hiramatsu (Google)
  2 siblings, 1 reply; 5+ messages in thread
From: Masami Hiramatsu @ 2022-07-31 16:18 UTC (permalink / raw)
  To: Daniel Müller
  Cc: bpf, mhiramat, kernel-team, Andy Lutomirski, Peter Zijlstra

On Wed, 27 Jul 2022 21:01:36 +0000
Daniel Müller <deso@posteo.net> wrote:

> Hi,
> 
> I've seen a NULL pointer dereference in kprobe_int3_handler, in code that seems
> to have gotten added with 6256e668b7af9 ("x86/kprobes: Use int3 instead of debug
> trap for single-step").
> Specifically, our CI has reported the following (running test_progs-no_alu32):
> 
>   [ 1033.068258] test_progs-no_a[1177] is installing a program with bpf_probe_write_user helper that may corrupt user memory!
>   [ 1040.264691] BUG: kernel NULL pointer dereference, address: 0000000000000058
>   [ 1040.264856] #PF: supervisor read access in kernel mode
>   [ 1040.264890] #PF: error_code(0x0000) - not-present page
>   [ 1040.264961] PGD 0 P4D 0 
>   [ 1040.265183] Oops: 0000 [#1] PREEMPT SMP NOPTI
>   [ 1040.265183] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W  OE     5.19.0-rc7-g4129b786299d #1
>   [ 1040.265183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
>   [ 1040.265183] RIP: 0010:kprobe_int3_handler+0xd4/0x1a0
>   [ 1040.265183] Code: 49 8b 06 48 83 e8 02 48 a9 fd ff ff ff 75 d0 48 c7 c7 32 cc 2b 82 e8 eb d5 9a 00 48 8b 95 80 00 00 00 65 48 8b 3d 74 62 fc 7e <48> 8b 47 58 48 39 d0 73 ac 48 8d 48 0f 48 39 ca 73 a3 48 8b 4f 28
>   [ 1040.265183] RSP: 0018:ffffb4140009bd40 EFLAGS: 00000092
>   [ 1040.265183] RAX: 0000000000000001 RBX: ffffffff81a04cb9 RCX: 0000000000000000
>   [ 1040.265183] RDX: ffffffff81a04cb9 RSI: ffffffff822bcc32 RDI: 0000000000000000
>   [ 1040.265183] RBP: ffffb4140009bd98 R08: 000000000003929b R09: 0000000000000000
>   [ 1040.265183] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>   [ 1040.265183] R13: ffffffff81a04cb8 R14: ffff9490b9c5b1e0 R15: 0000000000000000
>   [ 1040.265183] FS:  0000000000000000(0000) GS:ffff9490b9c40000(0000) knlGS:0000000000000000
>   [ 1040.265183] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   [ 1040.265183] CR2: 0000000000000058 CR3: 0000000028836000 CR4: 00000000000006e0
>   [ 1040.265183] Call Trace:
>   [ 1040.265183]  <TASK>
>   [ 1040.265183]  do_int3+0xf/0x50
>   [ 1040.265183]  exc_int3+0x87/0xd0
>   [ 1040.265183]  asm_exc_int3+0x35/0x40
>   [ 1040.265183] RIP: 0010:__schedule+0x3f9/0xbf0
>   [ 1040.265183] Code: 83 5a fe ff ff 65 ff 05 e5 61 61 7e 48 8b 05 3e cb 68 01 48 85 c0 74 16 48 8b 78 08 4c 89 f1 4c 89 ea 44 8b 45 ac 8b 75 b8 e8 <53> 6c 79 ff 65 ff 0d bc 61 61 7e 0f 85 0d fe ff ff e8 a0 cf 5f ff
>   [ 1040.265183] RSP: 0018:ffffb4140009be70 EFLAGS: 00000086
>   [ 1040.265183] RAX: ffff9490056e0b90 RBX: ffff9490002f39e8 RCX: ffff949008758000
>   [ 1040.265183] RDX: ffff9490002f3300 RSI: 0000000000000000 RDI: 0000000000000000
>   [ 1040.265183] RBP: ffffb4140009bec8 R08: 0000000000000000 R09: 1dc944f200000000
>   [ 1040.265183] R10: 0000000000000001 R11: 0000000000080000 R12: ffff9490b9c6c8c0
>   [ 1040.265183] R13: ffff9490002f3300 R14: ffff949008758000 R15: ffff9490b9c6c8d8
>   [ 1040.265183]  ? __schedule+0x3f9/0xbf0
>   [ 1040.265183]  schedule_idle+0x26/0x40
>   [ 1040.265183]  do_idle+0x177/0x250
>   [ 1040.265183]  cpu_startup_entry+0x19/0x20
>   [ 1040.265183]  start_secondary+0xed/0xf0
>   [ 1040.265183]  secondary_startup_64_no_verify+0xe0/0xeb
>   [ 1040.265183]  </TASK>
>   [ 1040.265183] Modules linked in: bpf_testmod(OE) [last unloaded: bpf_testmod]
>   [ 1040.265183] CR2: 0000000000000058
>   [ 1040.265183] ---[ end trace 0000000000000000 ]---
>   [ 1040.265183] RIP: 0010:kprobe_int3_handler+0xd4/0x1a0
>   [ 1040.265183] Code: 49 8b 06 48 83 e8 02 48 a9 fd ff ff ff 75 d0 48 c7 c7 32 cc 2b 82 e8 eb d5 9a 00 48 8b 95 80 00 00 00 65 48 8b 3d 74 62 fc 7e <48> 8b 47 58 48 39 d0 73 ac 48 8d 48 0f 48 39 ca 73 a3 48 8b 4f 28
>   [ 1040.265183] RSP: 0018:ffffb4140009bd40 EFLAGS: 00000092
>   [ 1040.265183] RAX: 0000000000000001 RBX: ffffffff81a04cb9 RCX: 0000000000000000
>   [ 1040.265183] RDX: ffffffff81a04cb9 RSI: ffffffff822bcc32 RDI: 0000000000000000
>   [ 1040.265183] RBP: ffffb4140009bd98 R08: 000000000003929b R09: 0000000000000000
>   [ 1040.265183] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>   [ 1040.265183] R13: ffffffff81a04cb8 R14: ffff9490b9c5b1e0 R15: 0000000000000000
>   [ 1040.265183] FS:  0000000000000000(0000) GS:ffff9490b9c40000(0000) knlGS:0000000000000000
>   [ 1040.265183] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   [ 1040.265183] CR2: 0000000000000058 CR3: 0000000028836000 CR4: 00000000000006e0
>   [ 1040.265183] Kernel panic - not syncing: Fatal exception in interrupt
>   [ 1040.265183] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> 
> (it was sync'ed to somewhere around 40b09653b1977 ("selftests/bpf: Adjust
> vmtest.sh to use local kernel configuration"); I can probably piece together the
> exact kernel configuration if needed, but the inquiry is of a more general
> nature)
> 
> I am wondering what is the reason for us not checking whether kprobe_running
> returned a non-NULL pointer here (as we do elsewhere):
> https://elixir.bootlin.com/linux/v5.18.13/source/arch/x86/kernel/kprobes/core.c#L986
> ? Is that an oversight or should some kind of invariant be upheld at this point?
> 
> kprobe_int3_handler+0xd4/0x1a0 maps to line 987 in the above file. Address
> 0000000000000058 is exactly the offset that p->ainsn.insn is at, so it seems as
> if p is NULL.

Ah, good catch! I guess there is other int3 user in the kernel which is not
handled by kprobes. And kprobes missed to reset(clear) the state when !post_handler.


https://elixir.bootlin.com/linux/v5.18.13/source/arch/x86/kernel/kprobes/core.c#L814

static void kprobe_post_process(struct kprobe *cur, struct pt_regs *regs,
			       struct kprobe_ctlblk *kcb)
{
	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
		kcb->kprobe_status = KPROBE_HIT_SSDONE;	// this only set if cur->post_handler.
		cur->post_handler(cur, regs, 0);
	}

	/* Restore back the original saved kprobes variables and continue. */
	if (kcb->kprobe_status == KPROBE_REENTER)
		restore_previous_kprobe(kcb);
	else
		reset_current_kprobe();	// This only clear the current_kprobe (== kprobe_running())
}
NOKPROBE_SYMBOL(kprobe_post_process);

What about below patch?


From 66ac2a39c7d3d8a76d1ef989c0033831be24165e Mon Sep 17 00:00:00 2001
From: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Date: Mon, 1 Aug 2022 01:14:09 +0900
Subject: [PATCH] x86/kprobes: Fix to update kcb status flag after
 singlestepping
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fix kprobes to update kcb (kprobes control block) status flag to
KPROBE_HIT_SSDONE even if the kp->post_handler is not set.
This may cause a kernel panic if another int3 user runs right
after kprobes because kprobe_int3_handler() misunderstands the
int3 is kprobe's single stepping int3.

Fixes: 6256e668b7af ("x86/kprobes: Use int3 instead of debug trap for single-step")
Reported-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 arch/x86/kernel/kprobes/core.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 7c4ab8870da4..74167dc5f55e 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -814,16 +814,20 @@ set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 static void kprobe_post_process(struct kprobe *cur, struct pt_regs *regs,
 			       struct kprobe_ctlblk *kcb)
 {
-	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
-		kcb->kprobe_status = KPROBE_HIT_SSDONE;
-		cur->post_handler(cur, regs, 0);
-	}
-
 	/* Restore back the original saved kprobes variables and continue. */
-	if (kcb->kprobe_status == KPROBE_REENTER)
+	if (kcb->kprobe_status == KPROBE_REENTER) {
+		/* This will restore both kcb and current_kprobe */
 		restore_previous_kprobe(kcb);
-	else
+	} else {
+		/*
+		 * Always update the kcb status because
+		 * reset_curent_kprobe() doesn't update kcb.
+		 */
+		kcb->kprobe_status = KPROBE_HIT_SSDONE;
+		if (cur->post_handler)
+			cur->post_handler(cur, regs, 0);
 		reset_current_kprobe();
+	}
 }
 NOKPROBE_SYMBOL(kprobe_post_process);
 
-- 
2.25.1

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [BUG] kernel NULL pointer dereference in kprobe_int3_handler
  2022-07-31 16:18 ` Masami Hiramatsu
@ 2022-08-01 19:42   ` Daniel Müller
  0 siblings, 0 replies; 5+ messages in thread
From: Daniel Müller @ 2022-08-01 19:42 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: bpf, kernel-team, Andy Lutomirski, Peter Zijlstra

On Mon, Aug 01, 2022 at 01:18:11AM +0900, Masami Hiramatsu wrote:
> On Wed, 27 Jul 2022 21:01:36 +0000
> Daniel Müller <deso@posteo.net> wrote:
> 
> > Hi,
> > 
> > I've seen a NULL pointer dereference in kprobe_int3_handler, in code that seems
> > to have gotten added with 6256e668b7af9 ("x86/kprobes: Use int3 instead of debug
> > trap for single-step").
> > Specifically, our CI has reported the following (running test_progs-no_alu32):
> > 
> >   [ 1033.068258] test_progs-no_a[1177] is installing a program with bpf_probe_write_user helper that may corrupt user memory!
> >   [ 1040.264691] BUG: kernel NULL pointer dereference, address: 0000000000000058
> >   [ 1040.264856] #PF: supervisor read access in kernel mode
> >   [ 1040.264890] #PF: error_code(0x0000) - not-present page
> >   [ 1040.264961] PGD 0 P4D 0 
> >   [ 1040.265183] Oops: 0000 [#1] PREEMPT SMP NOPTI
> >   [ 1040.265183] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W  OE     5.19.0-rc7-g4129b786299d #1
> >   [ 1040.265183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> >   [ 1040.265183] RIP: 0010:kprobe_int3_handler+0xd4/0x1a0
> >   [ 1040.265183] Code: 49 8b 06 48 83 e8 02 48 a9 fd ff ff ff 75 d0 48 c7 c7 32 cc 2b 82 e8 eb d5 9a 00 48 8b 95 80 00 00 00 65 48 8b 3d 74 62 fc 7e <48> 8b 47 58 48 39 d0 73 ac 48 8d 48 0f 48 39 ca 73 a3 48 8b 4f 28
> >   [ 1040.265183] RSP: 0018:ffffb4140009bd40 EFLAGS: 00000092
> >   [ 1040.265183] RAX: 0000000000000001 RBX: ffffffff81a04cb9 RCX: 0000000000000000
> >   [ 1040.265183] RDX: ffffffff81a04cb9 RSI: ffffffff822bcc32 RDI: 0000000000000000
> >   [ 1040.265183] RBP: ffffb4140009bd98 R08: 000000000003929b R09: 0000000000000000
> >   [ 1040.265183] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> >   [ 1040.265183] R13: ffffffff81a04cb8 R14: ffff9490b9c5b1e0 R15: 0000000000000000
> >   [ 1040.265183] FS:  0000000000000000(0000) GS:ffff9490b9c40000(0000) knlGS:0000000000000000
> >   [ 1040.265183] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >   [ 1040.265183] CR2: 0000000000000058 CR3: 0000000028836000 CR4: 00000000000006e0
> >   [ 1040.265183] Call Trace:
> >   [ 1040.265183]  <TASK>
> >   [ 1040.265183]  do_int3+0xf/0x50
> >   [ 1040.265183]  exc_int3+0x87/0xd0
> >   [ 1040.265183]  asm_exc_int3+0x35/0x40
> >   [ 1040.265183] RIP: 0010:__schedule+0x3f9/0xbf0
> >   [ 1040.265183] Code: 83 5a fe ff ff 65 ff 05 e5 61 61 7e 48 8b 05 3e cb 68 01 48 85 c0 74 16 48 8b 78 08 4c 89 f1 4c 89 ea 44 8b 45 ac 8b 75 b8 e8 <53> 6c 79 ff 65 ff 0d bc 61 61 7e 0f 85 0d fe ff ff e8 a0 cf 5f ff
> >   [ 1040.265183] RSP: 0018:ffffb4140009be70 EFLAGS: 00000086
> >   [ 1040.265183] RAX: ffff9490056e0b90 RBX: ffff9490002f39e8 RCX: ffff949008758000
> >   [ 1040.265183] RDX: ffff9490002f3300 RSI: 0000000000000000 RDI: 0000000000000000
> >   [ 1040.265183] RBP: ffffb4140009bec8 R08: 0000000000000000 R09: 1dc944f200000000
> >   [ 1040.265183] R10: 0000000000000001 R11: 0000000000080000 R12: ffff9490b9c6c8c0
> >   [ 1040.265183] R13: ffff9490002f3300 R14: ffff949008758000 R15: ffff9490b9c6c8d8
> >   [ 1040.265183]  ? __schedule+0x3f9/0xbf0
> >   [ 1040.265183]  schedule_idle+0x26/0x40
> >   [ 1040.265183]  do_idle+0x177/0x250
> >   [ 1040.265183]  cpu_startup_entry+0x19/0x20
> >   [ 1040.265183]  start_secondary+0xed/0xf0
> >   [ 1040.265183]  secondary_startup_64_no_verify+0xe0/0xeb
> >   [ 1040.265183]  </TASK>
> >   [ 1040.265183] Modules linked in: bpf_testmod(OE) [last unloaded: bpf_testmod]
> >   [ 1040.265183] CR2: 0000000000000058
> >   [ 1040.265183] ---[ end trace 0000000000000000 ]---
> >   [ 1040.265183] RIP: 0010:kprobe_int3_handler+0xd4/0x1a0
> >   [ 1040.265183] Code: 49 8b 06 48 83 e8 02 48 a9 fd ff ff ff 75 d0 48 c7 c7 32 cc 2b 82 e8 eb d5 9a 00 48 8b 95 80 00 00 00 65 48 8b 3d 74 62 fc 7e <48> 8b 47 58 48 39 d0 73 ac 48 8d 48 0f 48 39 ca 73 a3 48 8b 4f 28
> >   [ 1040.265183] RSP: 0018:ffffb4140009bd40 EFLAGS: 00000092
> >   [ 1040.265183] RAX: 0000000000000001 RBX: ffffffff81a04cb9 RCX: 0000000000000000
> >   [ 1040.265183] RDX: ffffffff81a04cb9 RSI: ffffffff822bcc32 RDI: 0000000000000000
> >   [ 1040.265183] RBP: ffffb4140009bd98 R08: 000000000003929b R09: 0000000000000000
> >   [ 1040.265183] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> >   [ 1040.265183] R13: ffffffff81a04cb8 R14: ffff9490b9c5b1e0 R15: 0000000000000000
> >   [ 1040.265183] FS:  0000000000000000(0000) GS:ffff9490b9c40000(0000) knlGS:0000000000000000
> >   [ 1040.265183] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >   [ 1040.265183] CR2: 0000000000000058 CR3: 0000000028836000 CR4: 00000000000006e0
> >   [ 1040.265183] Kernel panic - not syncing: Fatal exception in interrupt
> >   [ 1040.265183] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > 
> > (it was sync'ed to somewhere around 40b09653b1977 ("selftests/bpf: Adjust
> > vmtest.sh to use local kernel configuration"); I can probably piece together the
> > exact kernel configuration if needed, but the inquiry is of a more general
> > nature)
> > 
> > I am wondering what is the reason for us not checking whether kprobe_running
> > returned a non-NULL pointer here (as we do elsewhere):
> > https://elixir.bootlin.com/linux/v5.18.13/source/arch/x86/kernel/kprobes/core.c#L986
> > ? Is that an oversight or should some kind of invariant be upheld at this point?
> > 
> > kprobe_int3_handler+0xd4/0x1a0 maps to line 987 in the above file. Address
> > 0000000000000058 is exactly the offset that p->ainsn.insn is at, so it seems as
> > if p is NULL.
> 
> Ah, good catch! I guess there is other int3 user in the kernel which is not
> handled by kprobes. And kprobes missed to reset(clear) the state when !post_handler.
> 
> 
> https://elixir.bootlin.com/linux/v5.18.13/source/arch/x86/kernel/kprobes/core.c#L814
> 
> static void kprobe_post_process(struct kprobe *cur, struct pt_regs *regs,
> 			       struct kprobe_ctlblk *kcb)
> {
> 	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
> 		kcb->kprobe_status = KPROBE_HIT_SSDONE;	// this only set if cur->post_handler.
> 		cur->post_handler(cur, regs, 0);
> 	}
> 
> 	/* Restore back the original saved kprobes variables and continue. */
> 	if (kcb->kprobe_status == KPROBE_REENTER)
> 		restore_previous_kprobe(kcb);
> 	else
> 		reset_current_kprobe();	// This only clear the current_kprobe (== kprobe_running())
> }
> NOKPROBE_SYMBOL(kprobe_post_process);
> 
> What about below patch?

Thanks for the patch! I scheduled ten CI runs without seeing the NULL pointer
deference in question, something that seemed impossible without it. So from my
perspective the patch does indeed seem to address the issue.

Tested-by: Daniel Müller <deso@posteo.net>

> From 66ac2a39c7d3d8a76d1ef989c0033831be24165e Mon Sep 17 00:00:00 2001
> From: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
> Date: Mon, 1 Aug 2022 01:14:09 +0900
> Subject: [PATCH] x86/kprobes: Fix to update kcb status flag after
>  singlestepping
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Fix kprobes to update kcb (kprobes control block) status flag to
> KPROBE_HIT_SSDONE even if the kp->post_handler is not set.
> This may cause a kernel panic if another int3 user runs right
> after kprobes because kprobe_int3_handler() misunderstands the
> int3 is kprobe's single stepping int3.
> 
> Fixes: 6256e668b7af ("x86/kprobes: Use int3 instead of debug trap for single-step")
> Reported-by: Daniel Müller <deso@posteo.net>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>  arch/x86/kernel/kprobes/core.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> index 7c4ab8870da4..74167dc5f55e 100644
> --- a/arch/x86/kernel/kprobes/core.c
> +++ b/arch/x86/kernel/kprobes/core.c
> @@ -814,16 +814,20 @@ set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
>  static void kprobe_post_process(struct kprobe *cur, struct pt_regs *regs,
>  			       struct kprobe_ctlblk *kcb)
>  {
> -	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
> -		kcb->kprobe_status = KPROBE_HIT_SSDONE;
> -		cur->post_handler(cur, regs, 0);
> -	}
> -
>  	/* Restore back the original saved kprobes variables and continue. */
> -	if (kcb->kprobe_status == KPROBE_REENTER)
> +	if (kcb->kprobe_status == KPROBE_REENTER) {
> +		/* This will restore both kcb and current_kprobe */
>  		restore_previous_kprobe(kcb);
> -	else
> +	} else {
> +		/*
> +		 * Always update the kcb status because
> +		 * reset_curent_kprobe() doesn't update kcb.
> +		 */
> +		kcb->kprobe_status = KPROBE_HIT_SSDONE;
> +		if (cur->post_handler)
> +			cur->post_handler(cur, regs, 0);
>  		reset_current_kprobe();
> +	}
>  }
>  NOKPROBE_SYMBOL(kprobe_post_process);
>  
> -- 
> 2.25.1
> 
> -- 
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [tip: perf/urgent] x86/kprobes: Update kcb status flag after singlestepping
  2022-07-27 21:01 [BUG] kernel NULL pointer dereference in kprobe_int3_handler Daniel Müller
  2022-07-28  2:22 ` Chen Zhongjin
  2022-07-31 16:18 ` Masami Hiramatsu
@ 2022-08-02 11:03 ` tip-bot2 for Masami Hiramatsu (Google)
  2 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Masami Hiramatsu (Google) @ 2022-08-02 11:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: deso, Masami Hiramatsu (Google), Ingo Molnar, stable, x86, linux-kernel

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     dec8784c9088b131a1523f582c2194cfc8107dc0
Gitweb:        https://git.kernel.org/tip/dec8784c9088b131a1523f582c2194cfc8107dc0
Author:        Masami Hiramatsu (Google) <mhiramat@kernel.org>
AuthorDate:    Tue, 02 Aug 2022 15:04:16 +09:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 02 Aug 2022 12:35:04 +02:00

x86/kprobes: Update kcb status flag after singlestepping

Fix kprobes to update kcb (kprobes control block) status flag to
KPROBE_HIT_SSDONE even if the kp->post_handler is not set.

This bug may cause a kernel panic if another INT3 user runs right
after kprobes because kprobe_int3_handler() misunderstands the
INT3 is kprobe's single stepping INT3.

Fixes: 6256e668b7af ("x86/kprobes: Use int3 instead of debug trap for single-step")
Reported-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Daniel Müller <deso@posteo.net>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20220727210136.jjgc3lpqeq42yr3m@muellerd-fedora-PC2BDTX9
Link: https://lore.kernel.org/r/165942025658.342061.12452378391879093249.stgit@devnote2
---
 arch/x86/kernel/kprobes/core.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 7c4ab88..74167dc 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -814,16 +814,20 @@ set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 static void kprobe_post_process(struct kprobe *cur, struct pt_regs *regs,
 			       struct kprobe_ctlblk *kcb)
 {
-	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
-		kcb->kprobe_status = KPROBE_HIT_SSDONE;
-		cur->post_handler(cur, regs, 0);
-	}
-
 	/* Restore back the original saved kprobes variables and continue. */
-	if (kcb->kprobe_status == KPROBE_REENTER)
+	if (kcb->kprobe_status == KPROBE_REENTER) {
+		/* This will restore both kcb and current_kprobe */
 		restore_previous_kprobe(kcb);
-	else
+	} else {
+		/*
+		 * Always update the kcb status because
+		 * reset_curent_kprobe() doesn't update kcb.
+		 */
+		kcb->kprobe_status = KPROBE_HIT_SSDONE;
+		if (cur->post_handler)
+			cur->post_handler(cur, regs, 0);
 		reset_current_kprobe();
+	}
 }
 NOKPROBE_SYMBOL(kprobe_post_process);
 

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-08-02 11:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-27 21:01 [BUG] kernel NULL pointer dereference in kprobe_int3_handler Daniel Müller
2022-07-28  2:22 ` Chen Zhongjin
2022-07-31 16:18 ` Masami Hiramatsu
2022-08-01 19:42   ` Daniel Müller
2022-08-02 11:03 ` [tip: perf/urgent] x86/kprobes: Update kcb status flag after singlestepping tip-bot2 for Masami Hiramatsu (Google)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.