bpf.vger.kernel.org archive mirror
* [PATCH bpf] bpf: refcount task stack in bpf_get_task_stack
@ 2021-04-01  0:07 Dave Marchevsky
  2021-04-01  6:48 ` Song Liu
  2021-04-01 21:00 ` Alexei Starovoitov
  0 siblings, 2 replies; 4+ messages in thread
From: Dave Marchevsky @ 2021-04-01  0:07 UTC (permalink / raw)
  To: bpf
  Cc: kernel-team, Alexei Starovoitov, Daniel Borkmann, Song Liu,
	Dave Marchevsky

On x86 the struct pt_regs * grabbed by task_pt_regs() points to an
offset of task->stack. The pt_regs are later dereferenced in
__bpf_get_stack (e.g. by user_mode() check). This can cause a fault if
the task in question exits while bpf_get_task_stack is executing, as
warned by task_stack_page's comment:

* When accessing the stack of a non-current task that might exit, use
* try_get_task_stack() instead.  task_stack_page will return a pointer
* that could get freed out from under you.

Take the comment's advice and use try_get_task_stack() and
put_task_stack() to hold a reference on task->stack, or bail early if
the refcount is already 0. Incrementing stack_refcount ensures the
task's stack sticks around while we're using its data.
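
As a rough illustration of the get/use/put pattern described above, here
is a minimal userspace sketch. It is not the kernel implementation: the
names model_task, model_try_get_stack, model_put_stack and
model_read_stack are hypothetical stand-ins, and the refcount is modeled
with C11 atomics on the assumption that the "try get" step behaves like
an increment-if-not-zero.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a task and its stack refcount. */
struct model_task {
	atomic_int stack_refcount;	/* 0 means the stack is already gone */
	const char *stack;		/* stands in for task->stack */
};

/* Take a reference only if the count is still nonzero (inc-not-zero). */
static bool model_try_get_stack(struct model_task *t)
{
	int old = atomic_load(&t->stack_refcount);

	while (old != 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&t->stack_refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the last put releases the modeled stack. */
static void model_put_stack(struct model_task *t)
{
	int old = atomic_fetch_sub(&t->stack_refcount, 1);

	assert(old > 0);
	if (old == 1)
		t->stack = NULL;
}

/* Same shape as the patched helper: get, use, put, or bail with -EFAULT. */
static long model_read_stack(struct model_task *t, char *out)
{
	if (!model_try_get_stack(t))
		return -EFAULT;

	*out = t->stack[0];	/* safe: we hold a reference here */
	model_put_stack(t);
	return 0;
}
```

A caller that sees -EFAULT from model_read_stack() knows the modeled
stack was already released and that nothing was dereferenced.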

I noticed this bug while testing a bpf task iter similar to
bpf_iter_task_stack in selftests, except that mine grabbed the user
stack. It crashed intermittently, producing dumps like:

  BUG: unable to handle page fault for address: 0000000000003fe0
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  RIP: 0010:__bpf_get_stack+0xd0/0x230
  <snip...>
  Call Trace:
  bpf_prog_0a2be35c092cb190_get_task_stacks+0x5d/0x3ec
  bpf_iter_run_prog+0x24/0x81
  __task_seq_show+0x58/0x80
  bpf_seq_read+0xf7/0x3d0
  vfs_read+0x91/0x140
  ksys_read+0x59/0xd0
  do_syscall_64+0x48/0x120
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
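
The faulting address can be checked against the "offset of task->stack"
description with a little arithmetic. The sketch below is only a model
under stated assumptions: a 16 KiB kernel stack, no padding at the top
of the stack, an x86-64 pt_regs of 21 eight-byte slots with cs as the
18th, and task->stack reading as NULL once the stack has been released.
The model_* names are hypothetical. Under those assumptions the cs read
done by user_mode() would land on 0x3fe0, which would be consistent
with the dump above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed x86-64 register frame layout: 21 eight-byte slots. */
struct model_pt_regs {
	uint64_t r15, r14, r13, r12, bp, bx;
	uint64_t r11, r10, r9, r8, ax, cx, dx, si, di;
	uint64_t orig_ax, ip, cs, flags, sp, ss;
};

static_assert(sizeof(struct model_pt_regs) == 168, "unexpected layout");

#define MODEL_THREAD_SIZE 0x4000u	/* assumed: 16 KiB kernel stack */

/* The register frame sits at the very top of the stack area. */
static uintptr_t model_task_pt_regs(uintptr_t stack_base)
{
	return stack_base + MODEL_THREAD_SIZE - sizeof(struct model_pt_regs);
}

/* Address touched when reading regs->cs for a given stack base. */
static uintptr_t model_cs_address(uintptr_t stack_base)
{
	return model_task_pt_regs(stack_base) + offsetof(struct model_pt_regs, cs);
}
```

With a stack base of 0, model_task_pt_regs() gives 0x3f58 and
model_cs_address() gives 0x3fe0.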

Fixes: fa28dcb82a38 ("bpf: Introduce helper bpf_get_task_stack()")
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
---
 kernel/bpf/stackmap.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index be35bfb7fb13..6fbc2abe9c91 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -517,9 +517,17 @@ const struct bpf_func_proto bpf_get_stack_proto = {
 BPF_CALL_4(bpf_get_task_stack, struct task_struct *, task, void *, buf,
 	   u32, size, u64, flags)
 {
-	struct pt_regs *regs = task_pt_regs(task);
+	struct pt_regs *regs;
+	long res;
 
-	return __bpf_get_stack(regs, task, NULL, buf, size, flags);
+	if (!try_get_task_stack(task))
+		return -EFAULT;
+
+	regs = task_pt_regs(task);
+	res = __bpf_get_stack(regs, task, NULL, buf, size, flags);
+	put_task_stack(task);
+
+	return res;
 }
 
 BTF_ID_LIST_SINGLE(bpf_get_task_stack_btf_ids, struct, task_struct)
-- 
2.30.2



* Re: [PATCH bpf] bpf: refcount task stack in bpf_get_task_stack
  2021-04-01  0:07 [PATCH bpf] bpf: refcount task stack in bpf_get_task_stack Dave Marchevsky
@ 2021-04-01  6:48 ` Song Liu
  2021-04-01 17:47   ` Song Liu
  2021-04-01 21:00 ` Alexei Starovoitov
  1 sibling, 1 reply; 4+ messages in thread
From: Song Liu @ 2021-04-01  6:48 UTC (permalink / raw)
  To: Dave Marchevsky
  Cc: open list:BPF (Safe dynamic programs and tools),
	Kernel Team, Alexei Starovoitov, Daniel Borkmann



Thanks for the fix!

Acked-by: Song Liu <songliubraving@fb.com>

Could you please extend bpf_iter_task_stack to also grab user stack? 

Thanks,
Song

[...]


* Re: [PATCH bpf] bpf: refcount task stack in bpf_get_task_stack
  2021-04-01  6:48 ` Song Liu
@ 2021-04-01 17:47   ` Song Liu
  0 siblings, 0 replies; 4+ messages in thread
From: Song Liu @ 2021-04-01 17:47 UTC (permalink / raw)
  To: Dave Marchevsky
  Cc: open list:BPF (Safe dynamic programs and tools),
	Kernel Team, Alexei Starovoitov, Daniel Borkmann



> On Mar 31, 2021, at 11:48 PM, Song Liu <songliubraving@fb.com> wrote:
> 
> Thanks for the fix!
> 
> Acked-by: Song Liu <songliubraving@fb.com>
> 
> Could you please extend bpf_iter_task_stack to also grab user stack? 

I think we can extend bpf_iter_task_stack in a follow up patch. It is
not necessary to bundle these two patches in the same set. 

Thanks,
Song


* Re: [PATCH bpf] bpf: refcount task stack in bpf_get_task_stack
  2021-04-01  0:07 [PATCH bpf] bpf: refcount task stack in bpf_get_task_stack Dave Marchevsky
  2021-04-01  6:48 ` Song Liu
@ 2021-04-01 21:00 ` Alexei Starovoitov
  1 sibling, 0 replies; 4+ messages in thread
From: Alexei Starovoitov @ 2021-04-01 21:00 UTC (permalink / raw)
  To: Dave Marchevsky
  Cc: bpf, Kernel Team, Alexei Starovoitov, Daniel Borkmann, Song Liu

Applied. Thanks
