linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch <linux-arch@vger.kernel.org>,
	Jens Axboe <axboe@kernel.dk>, Oleg Nesterov <oleg@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Richard Henderson <rth@twiddle.net>,
	Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
	Matt Turner <mattst88@gmail.com>,
	alpha <linux-alpha@vger.kernel.org>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	linux-m68k <linux-m68k@lists.linux-m68k.org>,
	Arnd Bergmann <arnd@kernel.org>,
	Ley Foon Tan <ley.foon.tan@intel.com>, Tejun Heo <tj@kernel.org>,
	Daniel Jacobowitz <drow@nevyn.them.org>,
	Kees Cook <keescook@chromium.org>
Subject: Re: Kernel stack read with PTRACE_EVENT_EXIT and io_uring threads
Date: Fri, 11 Jun 2021 16:39:44 -0500	[thread overview]
Message-ID: <87pmwsytb3.fsf@disp2133> (raw)
In-Reply-To: <CAHk-=wjiBXCZBxLiCG5hxpd0vMkMjiocenponWygG5SCG6DXNw@mail.gmail.com> (Linus Torvalds's message of "Thu, 10 Jun 2021 15:04:33 -0700")

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, Jun 10, 2021 at 1:58 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>
>> The problem is sometimes we read all of the registers from
>> a context where they are not all saved.
>
> Ouch. Yes. And this is really painful because none of the *normal*
> architectures do this, so it gets absolutely no coverage.
>
>> I think at this point we need to say that the architectures that have a
>> do this need to be fixed to at least call do_exit and the kernel
>> function in create_io_thread with the deeper stack.
>
> Yeah. We traditionally have that requirement for fork() and friends
> too (vfork/clone), so adding exit and io_uring to do so seems like the
> most straightforward thing.

Interesting.  I am starting with Al's analysis and reading the code
to see if I can understand what is going on.  So I am still glossing
over a few details as I dig into this.  Kernel threads not having
all of their registers saved is one of those details.

Looking at copy_thread it looks like at least on alpha we are dealing
with a structure that defines all of the registers in copy_thread.  So
perhaps all of the registers are there in kernel_threads already.  I
don't read alpha assembly very well and fork is a bit subtle.  I don't
know which piece of code is calling
ret_from_fork/ret_from_kernel_thread.

I really suspect that all of those registers are popped so at least for
IO_THREADS we need to push them again, in a way that signal_pt_regs()
can find them.

It looks like we just need something like this to cover the userspace
side of exit.

diff --git a/arch/alpha/kernel/entry.S b/arch/alpha/kernel/entry.S
index e227f3a29a43..ab0dcb545bd1 100644
--- a/arch/alpha/kernel/entry.S
+++ b/arch/alpha/kernel/entry.S
@@ -812,6 +812,22 @@ fork_like fork
 fork_like vfork
 fork_like clone
 
+.macro exit_like name
+	.align	4
+	.globl	alpha_\name
+	.ent	alpha_\name
+alpha_\name:
+	.prologue 0
+	DO_SWITCH_STACK
+	jsr	$26, sys_\name
+	UNDO_SWITCH_STACK
+	ret
+.end	alpha_\name
+.endm
+
+exit_like exit
+exit_like exit_group
+
 .macro	sigreturn_like name
 	.align	4
 	.globl	sys_\name
diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 3000a2e8ee21..b9d6449d6caa 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -8,7 +8,7 @@
 # The <abi> is always "common" for this file
 #
 0	common	osf_syscall			alpha_syscall_zero
-1	common	exit				sys_exit
+1	common	exit				alpha_exit
 2	common	fork				alpha_fork
 3	common	read				sys_read
 4	common	write				sys_write
@@ -333,7 +333,7 @@
 400	common	io_getevents			sys_io_getevents
 401	common	io_submit			sys_io_submit
 402	common	io_cancel			sys_io_cancel
-405	common	exit_group			sys_exit_group
+405	common	exit_group			alpha_exit_group
 406	common	lookup_dcookie			sys_lookup_dcookie
 407	common	epoll_create			sys_epoll_create
 408	common	epoll_ctl			sys_epoll_ctl


> But I really wish we had some way to test and trigger this so that we
> wouldn't get caught on this before. Something in task_pt_regs() that
> catches "this doesn't actually work" and does a WARN_ON_ONCE() on the
> affected architectures?

I think that would require pushing an extra magic value in SWITCH_STACK
and not just popping it but deliberately changing that value in
UNDO_SWITCH_STACK.  Basically stack canaries.

I don't see how we could do it in an arch independent way though.
Which means it will require auditing all of the architectures to get
there. Volunteers?

This is looking straight forward enough that I can probably pull
something together, just don't count on me to have it done in anything
resembling a timely manner.

Eric

  reply	other threads:[~2021-06-11 21:40 UTC|newest]

Thread overview: 112+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-10 20:57 Kernel stack read with PTRACE_EVENT_EXIT and io_uring threads Eric W. Biederman
2021-06-10 22:04 ` Linus Torvalds
2021-06-11 21:39   ` Eric W. Biederman [this message]
2021-06-11 23:26     ` Linus Torvalds
2021-06-13 21:54       ` Eric W. Biederman
2021-06-13 22:18         ` Linus Torvalds
2021-06-14  2:05           ` Michael Schmitz
2021-06-14  5:03             ` Michael Schmitz
2021-06-14 16:26               ` Eric W. Biederman
2021-06-14 22:26                 ` Michael Schmitz
2021-06-15 19:30                   ` Eric W. Biederman
2021-06-15 19:36                     ` [PATCH] alpha: Add extra switch_stack frames in exit, exec, and kernel threads Eric W. Biederman
2021-06-15 22:02                       ` Linus Torvalds
2021-06-16 16:32                         ` Eric W. Biederman
2021-06-16 18:29                           ` [PATCH 0/2] alpha/ptrace: Improved switch_stack handling Eric W. Biederman
2021-06-16 18:31                             ` [PATCH 1/2] alpha/ptrace: Record and handle the absence of switch_stack Eric W. Biederman
2021-06-16 20:00                               ` Linus Torvalds
2021-06-16 20:37                                 ` Linus Torvalds
2021-06-16 20:57                                   ` Eric W. Biederman
2021-06-16 21:02                                     ` Al Viro
2021-06-16 21:08                                     ` Linus Torvalds
2021-06-16 20:42                                 ` Eric W. Biederman
2021-06-16 20:17                               ` Al Viro
2021-06-21  2:01                               ` Michael Schmitz
2021-06-21  2:17                                 ` Linus Torvalds
2021-06-21  3:18                                   ` Michael Schmitz
2021-06-21  3:37                                     ` Linus Torvalds
2021-06-21  4:08                                       ` Michael Schmitz
2021-06-21  3:44                                     ` Al Viro
2021-06-21  5:31                                       ` Michael Schmitz
2021-06-21  2:27                                 ` Al Viro
2021-06-21  3:36                                   ` Michael Schmitz
2021-06-16 18:32                             ` [PATCH 2/2] alpha/ptrace: Add missing switch_stack frames Eric W. Biederman
2021-06-16 20:25                               ` Al Viro
2021-06-16 20:28                                 ` Al Viro
2021-06-16 20:49                                   ` Eric W. Biederman
2021-06-16 20:54                                     ` Al Viro
2021-06-16 20:47                                 ` Eric W. Biederman
2021-06-16 20:55                                   ` Al Viro
2021-06-16 20:50                       ` [PATCH] alpha: Add extra switch_stack frames in exit, exec, and kernel threads Al Viro
2021-06-15 20:56                     ` Kernel stack read with PTRACE_EVENT_EXIT and io_uring threads Michael Schmitz
2021-06-16  0:23                       ` Finn Thain
2021-06-15 21:58                     ` Linus Torvalds
2021-06-16 15:06                       ` Eric W. Biederman
2021-06-21 13:54                       ` Al Viro
2021-06-21 14:16                         ` Al Viro
2021-06-21 16:50                           ` Eric W. Biederman
2021-06-21 23:05                             ` Al Viro
2021-06-22 16:39                               ` Eric W. Biederman
2021-06-21 15:38                         ` Linus Torvalds
2021-06-21 18:59                         ` Al Viro
2021-06-21 19:22                           ` Linus Torvalds
2021-06-21 19:45                             ` Al Viro
2021-06-21 23:14                               ` Linus Torvalds
2021-06-21 23:23                                 ` Al Viro
2021-06-21 23:36                                   ` Linus Torvalds
2021-06-22 21:02                                     ` Eric W. Biederman
2021-06-22 21:48                                       ` Michael Schmitz
2021-06-23  5:26                                         ` Michael Schmitz
2021-06-23 14:36                                           ` Eric W. Biederman
2021-06-22  0:01                                 ` Michael Schmitz
2021-06-22 20:04                                 ` Michael Schmitz
2021-06-22 20:18                                   ` Al Viro
2021-06-22 21:57                                     ` Michael Schmitz
2021-06-21 20:03                             ` Eric W. Biederman
2021-06-21 23:15                               ` Linus Torvalds
2021-06-22 20:52                                 ` Eric W. Biederman
2021-06-23  0:41                                   ` Linus Torvalds
2021-06-23 14:33                                     ` Eric W. Biederman
2021-06-24 18:57                                       ` [PATCH 0/9] Refactoring exit Eric W. Biederman
2021-06-24 18:59                                         ` [PATCH 1/9] signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL) Eric W. Biederman
2021-06-24 18:59                                         ` [PATCH 2/9] signal/seccomp: Refactor seccomp signal and coredump generation Eric W. Biederman
2021-06-26  3:17                                           ` Kees Cook
2021-06-28 19:21                                             ` Eric W. Biederman
2021-06-28 14:34                                           ` [signal/seccomp] 3fdd8c68c2: kernel-selftests.seccomp.seccomp_bpf.fail kernel test robot
2021-06-24 19:00                                         ` [PATCH 3/9] signal/seccomp: Dump core when there is only one live thread Eric W. Biederman
2021-06-26  3:20                                           ` Kees Cook
2021-06-24 19:01                                         ` [PATCH 4/9] signal: Factor start_group_exit out of complete_signal Eric W. Biederman
2021-06-24 20:04                                           ` Linus Torvalds
2021-06-25  8:47                                           ` kernel test robot
2021-06-26  3:24                                           ` Kees Cook
2021-06-24 19:01                                         ` [PATCH 5/9] signal/group_exit: Use start_group_exit in place of do_group_exit Eric W. Biederman
2021-06-26  3:35                                           ` Kees Cook
2021-06-24 19:02                                         ` [PATCH 6/9] signal: Fold do_group_exit into get_signal fixing io_uring threads Eric W. Biederman
2021-06-26  3:42                                           ` Kees Cook
2021-06-28 19:25                                             ` Eric W. Biederman
2021-06-24 19:02                                         ` [PATCH 7/9] signal: Make individual tasks exiting a first class concept Eric W. Biederman
2021-06-24 20:11                                           ` Linus Torvalds
2021-06-24 21:37                                             ` Eric W. Biederman
2021-06-24 19:03                                         ` [PATCH 8/9] signal/task_exit: Use start_task_exit in place of do_exit Eric W. Biederman
2021-06-26  5:56                                           ` Kees Cook
2021-06-24 19:03                                         ` [PATCH 9/9] signal: Move PTRACE_EVENT_EXIT into get_signal Eric W. Biederman
2021-06-24 22:45                                         ` [PATCH 0/9] Refactoring exit Al Viro
2021-06-27 22:13                                           ` Al Viro
2021-06-27 22:59                                             ` Michael Schmitz
2021-06-28  7:31                                               ` Geert Uytterhoeven
2021-06-28 16:20                                                 ` Eric W. Biederman
2021-06-28 17:14                                                 ` Michael Schmitz
2021-06-28 19:17                                                   ` Geert Uytterhoeven
2021-06-28 20:13                                                     ` Michael Schmitz
2021-06-28 21:18                                                       ` Geert Uytterhoeven
2021-06-28 23:42                                                         ` Michael Schmitz
2021-06-29 20:28                                                           ` [CFT][PATCH] exit/bdflush: Remove the deprecated bdflush system call Eric W. Biederman
2021-06-29 21:45                                                             ` Michael Schmitz
2021-06-30  8:24                                                             ` Geert Uytterhoeven
2021-06-30  8:37                                                             ` Arnd Bergmann
2021-06-30 12:30                                                             ` Cyril Hrubis
2021-06-28 19:02                                           ` [PATCH 0/9] Refactoring exit Eric W. Biederman
2021-06-21 19:24                           ` Kernel stack read with PTRACE_EVENT_EXIT and io_uring threads Al Viro
2021-06-21 23:24                             ` Michael Schmitz
2021-06-16  7:38                     ` Geert Uytterhoeven
2021-06-16 19:40                       ` Michael Schmitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87pmwsytb3.fsf@disp2133 \
    --to=ebiederm@xmission.com \
    --cc=arnd@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=drow@nevyn.them.org \
    --cc=geert@linux-m68k.org \
    --cc=ink@jurassic.park.msu.ru \
    --cc=keescook@chromium.org \
    --cc=ley.foon.tan@intel.com \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-m68k@lists.linux-m68k.org \
    --cc=mattst88@gmail.com \
    --cc=oleg@redhat.com \
    --cc=rth@twiddle.net \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).