linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Dmitry Osipenko <digetx@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-api@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Alexey Gladkov <legion@kernel.org>, Kyle Huey <me@kylehuey.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Kees Cook <keescook@chromium.org>,
	Al Viro <viro@ZenIV.linux.org.uk>
Subject: Re: [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit special case
Date: Tue, 04 Jan 2022 10:18:01 -0600	[thread overview]
Message-ID: <877dbfmq4m.fsf@email.froward.int.ebiederm.org> (raw)
In-Reply-To: <9363765f-9883-75ee-70f1-a1a8e9841812@gmail.com> (Dmitry Osipenko's message of "Tue, 4 Jan 2022 09:30:45 +0300")

Dmitry Osipenko <digetx@gmail.com> writes:

> 14.12.2021 01:53, Eric W. Biederman пишет:
>> Simplify the code that allows SIGKILL during coredumps to terminate
>> the coredump.  As far as I can tell I have avoided breaking it
>> by dumb luck.
>> 
>> Historically with all of the other threads stopping in exit_mm the
>> wants_signal loop in complete_signal would find the dumper task and
>> then complete_signal would wake the dumper task with signal_wake_up.
>> 
>> After moving the coredump_task_exit above the setting of PF_EXITING in
>> commit 92307383082d ("coredump: Don't perform any cleanups before
>> dumping core") wants_signal will consider all of the threads in a
>> multi-threaded process for waking up, not just the core dumping task.
>> 
>> Luckily complete_signal short circuits SIGKILL during a coredump marks
>> every thread with SIGKILL and signal_wake_up.  This code is arguably
>> buggy however as it tries to skip creating a group exit when is already
>> present, and it fails that a coredump is in progress.
>> 
>> Ever since commit 06af8679449d ("coredump: Limit what can interrupt
>> coredumps") was added dump_interrupted needs not just TIF_SIGPENDING
>> set on the dumper task but also SIGKILL set in it's pending bitmap.
>> This means that if the code is ever fixed not to short-circuit and
>> kill a process after it has already been killed the special case
>> for SIGKILL during a coredump will be broken.
>> 
>> Sort all of this out by making the coredump special case more special,
>> and perform all of the work in prepare_signal and leave the rest of
>> the signal delivery path out of it.
>> 
>> In prepare_signal when the process coredumping is sent SIGKILL find
>> the task performing the coredump and use sigaddset and signal_wake_up
>> to ensure that task reports fatal_signal_pending.
>> 
>> Return false from prepare_signal to tell the rest of the signal
>> delivery path to ignore the signal.
>> 
>> Update wait_for_dump_helpers to perform a wait_event_killable wait
>> so that if signal_pending gets set spuriously the wait will not
>> be interrupted unless fatal_signal_pending is true.
>> 
>> I have tested this and verified I did not break SIGKILL during
>> coredumps by accident (before or after this change).  I actually
>> thought I had and I had to figure out what I had misread that kept
>> SIGKILL during coredumps working.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  fs/coredump.c   |  4 ++--
>>  kernel/signal.c | 11 +++++++++--
>>  2 files changed, 11 insertions(+), 4 deletions(-)
>> 
>> diff --git a/fs/coredump.c b/fs/coredump.c
>> index a6b3c196cdef..7b91fb32dbb8 100644
>> --- a/fs/coredump.c
>> +++ b/fs/coredump.c
>> @@ -448,7 +448,7 @@ static void coredump_finish(bool core_dumped)
>>  static bool dump_interrupted(void)
>>  {
>>  	/*
>> -	 * SIGKILL or freezing() interrupt the coredumping. Perhaps we
>> +	 * SIGKILL or freezing() interrupted the coredumping. Perhaps we
>>  	 * can do try_to_freeze() and check __fatal_signal_pending(),
>>  	 * but then we need to teach dump_write() to restart and clear
>>  	 * TIF_SIGPENDING.
>> @@ -471,7 +471,7 @@ static void wait_for_dump_helpers(struct file *file)
>>  	 * We actually want wait_event_freezable() but then we need
>>  	 * to clear TIF_SIGPENDING and improve dump_interrupted().
>>  	 */
>> -	wait_event_interruptible(pipe->rd_wait, pipe->readers == 1);
>> +	wait_event_killable(pipe->rd_wait, pipe->readers == 1);
>>  
>>  	pipe_lock(pipe);
>>  	pipe->readers--;
>> diff --git a/kernel/signal.c b/kernel/signal.c
>> index 8272cac5f429..7e305a8ec7c2 100644
>> --- a/kernel/signal.c
>> +++ b/kernel/signal.c
>> @@ -907,8 +907,15 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
>>  	sigset_t flush;
>>  
>>  	if (signal->flags & (SIGNAL_GROUP_EXIT | SIGNAL_GROUP_COREDUMP)) {
>> -		if (!(signal->flags & SIGNAL_GROUP_EXIT))
>> -			return sig == SIGKILL;
>> +		struct core_state *core_state = signal->core_state;
>> +		if (core_state) {
>> +			if (sig == SIGKILL) {
>> +				struct task_struct *dumper = core_state->dumper.task;
>> +				sigaddset(&dumper->pending.signal, SIGKILL);
>> +				signal_wake_up(dumper, 1);
>> +			}
>> +			return false;
>> +		}
>>  		/*
>>  		 * The process is in the middle of dying, nothing to do.
>>  		 */
>> 
>
> Hi,
>
> This patch breaks userspace, in particular it breaks gst-plugin-scanner
> of GStreamer which hangs now on next-20211224. IIUC, this tool builds a
> registry of good/working GStreamer plugins by loading them and
> blacklisting those that don't work (crash). Before the hang I see
> systemd-coredump process running, taking snapshot of gst-plugin-scanner
> and then gst-plugin-scanner gets stuck.
>
> Bisection points at this patch, reverting it restores
> gst-plugin-scanner. Systemd-coredump still running, but there is no hang
> anymore and everything works properly as before.
>
> I'm seeing this problem on ARM32 and haven't checked other arches.
> Please fix, thanks in advance.

That is weird.

Doubly weird really because this should only change the case where
coredumps are interrupted by SIGKILL.

What distro are you running?  I would like to match things as closely
as I can.  So I can reproduce the issue so I can figure out what
is wrong so I can fix it.

Eric

  reply	other threads:[~2022-01-04 16:18 UTC|newest]

Thread overview: 193+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-08 20:17 [PATCH 00/10] Removal of most do_exit calls Eric W. Biederman
2021-12-08 20:25 ` [PATCH 01/10] exit/s390: Remove dead reference to do_exit from copy_thread Eric W. Biederman
2021-12-12 17:48   ` Heiko Carstens
2021-12-13 14:50     ` Eric W. Biederman
2022-01-05  4:25     ` Al Viro
2021-12-08 20:25 ` [PATCH 02/10] exit: Add and use make_task_dead Eric W. Biederman
2022-01-05  5:01   ` Al Viro
2022-01-05 20:46     ` Eric W. Biederman
2022-01-05 21:53       ` Al Viro
2022-01-05 22:51         ` Linus Torvalds
2022-01-05 23:34           ` Al Viro
2021-12-08 20:25 ` [PATCH 03/10] exit: Move oops specific logic from do_exit into make_task_dead Eric W. Biederman
2022-01-05  5:48   ` Al Viro
2022-01-06  7:08     ` Al Viro
2022-01-07  3:42     ` Al Viro
2022-01-07 19:02       ` Eric W. Biederman
2022-01-07 18:59     ` Eric W. Biederman
2022-01-17  8:05       ` Christoph Hellwig
2022-01-17 12:15         ` Heiko Carstens
2022-01-17 13:17           ` Christoph Hellwig
2022-01-17 13:24         ` Arnd Bergmann
2022-01-17 13:27           ` [PATCH] microblaze: remove CONFIG_SET_FS Arnd Bergmann
2022-02-09 13:50             ` Michal Simek
2022-02-09 13:52               ` Christoph Hellwig
2022-02-09 14:03                 ` Michal Simek
2022-02-09 14:40               ` Arnd Bergmann
2022-02-09 14:44                 ` Michal Simek
2022-02-09 14:54                   ` Arnd Bergmann
2022-02-09 23:31                     ` Stafford Horne
2022-02-11  0:17                       ` Stafford Horne
2022-02-11 16:59                         ` Arnd Bergmann
2022-02-11 17:46                           ` Linus Torvalds
2022-02-11 20:57                             ` Arnd Bergmann
2022-02-11 21:10                               ` Eric W. Biederman
2022-02-11 22:21                                 ` Stafford Horne
2022-02-14  7:41                             ` Christoph Hellwig
2022-02-14  7:50                           ` Christoph Hellwig
2022-02-14 16:20                             ` Arnd Bergmann
2021-12-08 20:25 ` [PATCH 04/10] exit: Stop poorly open coding do_task_dead in make_task_dead Eric W. Biederman
2022-01-05  5:58   ` Al Viro
2022-01-05 22:33     ` Eric W. Biederman
2021-12-08 20:25 ` [PATCH 05/10] exit: Stop exporting do_exit Eric W. Biederman
2022-01-05  6:02   ` Al Viro
2022-01-05 22:36     ` Eric W. Biederman
2021-12-08 20:25 ` [PATCH 06/10] exit: Implement kthread_exit Eric W. Biederman
2022-01-07  2:27   ` Al Viro
2022-01-08 18:35     ` Eric W. Biederman
2022-01-08 22:44       ` David Laight
2022-01-10 15:00         ` Eric W. Biederman
2022-01-09  3:27       ` Al Viro
2022-01-10 15:05         ` Eric W. Biederman
2021-12-08 20:25 ` [PATCH 07/10] exit: Rename module_put_and_exit to module_put_and_kthread_exit Eric W. Biederman
2021-12-08 20:25 ` [PATCH 08/10] exit: Rename complete_and_exit to kthread_complete_and_exit Eric W. Biederman
2021-12-08 20:25 ` [PATCH 09/10] kthread: Ensure struct kthread is present for all kthreads Eric W. Biederman
2021-12-22 18:19   ` Nathan Chancellor
2021-12-22 18:30     ` Eric W. Biederman
2021-12-22 18:46       ` Nathan Chancellor
2021-12-22 23:22         ` Eric W. Biederman
2021-12-23  0:37           ` Nathan Chancellor
2021-12-23  1:44           ` Linus Torvalds
2021-12-23  3:34             ` Eric W. Biederman
2021-12-23  5:19               ` [PATCH] kthread: Generalize pf_io_worker so it can point to struct kthread Eric W. Biederman
2021-12-23 17:20                 ` Linus Torvalds
2022-01-07  3:59   ` [PATCH 09/10] kthread: Ensure struct kthread is present for all kthreads Al Viro
2022-01-08 18:20     ` Eric W. Biederman
2021-12-08 20:25 ` [PATCH 10/10] exit/kthread: Move the exit code for kernel threads into struct kthread Eric W. Biederman
2022-01-07  3:22   ` Al Viro
2021-12-13 22:50 ` [PATCH 0/8] signal: Cleanup of the signal->flags Eric W. Biederman
2022-01-03 21:30   ` [PATCH 00/17] exit: Making task exiting a first class concept Eric W. Biederman
2022-01-03 21:32     ` [PATCH 01/17] exit: Remove profile_task_exit & profile_munmap Eric W. Biederman
2022-01-04  7:38       ` Christoph Hellwig
2022-01-07  3:48       ` Al Viro
2022-01-08 16:10         ` Eric W. Biederman
2022-01-03 21:32     ` [PATCH 02/17] exit: Coredumps reach do_group_exit Eric W. Biederman
2022-01-03 21:32     ` [PATCH 03/17] exit: Fix the exit_code for wait_task_zombie Eric W. Biederman
2022-01-03 21:32     ` [PATCH 04/17] exit: Use the correct exit_code in /proc/<pid>/stat Eric W. Biederman
2022-01-03 21:33     ` [PATCH 05/17] taskstats: Cleanup the use of task->exit_code Eric W. Biederman
2022-01-03 21:33     ` [PATCH 06/17] ptrace: Remove second setting of PT_SEIZED in ptrace_attach Eric W. Biederman
2022-01-03 21:33     ` [PATCH 07/17] ptrace: Remove unused regs argument from ptrace_report_syscall Eric W. Biederman
2022-01-03 21:33     ` [PATCH 08/17] ptrace/m68k: Stop open coding ptrace_report_syscall Eric W. Biederman
2022-01-10 15:26       ` Geert Uytterhoeven
2022-01-10 16:20         ` Al Viro
2022-01-10 16:25           ` Al Viro
2022-01-10 17:54           ` Geert Uytterhoeven
2022-01-10 20:37             ` Al Viro
2022-01-10 21:18               ` Eric W. Biederman
2022-01-11  1:33             ` Michael Schmitz
2022-01-11 22:42               ` Finn Thain
2022-01-12  0:20                 ` Michael Schmitz
2022-01-12  3:32                   ` Finn Thain
2022-01-12  7:54                     ` Michael Schmitz
2022-01-12  7:55                   ` Geert Uytterhoeven
2022-01-12  8:05                     ` Michael Schmitz
2022-01-03 21:33     ` [PATCH 09/17] ptrace: Move setting/clearing ptrace_message into ptrace_stop Eric W. Biederman
2022-01-03 21:33     ` [PATCH 10/17] ptrace: Return the signal to continue with from ptrace_stop Eric W. Biederman
2022-01-03 21:33     ` [PATCH 11/17] ptrace: Separate task->ptrace_code out from task->exit_code Eric W. Biederman
2022-01-03 21:33     ` [PATCH 12/17] signal: Compute the process exit_code in get_signal Eric W. Biederman
2022-01-03 21:33     ` [PATCH 13/17] signal: Make individual tasks exiting a first class concept Eric W. Biederman
2022-01-03 21:33     ` [PATCH 14/17] signal: Remove zap_other_threads Eric W. Biederman
2022-01-03 21:33     ` [PATCH 15/17] signal: Add JOBCTL_WILL_EXIT to mark exiting tasks Eric W. Biederman
2022-01-03 21:33     ` [PATCH 16/17] signal: Record the exit_code when an exit is scheduled Eric W. Biederman
2022-01-03 21:33     ` [PATCH 17/17] signal: Always set SIGNAL_GROUP_EXIT on process exit Eric W. Biederman
2022-03-09  0:13     ` [PATCH 00/13] Removing tracehook.h Eric W. Biederman
2022-03-09 16:24       ` [PATCH 01/13] ptrace: Move ptrace_report_syscall into ptrace.h Eric W. Biederman
2022-03-09 22:19         ` Kees Cook
2022-03-09 16:24       ` [PATCH 02/13] ptrace/arm: Rename tracehook_report_syscall report_syscall Eric W. Biederman
2022-03-09 22:20         ` Kees Cook
2022-03-09 16:24       ` [PATCH 03/13] ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h Eric W. Biederman
2022-03-09 22:26         ` Kees Cook
2022-03-09 16:24       ` [PATCH 04/13] ptrace: Remove arch_syscall_{enter,exit}_tracehook Eric W. Biederman
2022-03-09 22:29         ` Kees Cook
2022-03-09 16:24       ` [PATCH 05/13] ptrace: Remove tracehook_signal_handler Eric W. Biederman
2022-03-09 22:29         ` Kees Cook
2022-03-09 16:24       ` [PATCH 06/13] task_work: Remove unnecessary include from posix_timers.h Eric W. Biederman
2022-03-09 22:30         ` Kees Cook
2022-03-09 16:24       ` [PATCH 07/13] task_work: Introduce task_work_pending Eric W. Biederman
2022-03-09 21:05         ` Jens Axboe
2022-03-09 23:24           ` Eric W. Biederman
2022-03-09 23:26             ` Jens Axboe
2022-03-09 22:31         ` Kees Cook
2022-03-09 16:24       ` [PATCH 08/13] task_work: Call tracehook_notify_signal from get_signal on all architectures Eric W. Biederman
2022-03-10  5:57         ` Kees Cook
2022-03-10 19:04           ` Eric W. Biederman
2022-03-10 20:00             ` Kees Cook
2022-03-09 16:24       ` [PATCH 09/13] task_work: Decouple TIF_NOTIFY_SIGNAL and task_work Eric W. Biederman
2022-03-09 22:36         ` Kees Cook
2022-03-09 16:24       ` [PATCH 10/13] signal: Move set_notify_signal and clear_notify_signal into sched/signal.h Eric W. Biederman
2022-03-09 22:36         ` Kees Cook
2022-03-09 16:24       ` [PATCH 11/13] resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume Eric W. Biederman
2022-03-09 22:39         ` Kees Cook
2022-03-09 16:24       ` [PATCH 12/13] resume_user_mode: Move to resume_user_mode.h Eric W. Biederman
2022-03-09 22:54         ` Kees Cook
2022-03-09 16:24       ` [PATCH 13/13] tracehook: Remove tracehook.h Eric W. Biederman
2022-03-09 22:55         ` Kees Cook
2022-03-09 21:05       ` [PATCH 00/13] Removing tracehook.h Jens Axboe
2022-03-15 23:18       ` [PATCH 0/2] ptrace: Making the ptrace changes atomic Eric W. Biederman
2022-03-15 23:21         ` [PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop Eric W. Biederman
2022-03-17 17:46           ` Oleg Nesterov
2022-03-17 19:10           ` Kees Cook
2022-03-18 14:44             ` Eric W. Biederman
2022-03-18 17:20               ` Kees Cook
2022-03-15 23:22         ` [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop Eric W. Biederman
2022-03-17 18:08           ` Oleg Nesterov
2022-03-17 18:31             ` Eric W. Biederman
2022-03-18 19:43               ` Oleg Nesterov
2022-03-18 14:40             ` Eric W. Biederman
2022-03-17 19:13           ` Kees Cook
2022-03-18 14:52             ` Eric W. Biederman
2022-03-18 17:28               ` Kees Cook
2022-03-28 23:56         ` [GIT PULL] ptrace: Cleanups for v5.18 Eric W. Biederman
2022-03-29  0:03           ` Jens Axboe
2022-03-29  0:33           ` Linus Torvalds
2022-03-29  0:53             ` Stephen Rothwell
2022-03-29  0:58               ` Linus Torvalds
2022-03-29  3:37             ` Eric W. Biederman
2022-03-29  4:49               ` Linus Torvalds
2022-03-29  5:20                 ` Linus Torvalds
2022-03-29  0:35           ` pr-tracker-bot
2022-03-09  0:15     ` [PATCH 00/13] Removing tracehook.h Eric W. Biederman
2022-03-09 20:58       ` Linus Torvalds
2021-12-13 22:53 ` [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit special case Eric W. Biederman
2022-01-04  6:30   ` Dmitry Osipenko
2022-01-04 16:18     ` Eric W. Biederman [this message]
2022-01-05 19:58     ` Eric W. Biederman
2022-01-05 21:39       ` Dmitry Osipenko
2022-01-08 18:13         ` Eric W. Biederman
2022-01-08 18:15           ` [PATCH 1/2] signal: Have prepare_signal detect coredumps using signal->core_state Eric W. Biederman
2022-01-08 18:15           ` [PATCH 2/2] signal: Make coredump handling explicit in complete_signal Eric W. Biederman
2022-01-11  8:59           ` [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit special case Dmitry Osipenko
2022-01-11 17:20             ` Eric W. Biederman
2022-01-18 17:30               ` Dmitry Osipenko
2022-01-18 17:52                 ` Eric W. Biederman
2022-01-18 18:01                   ` Dmitry Osipenko
2022-01-04 18:44   ` Linus Torvalds
2022-01-04 19:47     ` Eric W. Biederman
2022-01-08 19:13       ` Heiko Carstens
     [not found]         ` <87ilurwjju.fsf@email.froward.int.ebiederm.org>
     [not found]           ` <87o84juwhg.fsf@email.froward.int.ebiederm.org>
2022-01-10 23:00             ` Olivier Langlois
2022-01-11 17:28               ` Eric W. Biederman
2022-01-11 18:51                 ` Eric W. Biederman
2022-01-11 19:19                   ` Linus Torvalds
2022-01-15  0:12                     ` Eric W. Biederman
2022-01-15 19:23                       ` Olivier Langlois
2022-01-17 16:09                         ` Eric W. Biederman
2022-01-17 18:46                           ` io_uring truncating coredumps Eric W. Biederman
2022-01-18  4:23                             ` Linus Torvalds
2022-01-26 15:06                           ` [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit special case Olivier Langlois
2021-12-13 22:53 ` [PATCH 2/8] signal: Drop signals received after a fatal signal has been processed Eric W. Biederman
2021-12-13 22:53 ` [PATCH 3/8] signal: Have the oom killer detect coredumps using signal->core_state Eric W. Biederman
2021-12-13 22:53 ` [PATCH 4/8] signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process Eric W. Biederman
2021-12-13 22:53 ` [PATCH 5/8] signal: Remove SIGNAL_GROUP_COREDUMP Eric W. Biederman
2021-12-13 22:53 ` [PATCH 6/8] coredump: Stop setting signal->group_exit_task Eric W. Biederman
2021-12-13 22:53 ` [PATCH 7/8] signal: Rename group_exit_task group_exec_task Eric W. Biederman
2021-12-13 22:53 ` [PATCH 8/8] signal: Remove the helper signal_group_exit Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877dbfmq4m.fsf@email.froward.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=digetx@gmail.com \
    --cc=keescook@chromium.org \
    --cc=legion@kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=me@kylehuey.com \
    --cc=oleg@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).