From: Dmitry Osipenko <digetx@gmail.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-api@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Alexey Gladkov <legion@kernel.org>, Kyle Huey <me@kylehuey.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Kees Cook <keescook@chromium.org>,
	Al Viro <viro@ZenIV.linux.org.uk>
Subject: Re: [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit special case
Date: Thu, 6 Jan 2022 00:39:10 +0300
Message-ID: <5bbb54c4-7504-cd28-5dde-4e5965496625@gmail.com>
In-Reply-To: <87pmp67y4r.fsf@email.froward.int.ebiederm.org>

05.01.2022 22:58, Eric W. Biederman wrote:
> Dmitry Osipenko <digetx@gmail.com> writes:
> 
>> 14.12.2021 01:53, Eric W. Biederman wrote:
>>> Simplify the code that allows SIGKILL during coredumps to terminate
>>> the coredump.  As far as I can tell I have avoided breaking it
>>> by dumb luck.
>>>
>>> Historically with all of the other threads stopping in exit_mm the
>>> wants_signal loop in complete_signal would find the dumper task and
>>> then complete_signal would wake the dumper task with signal_wake_up.
>>>
>>> After moving the coredump_task_exit above the setting of PF_EXITING in
>>> commit 92307383082d ("coredump: Don't perform any cleanups before
>>> dumping core") wants_signal will consider all of the threads in a
>>> multi-threaded process for waking up, not just the core dumping task.
>>>
>>> Luckily complete_signal short circuits SIGKILL during a coredump and marks
>>> every thread with SIGKILL and signal_wake_up.  This code is arguably
>>> buggy, however, as it tries to skip creating a group exit when one is
>>> already present, and it fails to notice that a coredump is in progress.
>>>
>>> Ever since commit 06af8679449d ("coredump: Limit what can interrupt
>>> coredumps") was added dump_interrupted needs not just TIF_SIGPENDING
>>> set on the dumper task but also SIGKILL set in its pending bitmap.
>>> This means that if the code is ever fixed not to short-circuit and
>>> kill a process after it has already been killed the special case
>>> for SIGKILL during a coredump will be broken.
>>>
>>> Sort all of this out by making the coredump special case more special,
>>> and perform all of the work in prepare_signal and leave the rest of
>>> the signal delivery path out of it.
>>>
>>> In prepare_signal, when a process that is coredumping is sent SIGKILL,
>>> find the task performing the coredump and use sigaddset and
>>> signal_wake_up to ensure that task reports fatal_signal_pending.
>>>
>>> Return false from prepare_signal to tell the rest of the signal
>>> delivery path to ignore the signal.
>>>
>>> Update wait_for_dump_helpers to perform a wait_event_killable wait
>>> so that if signal_pending gets set spuriously the wait will not
>>> be interrupted unless fatal_signal_pending is true.
>>>
>>> I have tested this and verified I did not break SIGKILL during
>>> coredumps by accident (before or after this change).  I actually
>>> thought I had and I had to figure out what I had misread that kept
>>> SIGKILL during coredumps working.
>>>
>>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>> ---
>>>  fs/coredump.c   |  4 ++--
>>>  kernel/signal.c | 11 +++++++++--
>>>  2 files changed, 11 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/coredump.c b/fs/coredump.c
>>> index a6b3c196cdef..7b91fb32dbb8 100644
>>> --- a/fs/coredump.c
>>> +++ b/fs/coredump.c
>>> @@ -448,7 +448,7 @@ static void coredump_finish(bool core_dumped)
>>>  static bool dump_interrupted(void)
>>>  {
>>>  	/*
>>> -	 * SIGKILL or freezing() interrupt the coredumping. Perhaps we
>>> +	 * SIGKILL or freezing() interrupted the coredumping. Perhaps we
>>>  	 * can do try_to_freeze() and check __fatal_signal_pending(),
>>>  	 * but then we need to teach dump_write() to restart and clear
>>>  	 * TIF_SIGPENDING.
>>> @@ -471,7 +471,7 @@ static void wait_for_dump_helpers(struct file *file)
>>>  	 * We actually want wait_event_freezable() but then we need
>>>  	 * to clear TIF_SIGPENDING and improve dump_interrupted().
>>>  	 */
>>> -	wait_event_interruptible(pipe->rd_wait, pipe->readers == 1);
>>> +	wait_event_killable(pipe->rd_wait, pipe->readers == 1);
>>>  
>>>  	pipe_lock(pipe);
>>>  	pipe->readers--;
>>> diff --git a/kernel/signal.c b/kernel/signal.c
>>> index 8272cac5f429..7e305a8ec7c2 100644
>>> --- a/kernel/signal.c
>>> +++ b/kernel/signal.c
>>> @@ -907,8 +907,15 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
>>>  	sigset_t flush;
>>>  
>>>  	if (signal->flags & (SIGNAL_GROUP_EXIT | SIGNAL_GROUP_COREDUMP)) {
>>> -		if (!(signal->flags & SIGNAL_GROUP_EXIT))
>>> -			return sig == SIGKILL;
>>> +		struct core_state *core_state = signal->core_state;
>>> +		if (core_state) {
>>> +			if (sig == SIGKILL) {
>>> +				struct task_struct *dumper = core_state->dumper.task;
>>> +				sigaddset(&dumper->pending.signal, SIGKILL);
>>> +				signal_wake_up(dumper, 1);
>>> +			}
>>> +			return false;
>>> +		}
>>>  		/*
>>>  		 * The process is in the middle of dying, nothing to do.
>>>  		 */
>>>
>>
>> Hi,
>>
>> This patch breaks userspace, in particular it breaks gst-plugin-scanner
>> of GStreamer, which now hangs on next-20211224. IIUC, this tool builds a
>> registry of good/working GStreamer plugins by loading them and
>> blacklisting those that don't work (crash). Before the hang I see a
>> systemd-coredump process running, taking a snapshot of gst-plugin-scanner,
>> and then gst-plugin-scanner gets stuck.
>>
>> Bisection points at this patch; reverting it restores
>> gst-plugin-scanner. Systemd-coredump is still running, but there is no hang
>> anymore and everything works properly as before.
>>
>> I'm seeing this problem on ARM32 and haven't checked other arches.
>> Please fix, thanks in advance.
> 
> 
> I have not been able to figure out how to run gst-plugin-scanner in
> a way that triggers this yet.  In truth I can't figure out how to
> run gst-plugin-scanner in a useful way.
> 
> I am going to set up some unit tests and see if I can reproduce your
> hang another way, but if you could give me some more information on what
> you are doing to trigger this I would appreciate it.

Thanks, Eric. The distro is Arch Linux, but it's a development
environment where I'm running the latest GStreamer from git master. I'll try
to figure out the reproduction steps and get back to you.
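
A minimal sketch of a stand-alone test that might exercise this path. It
assumes the hang needs a SIGKILL to arrive while a crashed child is still
dumping core; that is only a guess, since the actual trigger inside
gst-plugin-scanner has not been identified in this thread. The names
crash_kill_and_reap() and sleep_ms() are hypothetical helpers, not taken
from GStreamer or the kernel. The parent forks a child that calls abort(),
sends SIGKILL after a short delay, and then polls waitpid() with a timeout
so that a stuck child is reported instead of blocking the test forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
	struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

	nanosleep(&ts, NULL);
}

/*
 * Fork a child that crashes with SIGABRT, send it SIGKILL after
 * kill_delay_ms, then wait up to timeout_ms for it to be reaped.
 *
 * Returns the terminating signal number (0 if the child exited
 * normally), -1 on a fork/waitpid error, or -2 if the child was not
 * reaped within the timeout, which would hint at a hang.
 */
int crash_kill_and_reap(long kill_delay_ms, long timeout_ms)
{
	pid_t pid = fork();
	int status;
	long waited;

	if (pid < 0)
		return -1;
	if (pid == 0)
		abort();	/* child: default action for SIGABRT dumps core */

	sleep_ms(kill_delay_ms);
	kill(pid, SIGKILL);	/* harmless if the child is already a zombie */

	for (waited = 0; waited <= timeout_ms; waited += 10) {
		pid_t r = waitpid(pid, &status, WNOHANG);

		if (r == pid)
			return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
		if (r < 0)
			return -1;
		sleep_ms(10);
	}
	return -2;	/* child still not reaped */
}
```

Calling crash_kill_and_reap(1, 5000) from main() should return SIGABRT or
SIGKILL depending on which signal wins the race; a return of -2 would mean
the child was not reaped within five seconds. Whether SIGKILL lands inside
the coredump window depends on timing and on how long the core_pattern
helper takes, so the delay may need tuning.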
