From: ebiederm@xmission.com (Eric W. Biederman)
To: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Dmitry Vyukov <dvyukov@google.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
jolsa@redhat.com, Namhyung Kim <namhyung@kernel.org>,
luca abeni <luca.abeni@santannapisa.it>,
syzkaller <syzkaller@googlegroups.com>
Subject: Re: [RFC][PATCH] signal: Store pending signal exit in tsk.jobctl not in tsk.pending
Date: Wed, 06 Feb 2019 16:25:46 -0600 [thread overview]
Message-ID: <87imxwv9jp.fsf@xmission.com> (raw)
In-Reply-To: <20190206180754.GA23476@redhat.com> (Oleg Nesterov's message of "Wed, 6 Feb 2019 19:07:54 +0100")
Oleg Nesterov <oleg@redhat.com> writes:
> Eric, at al,
>
> Sorry, I am on on vacation, can't even read this thread right now,
> so I am not sure I understand the problem correctly...
That is fair. Please don't let me mess up your vacation.
The problem is an infinite stream of SIGHUP from a timer, making
processes unkillable.
Looking into that I see two bugs.
a) We handle fatal signals (esp SIGKILL) early in complete_signal and
this SIGHUP stream is messing up our handling.
b) We don't properly distinguish between synchronous and asynchrous
signals.
I am only aiming to fix the fatal signal issue with this change.
> On 02/05, Eric W. Biederman wrote:
>>
>> @@ -2393,6 +2393,11 @@ bool get_signal(struct ksignal *ksig)
>> goto relock;
>> }
>>
>> + /* Has this task already been flagged for death? */
>> + ksig->info.si_signo = signr = SIGKILL;
>> + if (current->jobctl & JOBCTL_TASK_EXIT)
>> + goto fatal;
>> +
>
> Can't we simply change, say, next_signal() to return SIGKILL if it is
> pending?
We could. But since this isn't necessarily SIGKILL we are dealing with,
it could just as easily be an unhandled SIGINT, so having SIGKILL in the
per task signal queue could still lead to other problems. I am afraid
that if continue abusing the per task signal queue other bugs of
confusion will occur.
> In any case, I am not sure we need JOBCTL_TASK_EXIT. Can't we rely on
> signal_group_exit() ?
Looking more closely yes I believe we can.
I thought de_thread would be a problem but signal_group_exit handles
that case and the core dump cases just fine.
It is a bit of a surprise but that would make:
> static inline int __fatal_signal_pending(struct task_struct *p)
> {
> return signal_group_exit(p->signal);
> }
I will respin and see if we can the change just depend on
signal_group_exit. One less abstraction to have to keep
in sync sounds more robust in the long run.
Eric
next prev parent reply other threads:[~2019-02-06 22:26 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-02-01 16:48 perf_event_open+clone = unkillable process Dmitry Vyukov
2019-02-01 17:06 ` Dmitry Vyukov
2019-02-02 18:30 ` Jiri Olsa
2019-02-03 15:21 ` Jiri Olsa
2019-02-04 9:27 ` Thomas Gleixner
2019-02-04 9:38 ` Dmitry Vyukov
2019-02-04 17:38 ` Thomas Gleixner
2019-02-05 3:00 ` Eric W. Biederman
2019-02-05 4:27 ` Eric W. Biederman
2019-02-05 6:07 ` Eric W. Biederman
2019-02-05 15:26 ` [RFC][PATCH] signal: Store pending signal exit in tsk.jobctl not in tsk.pending Eric W. Biederman
2019-02-06 12:09 ` Dmitry Vyukov
2019-02-06 21:47 ` Eric W. Biederman
2019-02-06 18:07 ` Oleg Nesterov
2019-02-06 22:25 ` Eric W. Biederman [this message]
2019-02-07 6:42 ` [PATCH 0/2]: Fixing unkillable processes caused by SIGHUP timers Eric W. Biederman
2019-02-07 6:43 ` [PATCH 1/2] signal: Always notice exiting tasks Eric W. Biederman
2019-02-11 14:13 ` Oleg Nesterov
2019-02-12 0:42 ` Eric W. Biederman
2019-02-12 8:18 ` Eric W. Biederman
2019-02-12 16:50 ` Oleg Nesterov
2019-02-13 3:58 ` Eric W. Biederman
2019-02-13 4:09 ` [PATCH] signal: Restore the stop PTRACE_EVENT_EXIT Eric W. Biederman
2019-02-13 13:55 ` Oleg Nesterov
2019-02-13 14:38 ` Oleg Nesterov
2019-02-13 14:58 ` Eric W. Biederman
2019-02-07 6:44 ` [PATCH 2/2] signal: Better detection of synchronous signals Eric W. Biederman
2019-02-11 15:18 ` Oleg Nesterov
2019-02-12 0:01 ` Eric W. Biederman
2019-02-12 17:21 ` Oleg Nesterov
2019-02-07 11:46 ` [PATCH 0/2]: Fixing unkillable processes caused by SIGHUP timers Dmitry Vyukov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87imxwv9jp.fsf@xmission.com \
--to=ebiederm@xmission.com \
--cc=acme@kernel.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=dvyukov@google.com \
--cc=jolsa@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=luca.abeni@santannapisa.it \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=syzkaller@googlegroups.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).