linux-arch.vger.kernel.org archive mirror
From: ebiederm@xmission.com (Eric W. Biederman)
To: Enke Chen <enkechen@cisco.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Khalid Aziz <khalid.aziz@oracle.com>,
	Kate Stewart <kstewart@linuxfoundation.org>,
	Helge Deller <deller@gmx.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christian Brauner <christian@brauner.io>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Dave Martin <Dave.Martin@arm.com>,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	Michal Hocko <mhocko@kernel.org>, Rik van Riel <riel@surriel.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Roman Gushchin <guro@fb.com>,
	Marcos Paulo de Souza <marcos.souza.org@gmail.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Dominik Brodowski <linux@dominikbrodowski.net>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Yang Shi <yang.shi@linux.alibaba.com>,
	Jann Horn <jannh@google.com>, Kees Cook <keescook@chromium.org>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	linux-arch@vger.kernel.org,
	"Victor Kamensky (kamensky)" <kamensky@cisco.com>,
	xe-linux-external@cisco.com, Stefan Strogin <sstrogin@cisco.com>
Subject: Re: [PATCH v2] kernel/signal: Signal-based pre-coredump notification
Date: Thu, 25 Oct 2018 07:23:07 -0500
Message-ID: <87zhv2md04.fsf@xmission.com>
In-Reply-To: <df08c74f-1027-c635-fb99-a12d23f60947@cisco.com> (Enke Chen's message of "Wed, 24 Oct 2018 16:50:26 -0700")

Enke Chen <enkechen@cisco.com> writes:

> Hi, Eric:
>
> Thanks for your comments. Please see my replies inline.
>
> On 10/24/18 6:29 AM, Eric W. Biederman wrote:
>> Enke Chen <enkechen@cisco.com> writes:
>> 
>>> For simplicity and consistency, this patch provides an implementation
>>> for signal-based fault notification prior to the coredump of a child
>>> process. A new prctl command, PR_SET_PREDUMP_SIG, is defined that can
>>> be used by an application to express its interest and to specify the
>>> signal (SIGCHLD or SIGUSR1 or SIGUSR2) for such a notification. A new
>>> signal code (si_code), CLD_PREDUMP, is also defined for SIGCHLD.
>>>
>>> Changes to prctl(2):
>>>
>>>    PR_SET_PREDUMP_SIG (since Linux 4.20.x)
>>>           Set the child pre-coredump signal of the calling process to
>>>           arg2 (either SIGUSR1, or SIGUSR2, or SIGCHLD, or 0 to clear).
>>>           This is the signal that the calling process will get prior to
>>>           the coredump of a child process. This value is cleared across
>>>           execve(2), or for the child of a fork(2).
>>>
>>>           When SIGCHLD is specified, the signal code will be set to
>>>           CLD_PREDUMP in such a SIGCHLD signal.
>> 
>> Your signal handling is still not right.  Please read and comprehend
>> siginfo_layout.
>> 
>> You have not filled in all of the required fields for the SIGCHLD case.
>> For the non SIGCHLD case you are using si_code == 0 == SI_USER which is
>> very wrong.  This is not a user generated signal.
>> 
>> Let me say this slowly.  The pair si_signo si_code determines the union
>> member of struct siginfo.  That needs to be handled consistently. You
>> aren't.  I just finished fixing this up in the entire kernel and now you
>> are trying to add a usage that is worse than most of the bugs I have
>> fixed.  I really don't appreciate having to deal with new bugs.
>> 
>
> My apologies. I will investigate and make them consistent.
>
>> 
>> 
>> Further siginfo can be dropped.  Multiple signals with the same signal
>> number can be consolidated.  What is your plan for dealing with that?
>
> The primary application for the early notification involves a process
> manager which is responsible for re-spawning processes or initiating
> the control-plane fail-over. There are two models:
>
> One model is to have 1:1 relationship between a process manager and
> application process. There can only be one predump-signal (say, SIGUSR1)
> from the child to the parent, which is unlikely to be dropped or consolidated.
>
> Another model is to have 1:N where there is only one process manager with
> multiple application processes. One of the RT signals can be used to help
> make it more reliable.

Which suggests you want one of the negative si_codes, and to use the _rt
siginfo member like sigqueue.
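
To make the sigqueue comparison concrete, here is a minimal userspace
sketch, not the patch's code.  It assumes Linux/glibc; the helper name
queue_and_drain() and the payload values are made up for illustration.
It queues a signal to the calling process with sigqueue(3) and collects
it synchronously, checking that si_code is SI_QUEUE (one of the negative
codes) and that the payload arrives in si_value, which is the _rt layout.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/*
 * Queue `count` instances of `signo` to ourselves with sigqueue(3), then
 * drain whatever is pending.  Returns how many instances arrived with
 * si_code == SI_QUEUE and the expected payload, or -1 on error.
 * (Hypothetical helper, for illustration only.)
 */
int queue_and_drain(int signo, int count)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, signo);
    /* Block the signal so every instance stays pending until collected. */
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    for (int i = 0; i < count; i++) {
        union sigval v = { .sival_int = 100 + i };  /* illustrative payload */

        if (sigqueue(getpid(), signo, v) != 0)
            return -1;
    }

    int received = 0;
    const struct timespec no_wait = { 0, 0 };
    siginfo_t info;

    /* A zero timeout makes sigtimedwait() fail once nothing is pending. */
    while (sigtimedwait(&set, &info, &no_wait) == signo) {
        if (info.si_code == SI_QUEUE &&
            info.si_value.sival_int == 100 + received)
            received++;
    }
    return received;
}
```

Called as queue_and_drain(SIGRTMIN, 3), every queued instance should come
back, since real-time signals are queued.  With SIGUSR1 only the first
instance is expected to survive, which is the consolidation problem
raised above.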

>> Other code paths pair with wait to get the information out.  There
>> is no equivalent of wait in your code.
>
> I was not aware of that before.  Let me investigate.
>
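
For readers unfamiliar with the pairing mentioned above, here is a rough
sketch of how a parent normally gets the details out with waitid(2)
rather than from the signal itself.  It assumes Linux/glibc and is not
taken from the patch; reap_aborting_child() is a made-up name.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that aborts, then collect its fate with waitid(2).
 * Returns 0 and fills *info on success, -1 on error.
 * (Hypothetical helper, for illustration only.)
 */
int reap_aborting_child(siginfo_t *info)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Ask for no core file so the demo leaves nothing behind. */
        struct rlimit no_core = { 0, 0 };

        setrlimit(RLIMIT_CORE, &no_core);
        abort();
    }

    /* Retry if a signal interrupts the wait. */
    while (waitid(P_PID, (id_t)pid, info, WEXITED) != 0) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}
```

After a successful call, info->si_code is expected to be CLD_KILLED or
CLD_DUMPED and info->si_status to be SIGABRT.  The SIGCHLD signal only
says "something happened"; the wait call is what reliably carries the
information.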
>> 
>> Signals can be delayed by quite a bit, scheduling delays etc.  They can
>> not provide any meaningful kind of real time notification.
>> 
>
> The timing requirement is about 50-100 msecs for BFD.  Not sure if that
> qualifies as "real time".  This mechanism has worked well in deployment
> over the years.

It would help if those numbers were put into the patch description so
people can tell if the mechanism is quick enough.

>> So between delays and loss of information signals appear to be a very
>> poor fit for this usecase.
>> 
>> I am concerned about code that does not fit the usecase well because
>> such code winds up as code that no one cares about that must be
>> maintained indefinitely, because somewhere out there there is one use
>> that would break if the interface was removed.  This does not feel like
>> an interface people will want to use and maintain in proper working
>> order forever.
>> 
>> Ugh.  Your test case is even using signalfd.  So you don't even want
>> this signal to be delivered as a signal.
>
> I actually tested sigaction()/waitpid() as well. If there is a preference,
> I can check in the sigaction()/waitpid() version instead.
>
>> 
>> You add an interface that takes a pointer and you don't add a compat
>> interface.  See Oleg's point of just returning the signal number in the
>> return code.
>
> This is what Oleg said "but I won't insist, this is subjective and cosmetic".
>
> It is no big deal either way. It just seems less work if we do not keep
> adding exceptions to the prctl(2) manpage:
>  
> prctl(2):
>
>        On success, PR_GET_DUMPABLE, PR_GET_KEEPCAPS, PR_GET_NO_NEW_PRIVS,
>        PR_CAPBSET_READ, PR_GET_TIMING, PR_GET_SECUREBITS, PR_MCE_KILL_GET,
>        PR_CAP_AMBIENT+PR_CAP_AMBIENT_IS_SET, and (if it returns)
>        PR_GET_SECCOMP return the nonnegative values described above.
>        All other option values return 0 on success.  On error, -1 is
>        returned, and errno is set appropriately.

More work in the man page versus less work in the kernel, and less code
to maintain.  I will vote for more work in the man page.
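
The two styles already exist side by side in prctl(2), so a small
userspace sketch may help show the difference.  It uses existing options
only (PR_GET_DUMPABLE and PR_SET/GET_PDEATHSIG), not the proposed
PR_SET_PREDUMP_SIG, and the wrapper names are made up for illustration.

```c
#include <signal.h>
#include <sys/prctl.h>

/* Return-code style: PR_GET_DUMPABLE hands the value back directly. */
int get_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);
}

/* Set the parent-death signal; 0 clears it. */
int set_pdeathsig(int sig)
{
    return prctl(PR_SET_PDEATHSIG, (unsigned long)sig, 0UL, 0UL, 0UL);
}

/*
 * Pointer style: PR_GET_PDEATHSIG stores the signal number through an
 * int pointer passed as arg2 and returns 0 on success.
 */
int get_pdeathsig(void)
{
    int sig = -1;

    if (prctl(PR_GET_PDEATHSIG, (unsigned long)&sig, 0UL, 0UL, 0UL) != 0)
        return -1;
    return sig;
}
```

With the return-code style the caller needs no output buffer, and the
kernel side has no user pointer to validate or copy through.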

>> Now I am wondering how well prctl works from a 32bit process on a 64bit
>> kernel.  At first glance it looks like it probably does not work.
>>
>
> I am not sure which part would be problematic.

32bit pointers need to be translated into 64bit pointers if the system
call does not zero extend them.  Structure sizes can differ as well.

I think prctl is just inside the line where problems happen but it is so
close to the line of structure size differences that it makes me
nervous.  Typically pointers in structures are what cause system calls
to cross that line.
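
As a rough illustration of that line, the sketch below models a made-up
argument block in both layouts.  The struct and function names are
hypothetical and fixed-width integers stand in for user pointers; nothing
here is an actual prctl structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument block as a 64-bit kernel would lay it out. */
struct demo_arg_native64 {
    int32_t  signo;
    uint64_t buf;       /* stands in for a 64-bit user pointer */
};

/* The same block as a 32-bit process would lay it out. */
struct demo_arg_compat32 {
    int32_t  signo;
    uint32_t buf;       /* stands in for a 32-bit user pointer */
};

/*
 * Zero-extend a 32-bit user pointer value to 64 bits.  In the kernel a
 * compat entry point does this with compat_ptr(); this function only
 * mimics the widening.
 */
uint64_t widen_user_pointer(uint32_t p32)
{
    return (uint64_t)p32;
}
```

The two structs differ in size, so a 64-bit kernel cannot simply copy the
32-bit caller's block.  A bare "int *" output argument does not have that
problem, since int is 4 bytes in both ABIs, which may be why prctl stays
just on the safe side.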

>> Consistency with PDEATHSIG is not a good argument for anything.
>> PDEATHSIG at the present time is unusable in the real world by most
>> applications that want something like it.
>
> Agreed, PDEATHSIG seems to have a few issues ...
>
>> 
>> So far I see an interface that even you don't want to use as designed,
>> that is implemented incorrectly.
>> 
>> The concern is real and deserves to be addressed.  I don't think signals
>> are the right way to handle it, and certainly not this patch as it
>> stands.
>
> I will address your concerns on the patch. Regarding the requirement and the
> overall solution, if there are specific questions that I have not answered,
> please let me know.

So far so good.

Eric

