All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <christian@brauner.io>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Oleg Nesterov <oleg@redhat.com>, Arnd Bergmann <arnd@arndb.de>,
	Kees Cook <keescook@chromium.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Thomas Gleixner <tglx@linutronix.de>, Tejun Heo <tj@kernel.org>,
	David Howells <dhowells@redhat.com>, Jann Horn <jannh@google.com>,
	Andrew Lutomirski <luto@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Aleksa Sarai <cyphar@cyphar.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Android Kernel Team <kernel-team@android.com>
Subject: Re: [RFC][PATCH 1/5] exit: kill struct waitid_info
Date: Fri, 26 Jul 2019 14:37:39 +0200	[thread overview]
Message-ID: <20190726123737.axwqx6zeeza5dmpb@brauner.io> (raw)
In-Reply-To: <87wog5qa50.fsf@xmission.com>

On Fri, Jul 26, 2019 at 06:59:39AM -0500, Eric W. Biederman wrote:
> Christian Brauner <christian@brauner.io> writes:
> 
> > On Thu, Jul 25, 2019 at 07:46:50AM -0500, Eric W. Biederman wrote:
> >> Linus Torvalds <torvalds@linux-foundation.org> writes:
> >> 
> >> > On Wed, Jul 24, 2019 at 7:47 AM Christian Brauner <christian@brauner.io> wrote:
> >> >>
> >> >> The code here uses a struct waitid_info to catch basic information about
> >> >> process exit including the pid, uid, status, and signal that caused the
> >> >> process to exit. This information is then stuffed into a struct siginfo
> >> >> for the waitid() syscall. That seems like an odd thing to do. We can
> >> >> just pass down a siginfo_t struct directly which let's us cleanup and
> >> >> simplify the whole code quite a bit.
> >> >
> >> > Ack. Except I'd like the commit message to explain where this comes
> >> > from instead of that "That seems like an odd thing to do".
> >> >
> >> > The _original_ reason for "struct waitid_info" was that "siginfo_t" is
> >> > huge because of all the insane padding that various architectures do.
> >> >
> >> > So it was introduced by commit 67d7ddded322 ("waitid(2): leave copyout
> >> > of siginfo to syscall itself") very much to avoid stack usage issues.
> >> > And I quote:
> >> >
> >> >     collect the information needed for siginfo into
> >> >     a small structure (waitid_info)
> >> >
> >> > simply because "sigset_t" was big.
> >> >
> >> > But that size came from the explicit "pad it out to 128 bytes for
> >> > future expansion that will never happen", and the kernel using the
> >> > same exact sigset_t that was exposed to user space.
> >> >
> >> > Then in commit 4ce5f9c9e754 ("signal: Use a smaller struct siginfo in
> >> > the kernel") we got rid of the insane padding for in-kernel use,
> >> > exactly because it causes issues like this.
> >> >
> >> > End result: that "struct waitid_info" no longer makes sense. It's not
> >> > appreciably smaller than kernel_siginfo_t is today, but it made sense
> >> > at the time.
> >> 
> >> Apologies.  I meant to reply yesterday but I was preempted by baby
> >> issues.
> >> 
> >> I strongly disagree that this direction makes sense.  The largest
> >> value that I see from struct waitid_info is that it makes it possible to
> >> reason about which values are returned where struct kernel_siginfo does
> >> not.
> >> 
> >> One of the details the existence of struct waitid_info makes clear is
> >> that unlike the related child death path the wait code does not
> >> fillin si_utime and si_stime.  Which is very important to know when you
> >> are dealing with y2038 issues and Arnd Bergmann is.
> >> 
> >> The most egregious example I know of using siginfo wrong is:
> >> 70f1b0d34bdf ("signal/usb: Replace kill_pid_info_as_cred with
> >> kill_pid_usb_asyncio").  But just by moving struct siginfo out of the
> >> program logic and into dedicated little functions that just deal with
> >> the craziness of struct siginfo I have found lots of little bugs.
> >> 
> >> We don't need that kind of invitation to bugs in the wait logic.
> >
> > I don't think it's a strong enough argument for rejecting this change.
> > Suspecting that something might go wrong if we simplify something is a
> > valid call to proceed with caution and be on the lookout for potential
> > regressions so we can react fast. I respect that. But it's not
> > necessarily a good argument to reject a change.
> 
> Except your change was not a simplification.   Your change was
> a substitution to do the work of filling in struct kernel_siginfo in 4
> locations instead of just 2.
> 
> The only simplification came from not using unsafe_put_user.  Which is
> valid but has nothing to do with struct waitid_info.
> 
> > I'm happy to switch from an initializer (which is not even clear is a
> > bug) to using clear_siginfo().
> 
> I just double checked the definitions in signal_types.h and
> uapi/asm-generic/siginfo.h and there is definitely padding on 64bit.
> So yes barring magic compiler plug ins it is a bug.
> 
> > And I'm also going to split this patch out of the P_PIDFD patch but I'm
> > going to send this out again. I haven't heard a sound argument why
> > this
> > patch is worse than what we have right now in there.
> 
> Then I am afraid I have not expressed myself well.
> 
> When I read through this patch I saw.
> - A bug when dealing with struct kernel_siginfo.
> - A substitution from of struct waitid_info to struct kernel_siginfo.
> - An actual simplification in replacing several unsafe_put_user calls
>   with copy_siginfo_to_user.
> - A gratuitous change in change the order of several of the statements.
> - No simplification in the logic of do_wait.

I'm not going to feel bad for trying to improve a piece of code and not
getting it right the first time.
Removal of a custom struct that is only used to copy bits of information
into another struct for which we have a dedicated in-kernel struct to
avoid exactly that seems like a valid use of
s/<custom-struct>/<dedicated-kernel-struct>/ to me; especially since I
needed to touch that file anyway.

> 
> Or in short I saw you did "s/struct waitid_info/struct kernel_siginfo/"
> and introduced bugs.  Further you increased the number of locations that
> we need to be very careful with struct kernel_siginfo from 2 to 4.

If you're concerned about siginfo_t being used in more places than it is
right now, then please put a comment above it that it should be avoided
and that because of architectural nuances clear_signfo() needs to be
used to initialize it correctly.

Christian

  reply	other threads:[~2019-07-26 12:37 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-24 14:46 [PATCH 0/5] pidfd: waiting on processes through pidfds Christian Brauner
2019-07-24 14:46 ` [RFC][PATCH 1/5] exit: kill struct waitid_info Christian Brauner
2019-07-24 17:37   ` Linus Torvalds
2019-07-24 22:01     ` Christian Brauner
2019-07-25 12:46     ` Eric W. Biederman
2019-07-26  8:01       ` Christian Brauner
2019-07-26 11:59         ` Eric W. Biederman
2019-07-26 12:37           ` Christian Brauner [this message]
2019-07-25  9:40   ` Oleg Nesterov
2019-07-25 10:07     ` Christian Brauner
2019-07-24 14:46 ` [PATCH 2/5] pidfd: add pidfd_wait() Christian Brauner
2019-07-24 17:45   ` Linus Torvalds
2019-07-24 17:50     ` Christian Brauner
2019-07-24 17:52       ` Christian Brauner
2019-07-25 14:34     ` Eric W. Biederman
2019-07-25 10:16   ` Oleg Nesterov
2019-07-25 10:21     ` Christian Brauner
2019-07-26  8:19   ` Arnd Bergmann
2019-07-26  8:24     ` Christian Brauner
2019-07-26  9:57       ` Arnd Bergmann
2019-07-24 14:46 ` [PATCH 3/5] arch: wire-up pidfd_wait() Christian Brauner
2019-07-24 14:46   ` Christian Brauner
2019-07-24 14:46   ` Christian Brauner
2019-07-24 14:46   ` Christian Brauner
2019-07-24 14:46 ` [PATCH 4/5] pidfd: add CLONE_WAIT_PID Christian Brauner
2019-07-24 18:14   ` Jann Horn
2019-07-24 18:27     ` Christian Brauner
2019-07-24 19:07       ` Jann Horn
2019-07-24 19:10         ` Christian Brauner
2019-07-25 10:11           ` Christian Brauner
2019-07-25 10:30         ` Oleg Nesterov
2019-07-25 10:36           ` Christian Brauner
2019-07-25 10:35   ` Oleg Nesterov
2019-07-25 10:40     ` Christian Brauner
2019-07-25 11:25       ` Oleg Nesterov
2019-07-25 11:41         ` Christian Brauner
2019-07-25 11:43         ` Oleg Nesterov
2019-07-25 12:26           ` Christian Brauner
2019-07-25 16:13             ` Oleg Nesterov
2019-07-25 16:56               ` Christian Brauner
2019-07-25 16:57             ` Eric W. Biederman
2019-07-24 14:46 ` [PATCH 5/5] pidfd: add pidfd_wait tests Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190726123737.axwqx6zeeza5dmpb@brauner.io \
    --to=christian@brauner.io \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=cyphar@cyphar.com \
    --cc=dhowells@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=jannh@google.com \
    --cc=joel@joelfernandes.org \
    --cc=keescook@chromium.org \
    --cc=kernel-team@android.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=oleg@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.