From: Jann Horn <email@example.com>
To: Kees Cook <firstname.lastname@example.org>
Cc: Tycho Andersen <email@example.com>,
    "Michael Kerrisk (man-pages)" <firstname.lastname@example.org>,
    Sargun Dhillon <email@example.com>,
    Christian Brauner <firstname.lastname@example.org>,
    linux-man <email@example.com>,
    lkml <firstname.lastname@example.org>,
    Aleksa Sarai <email@example.com>,
    Alexei Starovoitov <firstname.lastname@example.org>,
    Will Drewry <email@example.com>,
    bpf <firstname.lastname@example.org>,
    Song Liu <email@example.com>,
    Daniel Borkmann <firstname.lastname@example.org>,
    Andy Lutomirski <email@example.com>,
    Linux Containers <firstname.lastname@example.org>,
    Giuseppe Scrivano <email@example.com>,
    Robert Sesek <firstname.lastname@example.org>
Subject: Re: For review: seccomp_user_notif(2) manual page
Date: Mon, 26 Oct 2020 11:31:01 +0100
Message-ID: <CAG48ez2OWhpH3HHUJSrAmokJ8=SVwKrmQMSw0gEbTJmKE4myCw@mail.gmail.com>
In-Reply-To: <CAG48ez2b-fnsp8YAR=H5uRMT4bBTid_hyU4m6KavHxDko1Efog@mail.gmail.com>

On Mon, Oct 26, 2020 at 10:51 AM Jann Horn <email@example.com> wrote:
> On Mon, Oct 26, 2020 at 1:32 AM Kees Cook <firstname.lastname@example.org> wrote:
> > On Thu, Oct 01, 2020 at 03:52:02AM +0200, Jann Horn wrote:
> > > On Thu, Oct 1, 2020 at 1:25 AM Tycho Andersen <email@example.com> wrote:
> > > > On Thu, Oct 01, 2020 at 01:11:33AM +0200, Jann Horn wrote:
> > > > > On Thu, Oct 1, 2020 at 1:03 AM Tycho Andersen <firstname.lastname@example.org> wrote:
> > > > > > On Wed, Sep 30, 2020 at 10:34:51PM +0200, Michael Kerrisk (man-pages) wrote:
> > > > > > > On 9/30/20 5:03 PM, Tycho Andersen wrote:
> > > > > > > > On Wed, Sep 30, 2020 at 01:07:38PM +0200, Michael Kerrisk (man-pages) wrote:
> > > > > > > >> ┌─────────────────────────────────────────────────────┐
> > > > > > > >> │FIXME                                                │
> > > > > > > >> ├─────────────────────────────────────────────────────┤
> > > > > > > >> │From my experiments, it appears that if a            │
> > > > > > > >> │SECCOMP_IOCTL_NOTIF_RECV is done after the target    │
> > > > > > > >> │process terminates, then the ioctl() simply blocks   │
> > > > > > > >> │(rather than returning an error to indicate that the │
> > > > > > > >> │target process no longer exists).                    │
> > > > > > > >
> > > > > > > > Yeah, I think Christian wanted to fix this at some point,
> > > > > > >
> > > > > > > Do you have a pointer that discussion? I could not find it with a
> > > > > > > quick search.
> > > > > > >
> > > > > > > > but it's a
> > > > > > > > bit sticky to do.
> > > > > > >
> > > > > > > Can you say a few words about the nature of the problem?
> > > > > >
> > > > > > I remembered wrong, it's actually in the tree: 99cdb8b9a573 ("seccomp:
> > > > > > notify about unused filter"). So maybe there's a bug here?
> > > > >
> > > > > That thing only notifies on ->poll, it doesn't unblock ioctls; and
> > > > > Michael's sample code uses SECCOMP_IOCTL_NOTIF_RECV to wait. So that
> > > > > commit doesn't have any effect on this kind of usage.
> > > >
> > > > Yes, thanks. And the ones stuck in RECV are waiting on a semaphore so
> > > > we don't have a count of all of them, unfortunately.
> > > >
> > > > We could maybe look inside the wait_list, but that will probably make
> > > > people angry :)
> > >
> > > The easiest way would probably be to open-code the semaphore-ish part,
> > > and let the semaphore and poll share the waitqueue. The current code
> > > kind of mirrors the semaphore's waitqueue in the wqh - open-coding the
> > > entire semaphore would IMO be cleaner than that. And it's not like
> > > semaphore semantics are even a good fit for this code anyway.
> > >
> > > Let's see... if we didn't have the existing UAPI to worry about, I'd
> > > do it as follows (*completely* untested). That way, the ioctl would
> > > block exactly until either there actually is a request to deliver or
> > > there are no more users of the filter. The problem is that if we just
> > > apply this patch, existing users of SECCOMP_IOCTL_NOTIF_RECV that use
> > > an event loop and don't set O_NONBLOCK will be screwed. So we'd
> >
> > Wait, why? Do you mean a ioctl calling loop (rather than a poll event
> > loop)?
>
> No, I'm talking about poll event loops.
>
> > I think poll would be fine, but a "try calling RECV and expect to
> > return ENOENT" loop would change. But I don't think anyone would do this
> > exactly because it _currently_ acts like O_NONBLOCK, yes?
> >
> > > probably also have to add some stupid counter in place of the
> > > semaphore's counter that we can use to preserve the old behavior of
> > > returning -ENOENT once for each cancelled request. :(
> >
> > I only see this in Debian Code Search:
> > https://sources.debian.org/src/crun/0.15+dfsg-1/src/libcrun/seccomp_notify.c/?hl=166#L166
> > which is using epoll_wait():
> > https://sources.debian.org/src/crun/0.15+dfsg-1/src/libcrun/container.c/?hl=1326#L1326
> >
> > I expect LXC is using it. :)
>
> The problem is the scenario where a process is interrupted while it's
> waiting for the supervisor to reply.
>
> Consider the following scenario (with supervisor "S" and target "T"; S
> wants to wait for events on two file descriptors seccomp_fd and
> other_fd):
>
> S: starts poll() to wait for events on seccomp_fd and other_fd
> T: performs a syscall that's filtered with RET_USER_NOTIF
> S: poll() returns and signals readiness of seccomp_fd
> T: receives signal SIGUSR1
> T: syscall aborts, enters signal handler
> T: signal handler blocks on unfiltered syscall (e.g. write())
> S: starts SECCOMP_IOCTL_NOTIF_RECV
> S: blocks because no syscalls are pending
>
> Depending on what other_fd is, this could in a worst case even lead to
> a deadlock (if e.g. the signal handler wants to write to stdout, but
> the stdout fd is hooked up to other_fd in the supervisor, but the
> supervisor can't consume the data written because it's stuck in
> seccomp handling).
>
> So we have to ensure that when existing code (like that crun code you
> linked to) triggers this case, SECCOMP_IOCTL_NOTIF_RECV returns
> immediately instead of blocking.

Or I guess we could also just set O_NONBLOCK on the fd by default?
Since the one existing user is eventloop-based...