linux-hardening.vger.kernel.org archive mirror
From: "Eric W. Biederman" <ebiederm@xmission.com>
To: "Robert Święcki" <robert@swiecki.net>
Cc: Kees Cook <keescook@chromium.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Will Drewry <wad@chromium.org>,
	linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org
Subject: Re: [PATCH 0/3] signal: HANDLER_EXIT should clear SIGNAL_UNKILLABLE
Date: Fri, 11 Feb 2022 11:46:53 -0600
Message-ID: <87a6ex1ek2.fsf@email.froward.int.ebiederm.org>
In-Reply-To: <CAP145phAg3ZSPJw7x2kKVQe86puy-XyKatVoByVoM27RP4aw_g@mail.gmail.com> ("Robert Święcki"'s message of "Fri, 11 Feb 2022 13:54:26 +0100")

Robert Święcki <robert@swiecki.net> writes:

>> It's mainly about the exit stuff having never been run before on these
>> kinds of process states, so things don't make sense. For example, on the
>> SIGSYS death, the registers have been rewound for the coredump, so when
>> the exit trace runs on x86 it sees the syscall return value as equal to
>> the syscall number (since %rax is used for the syscall number on entry
>> and for the syscall result on exit). So when a tracer watches a seccomp
>> fatal SIGSYS, it sees the syscall exit before it sees the child exit
>> (and therefore the signal). For example, x86_64 write (syscall number
>> 1), will return as if it had written 1 byte. :P
>>
>> So, it's not harmful, but it's confusing and weird. :)
>>
>> > I am trying to figure out if there is a case to be made that it was a
>> > bug that these events were missing.
>>
>> I don't think so -- the syscall did not finish, so there isn't a valid
>> return code. The process exited before it completed.

With the process state rewound for the coredump, it makes sense
that the syscall exit would be meaningless.  So at least for
now I am convinced that a function that clears all syscall_work
flags is the way to go.
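
To make the rewind concrete, here is a toy userspace model of the
x86_64 case Kees describes.  It is only a sketch: struct toy_regs and
the toy_* functions are made-up names for illustration, not kernel
code, and the model assumes the rewind amounts to copying the saved
syscall number back into ax.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the two x86_64 register slots that matter here. */
struct toy_regs {
	long orig_ax;	/* syscall number saved at syscall entry */
	long ax;	/* syscall number on entry, return value on exit */
};

/* Models the rewind before a fatal SIGSYS: restore ax from orig_ax. */
void toy_rollback(struct toy_regs *regs)
{
	assert(regs != NULL);
	regs->ax = regs->orig_ax;
}

/* What an exit trace would report as the syscall "return value". */
long toy_exit_trace_retval(const struct toy_regs *regs)
{
	assert(regs != NULL);
	return regs->ax;
}

/* Walks a killed write(2) (x86_64 syscall number 1) through the model. */
long toy_killed_write_retval(void)
{
	/* -38 (-ENOSYS) is just a placeholder for ax during the syscall. */
	struct toy_regs regs = { .orig_ax = 1, .ax = -38 };

	toy_rollback(&regs);
	return toy_exit_trace_retval(&regs);
}
```

In this model toy_killed_write_retval() yields the syscall number
itself, which is why a tracer would see write "return" 1.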

> A tangential point: please ignore for the purpose of fixing the
> problem at hand. I'm mostly making it, in case it can be taken into
> account in case some bigger changes to this code path are to be made -
> given that it touches the problem of signal delivery.
>
> When I noticed this problem, I was looking for a way to figure out
> what syscall caused SIGSYS (via SECCOMP_RET_KILL_*), and there's no
> easy way to do that programmatically from the perspective of a parent
> process. There are three ways of doing this that come to mind.

Unless I am misunderstanding what you are looking for,
this information is contained within the SIGSYS siginfo.
The field si_syscall contains the system call number and
the field si_errno contains the return data (SECCOMP_RET_DATA)
from the seccomp filter.

All of that can be read from the core dump of the process that exited.
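
For reference, a minimal sketch of pulling those fields out of a
siginfo_t in userspace.  The struct sigsys_info type and the
extract_seccomp_sigsys() helper are hypothetical names for this
example.  A handler would only ever see such a siginfo for
SECCOMP_RET_TRAP; for the kill actions the same fields would have to
come from the siginfo recorded in the core dump.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>

#ifndef SYS_SECCOMP
#define SYS_SECCOMP 1	/* si_code of a seccomp-generated SIGSYS */
#endif

/* Hypothetical container for the seccomp-related siginfo fields. */
struct sigsys_info {
	int syscall_nr;		/* si_syscall: the offending syscall */
	unsigned int arch;	/* si_arch: AUDIT_ARCH_* of that syscall */
	int ret_data;		/* si_errno: SECCOMP_RET_DATA bits */
};

/*
 * Copies the seccomp fields out of a SIGSYS siginfo.
 * Returns 0 on success, -1 if si is not a seccomp-generated SIGSYS.
 */
int extract_seccomp_sigsys(const siginfo_t *si, struct sigsys_info *out)
{
	assert(si != NULL && out != NULL);
	if (si->si_signo != SIGSYS || si->si_code != SYS_SECCOMP)
		return -1;
	out->syscall_nr = si->si_syscall;
	out->arch = si->si_arch;
	out->ret_data = si->si_errno;
	return 0;
}
```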

From a quick look, I don't see a good way to pull that signal
information out of the kernel other than with a coredump.

It might be possible to persuade PTRACE_EVENT_EXIT to give it to you,
but I haven't looked at it enough to see if that would be a sensible
strategy.


> 1). Keep reference to /proc/<child>/syscall and read it upon process
> exiting by SIGSYS (and reading it with wait/id(WNOWAIT) from parent).
> This used to work a long time ago, but was racy (I reported this
> problem many years ago), and currently only -1 0 0 is returned (as in,
> no syscall in progress).

That might be a bug worth fixing.  But it would definitely need a test
that is run regularly to prevent future regressions.
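
As a starting point for such a test, a sketch of parsing the first
field of a line read from /proc/<pid>/syscall.  The helper name
parse_proc_syscall_line() is made up for this example, and it assumes
the documented layout: a decimal syscall number first, -1 when the
task is blocked outside a syscall, or the string "running".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Extracts the leading syscall number from one /proc/<pid>/syscall line.
 * Returns 0 and stores the number in *nr (-1 means "no syscall in
 * progress"); returns -1 if the line does not start with a number,
 * for example "running".
 */
int parse_proc_syscall_line(const char *line, long *nr)
{
	char *end;
	long val;

	assert(line != NULL && nr != NULL);
	errno = 0;
	val = strtol(line, &end, 10);
	if (end == line || errno != 0)
		return -1;
	*nr = val;
	return 0;
}
```

A regression test could read the file for a child killed by
SECCOMP_RET_KILL_* and check that the parsed number is the filtered
syscall rather than -1.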

> 2). Use ptrace - it works but it changes the logic of the signal
> delivery inside a traced process and requires non-trivial code to make
> it work correctly: use of PT_INTERRUPT, understanding all signal
> delivery events, registers and their mapping to syscall arguments per
> CPU arch.

I guess this works because you can see which syscall occurred before the
SECCOMP_RET_KILL.  Except for the bugs we are discussing fixing, there
isn't a ptrace_stop after SECCOMP_RET_KILL.

> 3). auditd will print details of failed syscall to kmsg, but the
> string is not very structured, and auditd might not be always present
> inside kernels. And reading that data via netlink requires root IIRC.

I assume this is the same: you can see which syscall occurred before
the SECCOMP_RET_KILL.

>
> I think it'd be good to have some way of doing it from the perspective
> of a parent process - it'd simplify development of sandboxing managers
> (eg nsjail, minijail, firejail), and creation of good seccomp
> policies.

By development do you mean debugging sandbox managers?  Or do you mean
something that sandbox managers can use on a routine basis?
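
For context, a sketch of what a parent can learn today without ptrace:
waitid() with WNOWAIT reports that the child died of SIGSYS, but not
which syscall triggered it.  Both function names are hypothetical, and
the demo child raises a plain SIGSYS instead of tripping a seccomp
filter, to keep the example small.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Demo helper: fork a child that dies of a plain (non-seccomp) SIGSYS. */
pid_t spawn_sigsys_child(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		signal(SIGSYS, SIG_DFL);
		raise(SIGSYS);
		_exit(0);	/* only reached if SIGSYS did not kill us */
	}
	return pid;
}

/*
 * Waits for the child to exit and peeks at its state without reaping
 * it (WNOWAIT), so the caller can still inspect /proc/<pid>/ afterwards.
 * Returns 1 if the child was killed by SIGSYS, 0 if it ended some
 * other way, -1 if waitid() failed.
 */
int child_killed_by_sigsys(pid_t pid)
{
	siginfo_t info = { 0 };

	assert(pid > 0);
	if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
		return -1;
	/* CLD_DUMPED means killed by a signal and a core was dumped. */
	if (info.si_code != CLD_KILLED && info.si_code != CLD_DUMPED)
		return 0;
	return info.si_status == SIGSYS;
}
```

The child stays a zombie after the peek, so the caller still needs a
normal waitpid() to reap it.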

Eric
