From: Christian Brauner <brauner@kernel.org>
To: Arnd Bergmann <arnd@arndb.de>
Cc: Huacai Chen <chenhuacai@gmail.com>,
	Huacai Chen <chenhuacai@loongson.cn>,
	Andy Lutomirski <luto@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Airlie <airlied@linux.ie>, Jonathan Corbet <corbet@lwn.net>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Xuefeng Li <lixuefeng@loongson.cn>,
	Yanteng Si <siyanteng@loongson.cn>, Guo Ren <guoren@kernel.org>,
	Xuerui Wang <kernel@xen0n.name>,
	Jiaxun Yang <jiaxun.yang@flygoat.com>,
	Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH V9 13/24] LoongArch: Add system call support
Date: Mon, 9 May 2022 12:00:58 +0200
Message-ID: <20220509100058.vmrgn5fkk3ayt63v@wittgenstein>
In-Reply-To: <20220507121104.7soocpgoqkvwv3gc@wittgenstein>

On Sat, May 07, 2022 at 02:11:04PM +0200, Christian Brauner wrote:
> On Sat, Apr 30, 2022 at 12:34:52PM +0200, Arnd Bergmann wrote:
> > On Sat, Apr 30, 2022 at 12:05 PM Huacai Chen <chenhuacai@gmail.com> wrote:
> > > On Sat, Apr 30, 2022 at 5:45 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > > On Sat, Apr 30, 2022 at 11:05 AM Huacai Chen <chenhuacai@loongson.cn> wrote:
> > > > >
> > > > > This patch adds system call support and related uaccess.h for LoongArch.
> > > > >
> > > > > Q: Why keep __ARCH_WANT_NEW_STAT definition while there is statx:
> > > > > A: Until the latest glibc release (2.34), statx is only used for 32-bit
> > > > >    platforms, or 64-bit platforms with 32-bit timestamp. I.e., Most 64-
> > > > >    bit platforms still use newstat now.
> > > > >
> > > > > Q: Why keep _ARCH_WANT_SYS_CLONE definition while there is clone3:
> > > > > A: The latest glibc release (2.34) has some basic support for clone3 but
> > > > >    it isn't complete. E.g., pthread_create() and spawni() have converted
> > > > >    to use clone3 but fork() will still use clone. Moreover, some seccomp
> > > > >    related applications can still not work perfectly with clone3. E.g.,
> > > > >    Chromium sandbox cannot work at all and there is no solution for it,
> > > > >    which is more terrible than the fork() story [1].
> > > > >
> > > > > [1] https://chromium-review.googlesource.com/c/chromium/src/+/2936184
> > > >
> > > > I still think these have to be removed. There is no mainline glibc or musl
> > > > port yet, and neither of them should actually be required. Please remove
> > > > them here, and modify your libc patches accordingly when you send those
> > > > upstream.
> > >
> > > If this is just a problem that can be resolved by upgrading
> > > glibc/musl, I will remove them. But the Chromium problem (or sandbox
> > > problem in general) seems to have no solution now.
> > 
> > I added Christian Brauner to Cc now, maybe he has come across the
> > sandbox problem before and has an idea for a solution.
> 
> (I just got back from LSFMM so I'll reply in more detail next week. I'm
> still pretty jet-lagged.)

Right, I forgot about the EPERM/ENOSYS sandbox thread.

Kees and I gave a talk about this problem at LPC 2019 (see [2]). The
proposed solution back then was to add basic deep argument inspection
for first-level pointers to seccomp.

There are problems with this approach: it isn't usable on second-level
pointers (although we concluded that's ok), and if the input args are
very large, copying them from within seccomp becomes rather costly. In
general the various approaches seemed handwavy at the time.

If seccomp were made to support some basic form of eBPF such that it
can still be safely called by unprivileged users then this would likely
be easier to do (famous last words), but given that the stance has
traditionally been not to port seccomp to eBPF it remains a tricky patch.

Some time after that I talked to Mathieu Desnoyers about this issue, and
he took another angle of attack. The idea seems less complicated to me.
Instead of argument inspection we introduce basic syscall argument
checksumming for seccomp. It would only be done when seccomp is
interested in syscall input args and checksumming would be per syscall
argument. It would be validated within the syscall when it actually
reads the arguments; again, only if seccomp is used. If the checksums
mismatch, an error is returned or the calling process is terminated.
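
To make the idea concrete, here is a rough userspace sketch of
per-argument checksumming. Nothing like this exists in the kernel today:
struct arg_record, record_arg() and fetch_arg_checked() are made-up
names, FNV-1a is just a stand-in checksum, and memcpy() stands in for
copy_from_user().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* FNV-1a: an arbitrary stand-in, not a proposal for the real checksum. */
static uint64_t arg_checksum(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint64_t h = 0xcbf29ce484222325ULL;	/* FNV offset basis */

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;		/* FNV prime */
	}
	return h;
}

/* Hypothetical per-argument state kept between filter and syscall. */
struct arg_record {
	uint64_t sum;
	size_t len;
};

/* Filter time: remember a checksum of what the filter inspected. */
static void record_arg(struct arg_record *rec, const void *user_buf,
		       size_t len)
{
	assert(rec && user_buf);
	rec->sum = arg_checksum(user_buf, len);
	rec->len = len;
}

/*
 * Syscall time: copy the argument in and verify the copy that will
 * actually be used. Returns 0 on a match, -1 on a mismatch.
 */
static int fetch_arg_checked(const struct arg_record *rec, void *dst,
			     const void *user_buf, size_t len)
{
	assert(rec && dst && user_buf);
	if (len != rec->len)
		return -1;
	memcpy(dst, user_buf, len);	/* stands in for copy_from_user() */
	return arg_checksum(dst, len) == rec->sum ? 0 : -1;
}
```

If another thread rewrites the buffer between record_arg() and
fetch_arg_checked(), the second call returns -1, which is the point
where an error would be returned or the task killed.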

There's one case that deserves mentioning: since we introduced the
seccomp notifier we do allow advanced syscall interception and we do use
it extensively in various projects.

Roughly, it works by allowing a userspace process (the "supervisor") to
listen on a seccomp fd. The seccomp fd is an fd referring to the filter
of a target task (the "supervisee"). When the supervisee performs a
syscall listed in the seccomp notify filter the supervisor will receive
a notification on the seccomp fd for the filter.

I mention this because it is possible for the supervisor to e.g.
intercept a bpf() system call, modify/create/attach a bpf program for
the supervisee, and then update fields in the supervisee's bpf struct
that it passed to the bpf() syscall. So the supervisor might rewrite
syscall args and continue the syscall. (In general, that's not
recommended because of TOCTOU, but it's still doable in certain
scenarios where we can guarantee that this is safe even if syscall args
are rewritten to something else by a MITM attack.)
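
For readers who haven't used the notifier, the supervisor side looks
roughly like the sketch below. handle_one_notification() is an
illustrative name, and it assumes notify_fd was obtained elsewhere (e.g.
from a filter installed with SECCOMP_FILTER_FLAG_NEW_LISTENER). It uses
fixed-size structs for brevity; robust code would query the sizes with
SECCOMP_GET_NOTIF_SIZES first.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/seccomp.h>

#ifndef SECCOMP_USER_NOTIF_FLAG_CONTINUE
#define SECCOMP_USER_NOTIF_FLAG_CONTINUE (1UL << 0)	/* older headers */
#endif

/*
 * Receive one notification and let the syscall continue.
 * Returns 0 on success, -1 on error with errno set.
 */
static int handle_one_notification(int notify_fd)
{
	struct seccomp_notif req;
	struct seccomp_notif_resp resp;

	/* The kernel expects a zeroed request buffer. */
	memset(&req, 0, sizeof(req));
	if (ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_RECV, &req) < 0)
		return -1;

	/* ... inspect req.pid, req.data.nr and req.data.args[] here ... */

	/* Make sure the supervisee is still blocked in this syscall. */
	if (ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_ID_VALID, &req.id) < 0)
		return -1;

	/* With the CONTINUE flag, error and val must stay zero. */
	memset(&resp, 0, sizeof(resp));
	resp.id = req.id;
	resp.flags = SECCOMP_USER_NOTIF_FLAG_CONTINUE;
	if (ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_SEND, &resp) < 0)
		return -1;
	return 0;
}
```

Anything the supervisor read from the supervisee's memory before the
SEND can have changed by the time the kernel re-reads it, which is the
TOCTOU window mentioned above.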

Arguably, the checksumming approach could even be made to work with this
if the seccomp fd learns a new ioctl() or similar to safely update the
checksum.

I can try and move a PoC for this up the todo list.

Without an approach like this certain sandboxes will fall back to
ENOSYSing system calls they can't filter. This is a generic problem
though, with clone3() being one prominent example.
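
As an illustration of what "ENOSYSing" means in practice, a minimal
classic-BPF seccomp filter along these lines might look as follows. This
is a sketch, not what any particular sandbox ships: a real filter would
also check seccomp_data.arch first, and the fallback value for
__NR_clone3 assumes the generic syscall table.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/syscall.h>

#ifndef __NR_clone3
#define __NR_clone3 435	/* generic syscall table; assumption for old headers */
#endif

static const struct sock_filter clone3_enosys_filter[] = {
	/* A = seccomp_data.nr */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
		 offsetof(struct seccomp_data, nr)),
	/* if (A == __NR_clone3) run the next insn, else skip it */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_clone3, 0, 1),
	/* clone3(): fail with ENOSYS so libc falls back to clone() */
	BPF_STMT(BPF_RET | BPF_K,
		 SECCOMP_RET_ERRNO | (ENOSYS & SECCOMP_RET_DATA)),
	/* everything else: allow */
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};
```

The filter can only see the pointer to struct clone_args, not the flags
behind it, so it cannot tell a harmless clone3() from one it would want
to reject; pretending the syscall doesn't exist is the only safe answer.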

[2]: https://www.youtube.com/watch?v=PnOSPsRzVYM&list=PLVsQ_xZBEyN2Ol7y8axxhbTsG47Va3Se2
