LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Kees Cook <keescook@chromium.org>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: "linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Marc Zyngier <Marc.Zyngier@arm.com>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Sekhar Nori <nsekhar@ti.com>, Roman Peniaev <r.peniaev@gmail.com>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	Christoffer Dall <christoffer.dall@linaro.org>
Subject: Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
Date: Wed, 21 Jan 2015 15:32:12 -0800
Message-ID: <CAGXu5jKzif=vp6gn5ZtrTx-JTN367qFphobnt9s=awbaafwoUw@mail.gmail.com> (raw)
In-Reply-To: <20150120230458.GM26493@n2100.arm.linux.org.uk>

On Tue, Jan 20, 2015 at 3:04 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Jan 20, 2015 at 10:45:19PM +0000, Russell King - ARM Linux wrote:
>> Well, the whole question is this: is restarting a system call like
>> usleep() really a separate system call, or is it a kernel implementation
>> detail?
>>
>> If you wanted seccomp to see this, what would be the use case?  Why
>> would seccomp want to block this syscall?  Does it make sense for
>> seccomp to block this syscall when it doesn't block something like
>> usleep() and then have usleep() fail just because the thread received
>> a signal?
>>
>> I personally regard the whole restart system call thing as a purely
>> kernel internal thing which should not be exposed to userland.  If
>> we decide that it should be exposed to userland, then it becomes part
>> of the user ABI, and it /could/ become difficult if we needed to
>> change it in future - and I'd rather not get into the "oh shit, we
>> can't change it because that would break app X" crap.
>
> Here's a scenario where it could become a problem:
>
> Let's say that we want to use seccomp to secure some code which issues
> system calls.  We determine that the app uses system calls which don't
> result in the restart system call being issued, so we decide to ask
> seccomp to block the restart system call.  Some of these system calls
> that the app was using are restartable system calls.
>
> When these system calls are restarted, what we see via ptrace etc is
> that the system call simply gets re-issued as its own system call.
>
> In a future kernel version, we decide that we could really do with one
> of those system calls using the restart block feature, so we arrange
> for it to set up the restart block, and return -ERESTART_BLOCK.  That's
> fine for most applications, but this app now breaks.
>
> The side effect of that breakage is that we have to revert that kernel
> change - because we've broken userland, and that's simply not allowed.
>
> Now look at the alternative: we don't make the restart syscall visible.
> This means that we hide that detail, and we actually reflect the
> behaviour that we've had for the other system call restart mechanisms,
> and we don't have to fear userspace breakage as a result of switching
> from one restart mechanism to another.
>
> I am very much of the opinion that we should be trying to limit the
> exposure of inappropriate kernel internal details to userland, because
> userland has a habbit of becoming reliant on them, and when it does,
> it makes kernel maintanence unnecessarily harder.

I totally agree with you. :) My question here is more about what we
should do with what we currently have since we have some unexpected
combinations.

There is already an __NR_restart_syscall syscall and it seems like
it's already part of the userspace ABI:
 - it is possible to call it from userspace directly
 - seccomp "sees" it
 - ptrace doesn't see it

Native ARM64 hides the restart from both seccomp and ptrace, and this
seems like the right idea, except that restart_syscall is still
callable from userspace. I don't think there's a way to make that
vanish, which means we'll always have an exposed syscall. If anything
goes wrong with it (which we've been quite close to recently[1]),
there would be no way to have seccomp filter it.

So, at the least, I'd like arm64 to NOT hide restart_syscall from
seccomp, and at best I'd like both arm and arm64 to (somehow) entirely
remove restart_syscall from the userspace ABI so it wouldn't need to
be filtered, and wouldn't become a weird ABI hiccup as you've
described.

I fail to imagine a way to remove restart_syscall from userspace, so
I'm left with wanting parity of behavior between ARM and ARM64 (and
x86). What's the right way forward?

-Kees

[1] https://git.kernel.org/linus/849151dd5481bc8acb1d287a299b5d6a4ca9f1c3

-- 
Kees Cook
Chrome OS Security

  reply index

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-11 14:32 [PATCH 0/2] ARM: set thread_info->syscall just before sys_* execution Roman Pen
2015-01-11 14:32 ` [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall Roman Pen
2015-01-12 18:39   ` Will Deacon
2015-01-13  8:35     ` Roman Peniaev
2015-01-14  2:23       ` Roman Peniaev
2015-01-14 20:51       ` Kees Cook
2015-01-15  1:54         ` Roman Peniaev
2015-01-15 22:54           ` Kees Cook
2015-01-16 15:57             ` Roman Peniaev
2015-01-16 15:59               ` Russell King - ARM Linux
2015-01-16 16:08                 ` Roman Peniaev
2015-01-16 16:17                   ` Russell King - ARM Linux
2015-01-16 19:57                     ` Kees Cook
2015-01-16 23:54                       ` Kees Cook
2015-01-19  5:58                         ` Roman Peniaev
2015-01-20 18:56                           ` Kees Cook
2015-01-19  9:20                         ` Will Deacon
2015-01-20 18:31                           ` Kees Cook
2015-01-20 22:45                             ` Russell King - ARM Linux
2015-01-20 23:04                               ` Russell King - ARM Linux
2015-01-21 23:32                                 ` Kees Cook [this message]
2015-01-22  1:24                                   ` Roman Peniaev
2015-01-22 18:07                                     ` Kees Cook
2015-01-23  4:17                                       ` Roman Peniaev
2015-01-11 14:32 ` [PATCH 2/2] ARM: entry-common,ptrace: do not pass scno to syscall_trace_enter Roman Pen
2015-01-13 20:08   ` Kees Cook
2015-01-13 23:21     ` Roman Peniaev
2015-01-13 23:43       ` Kees Cook

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGXu5jKzif=vp6gn5ZtrTx-JTN367qFphobnt9s=awbaafwoUw@mail.gmail.com' \
    --to=keescook@chromium.org \
    --cc=Catalin.Marinas@arm.com \
    --cc=Marc.Zyngier@arm.com \
    --cc=christoffer.dall@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@arm.linux.org.uk \
    --cc=nsekhar@ti.com \
    --cc=r.peniaev@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git
	git clone --mirror https://lore.kernel.org/lkml/10 lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git