From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752602AbdGFAUM (ORCPT <rfc822;w@1wt.eu>);
        Wed, 5 Jul 2017 20:20:12 -0400
Received: from mail.kernel.org ([198.145.29.99]:47846 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752440AbdGFAUL (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 5 Jul 2017 20:20:11 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 90FFB22BDC
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=luto@kernel.org
MIME-Version: 1.0
In-Reply-To: <CAGXu5j+khT6eirko-HLnimZZXe6h2XMWZQCYj8b_ARr-j67J8Q@mail.gmail.com>
References: <1499126133.2707.20.camel@decadent.org.uk> <CA+55aFzMX72+Kb=zNgjCf6UfPt+C+e7WDp_rpbSLuOVx1k7iqg@mail.gmail.com>
 <20170704084122.GC14722@dhcp22.suse.cz> <20170704093538.GF14722@dhcp22.suse.cz>
 <20170704094728.GB22013@1wt.eu> <20170704104211.GG14722@dhcp22.suse.cz>
 <20170704113611.GA4732@decadent.org.uk> <1499209315.2707.29.camel@decadent.org.uk>
 <CA+55aFzjibuaV=GBZQN8KOaCpf1P3B1abJCYGaxEM9RVpf9fXg@mail.gmail.com>
 <1499257180.2707.34.camel@decadent.org.uk> <20170705142354.GB21220@dhcp22.suse.cz>
 <CALCETrW5P8ZhuOJ4bL0x+hx52RdFkG4_x0qAQofrdBi2JavxmA@mail.gmail.com>
 <CA+55aFwn3bbfxXMTafQVXz0nx5hWTUzQFdaqOFvJSY96Muc3VA@mail.gmail.com>
 <CALCETrUMsnV=jWJBtFxfJPdnoszLvkXwT8NuaQnOQOoUOpVjCw@mail.gmail.com> <CAGXu5j+khT6eirko-HLnimZZXe6h2XMWZQCYj8b_ARr-j67J8Q@mail.gmail.com>
From: Andy Lutomirski <luto@kernel.org>
Date: Wed, 5 Jul 2017 17:19:47 -0700
X-Gmail-Original-Message-ID: <CALCETrVhvvhwdqER25_dKbGXX35UYvwfQrxz9VfofbKCqTQLRQ@mail.gmail.com>
Message-ID: <CALCETrVhvvhwdqER25_dKbGXX35UYvwfQrxz9VfofbKCqTQLRQ@mail.gmail.com>
Subject: Re: [PATCH] mm: larger stack guard gap, between vmas
To: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@kernel.org>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Michal Hocko <mhocko@kernel.org>, Ben Hutchings <ben@decadent.org.uk>,
        Willy Tarreau <w@1wt.eu>, Hugh Dickins <hughd@google.com>,
        Oleg Nesterov <oleg@redhat.com>,
        "Jason A. Donenfeld" <Jason@zx2c4.com>, Rik van Riel <riel@redhat.com>,
        Larry Woodman <lwoodman@redhat.com>,
        "Kirill A. Shutemov" <kirill@shutemov.name>,
        Tony Luck <tony.luck@intel.com>,
        "James E.J. Bottomley" <jejb@parisc-linux.org>,
        Helge Diller <deller@gmx.de>, James Hogan <james.hogan@imgtec.com>,
        Laura Abbott <labbott@redhat.com>, Greg KH <greg@kroah.com>,
        "security@kernel.org" <security@kernel.org>,
        Qualys Security Advisory <qsa@qualys.com>,
        LKML <linux-kernel@vger.kernel.org>, Ximin Luo <infinity0@debian.org>
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 5, 2017 at 4:50 PM, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Jul 5, 2017 at 10:23 AM, Andy Lutomirski <luto@kernel.org> wrote:
>> Right.  But I think the approach that we're all taking here is a bit
>> nutty.  We all realize that this issue is a longstanding *GCC* bug
>> [1], but we're acting like it's a Big Deal (tm) kernel bug that Must
>> Be Fixed (tm) and therefore is allowed to break ABI.  My security hat
>> is normally pretty hard-line, but I think it may be time to call BS.
>>
>> Imagine if Kees had sent some symlink hardening patch that was
>> default-on and broke a stock distro.  Or if I had sent a vsyscall
>> hardening patch that broke real code.  It would get reverted right
>> away, probably along with a diatribe about how we should have known
>> better.  I think this stack gap stuff is the same thing.  It's not a
>> security fix -- it's a hardening patch.
>>
>> Looking at it that way, I think a new inherited-on-exec flag is nucking futs.
>>
>> I'm starting to think that the right approach is to mostly revert all
>> this stuff (the execve fixes are fine).  Then start over and think
>> about it as hardening.  I would suggest the following approach:
>>
>>  - The stack gap is one page, just like it's been for years.
>>  - As a hardening feature, if the stack would expand within 64k or
>> whatever of a non-MAP_FIXED mapping, refuse to expand it.  (This might
>> have to be a non-hinted mapping, not just a non-MAP_FIXED mapping.)
>> The idea being that, if you deliberately place a mapping under the
>> stack, you know what you're doing.  If you're like LibreOffice and do
>> something daft and are thus exploitable, you're on your own.
>>  - As a hardening measure, don't let mmap without MAP_FIXED position
>> something within 64k or whatever of the bottom of the stack unless a
>> MAP_FIXED mapping is between them.
>>
>> And that's all.  It's not like a 64k gap actually fixes these bugs for
>> real -- it just makes them harder to exploit.
>>
>> [1] The code that GCC generates for char buf[bug number] and alloca()
>> is flat-out wrong.  Everyone who's ever thought about it all all knows
>> it and has known about it for years, but no one cared to fix it.
>
> As part of that should we put restrictions on the environment of
> set*id exec too? Part of the risks demonstrated by Qualys was that
> allowing a privilege-elevating binary to inherit rlimits can have lead
> to the nasty memory layout side-effects. That would fall into the
> "hardening" bucket as well. And if it turns out there is some set*id
> binary out there that can't run with "only", e.g., 128MB of stack, we
> can make it configurable...

Yes.  I think it's ridiculous that you can change rlimits and then
exec a setuid thing.  It's not so easy to fix, though.  Maybe track,
per-task, inherited by clone and exec, what the rlimits were the last
time the process had privilege and reset to those limits when running
something setuid.  But a better approach might be to have some sysctls
that say what the rlimits become when doing setuid.

We need per-user-ns sysctls for stuff like this, and we don't really
have them...

--Andy