From mboxrd@z Thu Jan 1 00:00:00 1970
From: Martin Schwidefsky
Subject: Re: [kernel-hardening] Re: [PATCH v9 1/4] syscalls: Verify address limit before returning to user-mode
Date: Fri, 12 May 2017 07:54:58 +0200
Message-ID: <20170512075458.09a3a1ce@mschwideX1>
References: <20170428153213.137279-1-thgarnie@google.com>
 <20170508073352.caqe3fqf7nuxypgi@gmail.com>
 <20170508075209.7aluvpwildw325rf@gmail.com>
 <1494256932.1167.1.camel@gmail.com>
 <20170509065619.wmqa6z6w3n6xpvrw@gmail.com>
 <20170509111007.GA14702@kroah.com>
 <20170512072802.5a686f23@mschwideX1>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
To: Kees Cook
Cc: Linus Torvalds, Thomas Garnier, Greg KH, Ingo Molnar, Daniel Micay,
 Heiko Carstens, Dave Hansen, Arnd Bergmann, Thomas Gleixner,
 David Howells, =?UTF-8?B?UmVuw6k=?= Nyffenegger, Andrew Morton,
 "Paul E . McKenney", "Eric W . Biederman", Oleg Nesterov,
 Pavel Tikhomirov, Ingo Molnar, "H . Peter Anvin", Andy Lutomirski, Pao
List-Id: linux-api@vger.kernel.org

On Thu, 11 May 2017 22:34:31 -0700
Kees Cook wrote:

> On Thu, May 11, 2017 at 10:28 PM, Martin Schwidefsky
> wrote:
> > On Thu, 11 May 2017 16:44:07 -0700
> > Linus Torvalds wrote:
> >
> >> On Thu, May 11, 2017 at 4:17 PM, Thomas Garnier wrote:
> >> >
> >> > Ingo: Do you want the change as-is? Would you like it to be optional?
> >> > What do you think?
> >>
> >> I'm not ingo, but I don't like that patch. It's in the wrong place -
> >> that system call return code is too timing-critical to add address
> >> limit checks.
> >>
> >> Now what I think you *could* do is:
> >>
> >>  - make "set_fs()" actually set a work flag in the current thread flags
> >>
> >>  - do the test in the slow-path (syscall_return_slowpath).
> >>
> >> Yes, yes, that ends up being architecture-specific, but it's fairly simple.
> >>
> >> And it only slows down the system calls that actually use "set_fs()".
> >> Sure, it will slow those down a fair amount, but they are hopefully a
> >> small subset of all cases.
> >>
> >> How does that sound to people? That's where we currently do that
> >>
> >>         if (IS_ENABLED(CONFIG_PROVE_LOCKING) &&
> >>             WARN(irqs_disabled(), "syscall %ld left IRQs disabled",
> >>                  regs->orig_ax))
> >>                 local_irq_enable();
> >>
> >> check too, which is a fairly similar issue.
> >
> > This is exactly what Heiko did for the s390 backend as a result of this
> > discussion. See the _CIF_ASCE_SECONDARY bit in arch/s390/kernel/entry.S,
> > for the hot path the check for the bit is included in the general
> > _CIF_WORK test. Only the slow path gets a bit slower.
> >
> > git commit b5a882fcf146c87cb6b67c6df353e1c042b8773d
> > "s390: restore address space when returning to user space".
>
> If I'm understanding this, it won't catch corruption of addr_limit
> during fast-path syscalls, though (i.e. addr_limit changed without a
> call to set_fs()). :( This addr_limit corruption is mostly only a risk
> on archs without THREAD_INFO_IN_TASK, but it would still be nice to catch
> unbalanced set_fs() code, so I like the idea. I like getting rid of
> addr_limit entirely even more, but that'll take some time. :)

Well, for s390 there is no addr_limit, as we use two separate address
spaces for kernel vs. user. The equivalent of the addr_limit corruption
on a fast-path syscall would be changing CR7 outside of set_fs.

This boils down to the question of what we are protecting against:
bad code with unbalanced set_fs, or evil code that changes addr_limit/CR7
outside of set_fs?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
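
For reference, the scheme discussed in this thread can be sketched as a
small user-space model. This is only an illustration under stated
assumptions, not the x86 or s390 implementation: every sim_* and SIM_*
name below is hypothetical, and the limit values are placeholders. It
shows set_fs() raising a work flag, the return path testing one combined
work mask, and the address-limit check living only in the slow path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the kernel's real definitions. */
#define SIM_USER_DS         UINT64_C(0x00007fffffffffff)
#define SIM_KERNEL_DS       UINT64_MAX

#define SIM_TIF_SIGPENDING  (1u << 0)   /* some unrelated pending work */
#define SIM_TIF_FSCHECK     (1u << 1)   /* set_fs() was called         */
#define SIM_WORK_MASK       (SIM_TIF_SIGPENDING | SIM_TIF_FSCHECK)

struct sim_thread {
	uint64_t     addr_limit;
	unsigned int flags;
	unsigned int violations;    /* times the slow path found a bad limit */
};

/* Model of set_fs(): change the limit and request a slow-path check. */
static void sim_set_fs(struct sim_thread *t, uint64_t limit)
{
	assert(t != NULL);
	t->addr_limit = limit;
	t->flags |= SIM_TIF_FSCHECK;
}

/* Slow path: the only place where the address limit is verified. */
static void sim_return_slowpath(struct sim_thread *t)
{
	if (t->flags & SIM_TIF_FSCHECK) {
		t->flags &= ~SIM_TIF_FSCHECK;
		if (t->addr_limit != SIM_USER_DS) {
			t->violations++;              /* unbalanced set_fs() */
			t->addr_limit = SIM_USER_DS;  /* restore before returning */
		}
	}
}

/*
 * Return-to-user model. The fast path tests a single combined mask, so
 * it costs nothing extra. Returns true if the slow path was taken.
 */
static bool sim_syscall_return(struct sim_thread *t)
{
	if (!(t->flags & SIM_WORK_MASK))
		return false;
	sim_return_slowpath(t);
	return true;
}
```

In this model a syscall that never calls sim_set_fs() stays on the fast
path, and an unbalanced sim_set_fs(SIM_KERNEL_DS) is caught and repaired
on return. Writing addr_limit directly, without going through
sim_set_fs(), leaves the flag clear and is not noticed, which is the gap
Kees describes above.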