From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S933682AbcKPSz0 (ORCPT ); Wed, 16 Nov 2016 13:55:26 -0500
Received: from mail-wm0-f49.google.com ([74.125.82.49]:36318 "EHLO
        mail-wm0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S932382AbcKPSzT (ORCPT );
        Wed, 16 Nov 2016 13:55:19 -0500
MIME-Version: 1.0
In-Reply-To: <20161116101548.GO3142@twins.programming.kicks-ass.net>
References: <20161114174446.832175072@infradead.org>
        <20161115084009.GB15734@gmail.com>
        <20161115094744.GG3142@twins.programming.kicks-ass.net>
        <20161115100359.GA7757@gmail.com>
        <20161115104608.GH3142@twins.programming.kicks-ass.net>
        <20161115130315.GA12957@gmail.com>
        <32165F01-DA9E-4287-9831-6EDE40A71E83@infradead.org>
        <20161116083155.GC1270@gmail.com>
        <20161116101548.GO3142@twins.programming.kicks-ass.net>
From: Kees Cook
Date: Wed, 16 Nov 2016 10:55:16 -0800
X-Google-Sender-Auth: ZRE4Mbjoj8-E3whnbWGMBc4qFnU
Message-ID:
Subject: Re: [RFC][PATCH 7/7] kref: Implement using refcount_t
To: Peter Zijlstra
Cc: Ingo Molnar , Greg KH , Will Deacon , "Reshetova, Elena" ,
        Arnd Bergmann , Thomas Gleixner , "H. Peter Anvin" ,
        David Windsor , Linus Torvalds , LKML
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Nov 16, 2016 at 2:15 AM, Peter Zijlstra wrote:
> On Wed, Nov 16, 2016 at 09:31:55AM +0100, Ingo Molnar wrote:
>>
>> * Kees Cook wrote:
>>
>> > On Tue, Nov 15, 2016 at 11:16 AM, Peter Zijlstra wrote:
>> > >
>> > >
>> > > On 15 November 2016 19:06:28 CET, Kees Cook wrote:
>> > >
>> > >>I'll want to modify this in the future; I have a config already doing
>> > >>"Bug on data structure corruption" that makes the warn/bug choice.
>> > >>It'll need some massaging to fit into the new refcount_t checks, but
>> > >>it should be okay -- there needs to be a way to complete the
>> > >>saturation, etc, but still kill the offending process group.
>> > >
>> > > Ideally we'd create a new WARN like construct that continues in kernel space
>> > > and terminates the process on return to user. That way there would be minimal
>> > > kernel state corruption.
>>
>> Yeah, so the problem is that sometimes you are p0wned the moment you return to a
>> corrupted stack, and some of these checks only detect corruption after the fact.
>
> So the case here is about refcounts, with the saturation semantics we
> avoid the use-after-free case which is all this is about. So actually
> continuation of execution is harmless vs the attack vector in question.
>
> Corrupting the stack is another attack vector, one that refcount
> overflow is entirely unrelated to and not one I think we should consider
> here.
>
> The problem with BUG and insta killing the task is that refcounts are
> typically done under locks, if you kill the task before the unlock,
> you've wrecked kernel state in unrecoverable ways.

My intention with what I'm designing is to couple the "panic_on_oops"
sysctl logic with a "kernel structure corruption has been detected"
warning. That way, one can select, at runtime, if the kernel should
panic instantly on hitting this, or just do its best to clean things
up and kill the process.

There basically isn't a use-case for BUG in this situation. Either
you're risk-averse enough to want to take the entire machine down, or
you want to kill the offending process and clean up to continue
running.

I'm still evolving how to best do it, and right now it's a rather
large hammer (now controlled by a CONFIG called
CONFIG_BUG_ON_DATA_CORRUPTION in -next but it will likely disappear
entirely as its design has evolved). I intend to improve it first and
then expand its coverage in the kernel.
It requires extracting some of the per-arch BUG logic into a real
kernel API, and combining it with existing pieces of the WARN API.

-Kees

--
Kees Cook
Nexus Security
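
For illustration only, and not taken from any patch in this thread: a minimal userspace C sketch of the two ideas discussed above, namely a counter that saturates instead of wrapping, and a runtime switch that picks between "take everything down" and "warn and keep going". The names sat_ref, sat_ref_inc, sat_ref_dec, report_corruption and panic_on_corruption are all hypothetical; this is not the kernel's refcount_t or WARN/BUG API, and it is not atomic.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical runtime knob, standing in for a panic_on_oops-style sysctl. */
static bool panic_on_corruption = false;

/* Hypothetical single-threaded counter; NOT the kernel's refcount_t. */
struct sat_ref {
	unsigned int count;
};

/* Report detected corruption, then either stop everything or continue. */
static void report_corruption(const char *what)
{
	fprintf(stderr, "structure corruption detected: %s\n", what);
	if (panic_on_corruption)
		abort();	/* risk-averse choice: take the whole program down */
	/* otherwise return so the caller can unlock and clean up */
}

/*
 * Take a reference. A counter that has reached UINT_MAX stays there
 * instead of wrapping to 0, so the object can never be freed early.
 * Returns false if the counter is saturated.
 */
static bool sat_ref_inc(struct sat_ref *r)
{
	assert(r != NULL);
	if (r->count == UINT_MAX) {
		report_corruption("refcount saturated");
		return false;
	}
	r->count++;
	return true;
}

/*
 * Drop a reference. Returns true only when the count reaches 0 and the
 * caller should free the object. A saturated counter is never
 * decremented: the object is leaked rather than freed while in use.
 */
static bool sat_ref_dec(struct sat_ref *r)
{
	assert(r != NULL);
	if (r->count == UINT_MAX)
		return false;
	if (r->count == 0) {
		report_corruption("refcount underflow");
		return false;
	}
	r->count--;
	return r->count == 0;
}
```

With panic_on_corruption left false, a saturated counter just prints a warning and execution continues, so any lock held by the caller can still be released; setting it to true turns the same event into an immediate abort().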