Subject: Re: [PATCH v5 00/32] virtually mapped stacks and thread_info cleanup
From: Christian Borntraeger
Date: Wed, 13 Jul 2016 10:54:11 +0200
To: Andy Lutomirski, x86@kernel.org, linux-kernel@vger.kernel.org
Cc: linux-arch@vger.kernel.org, Borislav Petkov, Nadav Amit, Kees Cook, Brian Gerst, "kernel-hardening@lists.openwall.com", Linus Torvalds, Josh Poimboeuf, Jann Horn, Heiko Carstens, linux-s390
Message-Id: <578601B3.3050903@de.ibm.com>
List-ID: linux-kernel@vger.kernel.org

On 07/11/2016 10:53 PM, Andy Lutomirski wrote:
> Hi all-
>
> Since the dawn of time, a kernel stack overflow has been a real PITA
> to debug, has caused nondeterministic crashes some time after the
> actual overflow, and has generally been easy to exploit for root.
>
> With this series, arches can enable HAVE_ARCH_VMAP_STACK. Arches
> that enable it (just x86 for now) get virtually mapped stacks with
> guard pages. This causes reliable faults when the stack overflows.
>
> If the arch implements it well, we get a nice OOPS on stack overflow
> (as opposed to panicking directly or otherwise exploding badly). On
> x86, the OOPS is nice, has a usable call trace, and the overflowing
> task is killed cleanly.
>
> This series (starting with v4) also extensively cleans up
> thread_info. thread_info has been partially redundant with
> thread_struct for a long time -- both are places for arch code to
> add additional per-task variables. thread_struct is much cleaner:
> it's always in task_struct, and there's nothing particularly magical
> about it. So this series contains a bunch of cleanups on x86 to
> move almost everything from thread_info to thread_struct (which,
> even by itself, deletes more code than it adds) and to remove x86's
> dependence on thread_info's position on the stack. Then it opts x86
> into a new config option THREAD_INFO_IN_TASK to get rid of
> arch-specific thread_info entirely and simply embed a defanged
> thread_info (containing only flags) and 'int cpu' into task_struct.
>
> Once thread_info stops being magical, there's another benefit: we
> can free the thread stack as soon as the task is dead (without
> waiting for RCU) and then, if vmapped stacks are in use, cache the
> entire stack for reuse on the same cpu.
>
> This seems to be an overall speedup of about 0.5-1 µs per
> pthread_create/join in a simple test -- a percpu cache of vmalloced
> stacks appears to be a bit faster than a high-order stack
> allocation, at least when the cache hits. (I expect that workloads
> with a low cache hit rate are likely to be dominated by other
> effects anyway.)
>
> This does not address interrupt stacks.
>
> It's worth noting that s390 has an arch-specific gcc feature that
> detects stack overflows by adjusting function prologues. Arches
> with features like that may wish to avoid using vmapped stacks to
> minimize the performance hit.

Yes, we might not need this for stack overflow detection.
What might be interesting is the thread_info/thread_struct change, if we
can strip down thread_info (CONFIG_THREAD_INFO_IN_TASK).

Would it actually make sense to separate these two changes, to see what
performance impact CONFIG_THREAD_INFO_IN_TASK has on its own?