From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965054AbbLRU2g (ORCPT ); Fri, 18 Dec 2015 15:28:36 -0500 Received: from mail-ob0-f173.google.com ([209.85.214.173]:34951 "EHLO mail-ob0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752459AbbLRU2e (ORCPT ); Fri, 18 Dec 2015 15:28:34 -0500 MIME-Version: 1.0 In-Reply-To: <56746774.8000707@linux.intel.com> References: <56736BD1.5080700@linux.intel.com> <5673750B.606@linux.intel.com> <567453AF.5060808@linux.intel.com> <56746774.8000707@linux.intel.com> From: Andy Lutomirski Date: Fri, 18 Dec 2015 12:28:14 -0800 Message-ID: Subject: Re: Rethinking sigcontext's xfeatures slightly for PKRU's benefit? To: Dave Hansen Cc: "H. Peter Anvin" , Oleg Nesterov , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Brian Gerst , "linux-kernel@vger.kernel.org" , Linus Torvalds , Christoph Hellwig Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Dec 18, 2015 at 12:07 PM, Dave Hansen wrote: > On 12/18/2015 11:21 AM, Andy Lutomirski wrote: >> On Fri, Dec 18, 2015 at 10:42 AM, Dave Hansen >> wrote: >>> On 12/18/2015 08:04 AM, Andy Lutomirski wrote: >>>> 1b. If the app malfunctions such that RSP points to pmem, the kernel >>>> MUST NOT clobber the pmem space. I think that this basically mandates >>>> that PKRU needs to have some safe state (i.e. definitely not the init >>>> state) on signal delivery: the kernel is going to write a signal frame >>>> at the address identified by RSP, and that address is in pmem, so >>>> those writes need to fail. >>> >>> The kernel is writing the signal frame using normal old copy_to_user(). >>> Those are writing through mappings with _PAGE_USER set and should be >>> subject to the PKRU state of the thread before the signal started to be >>> delivered. >>> >>> We don't do the fpu__clear() until after this copy, so I think pkeys >>> enforcement is being done properly for this today. >> >> True, but I think only in a very limited sense. Your average signal >> handler is reasonably like to execute "push $rbp" as its very first >> instruction, at which point we're immediately screwed with the current >> arrangement. > > I completely agree that there's a window for corruption. > > But, I think it's a small one. Basically, RSP would have to pointing at > a place which was allowed by protection keys for all of the sigframe > setup. Then, _just_ happened to be at a place which was denied by > protection keys when it enters the signal handler back in userspace. > It's possible, but it's a small window. That's true. OTOH, I think that the signal in the middle of the wrpkru-allowed window is also a compelling case. So maybe my new preference is option B, where we save and restore PKRU like all the other xfeatures but, rather than clearing it in the signal context, we set it to the syscall-specified value. If we do this, then we'd have to clearly separate the syscall-specified value from the xstate value, and we'd probably want one of the syscalls to offer an option to force user PKRU to match the syscall value. Maybe your code already does this -- I didn't check. This has the nice property that sigreturn isn't affected at all except insofar as we should give a little bit of thought to the ordering of restoring PKRU relative to any other uaccess. I'd imagine that we'd just restore PKRU last, in case the stack we're restoring from has access disallowed by the saved PKRU value. --Andy