Message-ID: <1528473039.8058.11.camel@2b52.sc.intel.com>
Subject: Re: [PATCH 04/10] x86/cet: Handle thread shadow stack
From: Yu-cheng Yu
To: Andy Lutomirski
Cc: Florian Weimer, LKML,
 linux-doc@vger.kernel.org, Linux-MM, linux-arch, X86 ML,
 "H. Peter Anvin", Thomas Gleixner, Ingo Molnar, "H. J. Lu",
 "Shanbhogue, Vedvyas", "Ravi V. Shankar", Dave Hansen,
 Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, mike.kravetz@oracle.com
Date: Fri, 08 Jun 2018 08:50:39 -0700
References: <20180607143807.3611-1-yu-cheng.yu@intel.com>
 <20180607143807.3611-5-yu-cheng.yu@intel.com>
 <3c1bdf85-0c52-39ed-a799-e26ac0e52391@redhat.com>
 <6ee29e8b-4a0a-3459-a1ee-03923ba4e15d@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2018-06-08 at 08:01 -0700, Andy Lutomirski wrote:
> On Fri, Jun 8, 2018 at 7:53 AM Florian Weimer wrote:
> >
> > On 06/07/2018 10:53 PM, Andy Lutomirski wrote:
> > > On Thu, Jun 7, 2018 at 12:47 PM Florian Weimer wrote:
> > >>
> > >> On 06/07/2018 08:21 PM, Andy Lutomirski wrote:
> > >>> On Thu, Jun 7, 2018 at 7:41 AM Yu-cheng Yu wrote:
> > >>>>
> > >>>> When fork() specifies CLONE_VM but not CLONE_VFORK, the child
> > >>>> needs a separate program stack and a separate shadow stack.
> > >>>> This patch handles allocation and freeing of the thread shadow
> > >>>> stack.
> > >>>
> > >>> Aha -- you're trying to make this automatic. I'm not convinced this
> > >>> is a good idea. The Linux kernel has a long and storied history of
> > >>> enabling new hardware features in ways that are almost entirely
> > >>> useless for userspace.
> > >>>
> > >>> Florian, do you have any thoughts on how the user/kernel interaction
> > >>> for the shadow stack should work?
> > >>
> > >> I have not looked at this in detail, have not played with the emulator,
> > >> and have not been privy to any discussions before these patches have
> > >> been posted, however …
> > >>
> > >> I believe that we want as little code in userspace for shadow stack
> > >> management as possible. One concern I have is that even with the code
> > >> we arguably need for various kinds of stack unwinding, we might have
> > >> unwittingly built a generic trampoline that leads to full CET bypass.
> > >
> > > I was imagining an API like "allocate a shadow stack for the current
> > > thread, fail if the current thread already has one, and turn on the
> > > shadow stack". glibc would call clone and then call this ABI pretty
> > > much immediately (i.e. before making any calls from which it expects
> > > to return).
> >
> > Ahh. So you propose not to enable shadow stack enforcement on the new
> > thread even if it is enabled for the current thread? For the cases
> > where CLONE_VM is involved?
> >
> > It will still need a new assembler wrapper which sets up the shadow
> > stack, and it's probably required to disable signals.
> >
> > I think it should be reasonably safe and actually implementable. But
> > the benefits are not immediately obvious to me.
>
> Doing it this way would have been my first inclination. It would
> avoid all the oddities of the kernel magically creating a VMA when
> clone() is called, guessing the shadow stack size, etc. But I'm okay
> with having the kernel do it automatically, too.

HJ wanted to add an arch_prctl that allocates a new shadow stack and
switches to it. That was mainly for swapcontext. Perhaps we can also use
that for threads? HJ, can you comment on this?

> I think it would be
> very nice to have a way for user code to find out the size of the
> shadow stack and change it, though. (And relocate it, but maybe
> that's impossible. The CET documentation doesn't have a clear
> description of the shadow stack layout.)

The shadow stack is vm_mmap'ed from memory and does not have any special
layout. We can add an arch_prctl to find out the shadow stack's address
and size.

> >
> > > We definitely want strong enough user control that tools like CRIU can
> > > continue to work. I haven't looked at the SDM recently enough to
> > > remember for sure, but I'm reasonably confident that user code can
> > > learn the address of its own shadow stack. If nothing else, CRIU
> > > needs to be able to restore from a context where there's a signal on
> > > the stack and the signal frame contains a shadow stack pointer.
> >
> > CRIU also needs the shadow stack *contents*, which shouldn't be directly
> > available to the process. So it needs very special interfaces anyway.
>
> True. I proposed in a different email that ptrace() have full control
> of the shadow stack (read, write, lock, unlock, etc).

ptrace() can do PTRACE_POKEDATA on the shadow stack. We can add
lock/unlock.

> >
> > Does CRIU implement MPX support?
>
> Dunno. But given that MPX seems to be dying, I'm not sure it matters.
>
> --Andy