From: Andy Lutomirski
To: "Yu, Yu-cheng"
Cc: Andy Lutomirski, linux-arch, X86 ML, "H. Peter Anvin", Thomas Gleixner,
    Ingo Molnar, LKML, "open list:DOCUMENTATION", Linux-MM, Linux API,
    Arnd Bergmann, Balbir Singh, Borislav Petkov, Cyrill Gorcunov,
    Dave Hansen, Eugene Syromiatnikov, Florian Weimer, "H.J. Lu", Jann Horn,
    Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov,
    Pavel Machek, Peter Zijlstra, Randy Dunlap, "Ravi V. Shankar",
    Vedvyas Shanbhogue, Dave Martin, Weijiang Yang, Pengfei Xu, Haitao Huang
Subject: Re: extending ucontext (Re: [PATCH v26 25/30] x86/cet/shstk: Handle signals for shadow stack)
Date: Mon, 3 May 2021 08:29:11 -0700
Message-Id: <2D8926E4-F1B6-433A-96EA-995A66F3F42D@amacapital.net>
In-Reply-To: <782ffe96-b830-d13b-db80-5b60f41ccdbf@intel.com>

> On May 3, 2021, at 8:14 AM, Yu, Yu-cheng wrote:
>
> On 5/2/2021 4:23 PM, Andy Lutomirski wrote:
>>> On Fri, Apr 30, 2021 at 10:47 AM Andy Lutomirski wrote:
>>>
>>> On Fri, Apr 30, 2021 at 10:00 AM Yu, Yu-cheng wrote:
>>>>
>>>> On 4/28/2021 4:03 PM, Andy Lutomirski wrote:
>>>>> On Tue, Apr 27, 2021 at 1:44 PM Yu-cheng Yu wrote:
>>>>>>
>>>>>> When shadow stack is enabled, a task's shadow stack states must be saved
>>>>>> along with the signal context and later restored in sigreturn. However,
>>>>>> currently there is no systematic facility for extending a signal context.
>>>>>> There is some space left in the ucontext, but changing ucontext is likely
>>>>>> to create compatibility issues and there is not enough space for further
>>>>>> extensions.
>>>>>>
>>>>>> Introduce a signal context extension struct 'sc_ext', which is used to save
>>>>>> shadow stack restore token address. The extension is located above the fpu
>>>>>> states, plus alignment. The struct can be extended (such as the ibt's
>>>>>> wait_endbr status to be introduced later), and sc_ext.total_size field
>>>>>> keeps track of total size.
>>>>>
>>>>> I still don't like this.
>>>>>
>>>>> Here's how the signal layout works, for better or for worse:
>>>>>
>
> [...]
>
>>>>>
>>>>> That's where we are right now upstream. The kernel has a parser for
>>>>> the FPU state that is bugs piled upon bugs and is going to have to be
>>>>> rewritten sometime soon. On top of all this, we have two upcoming
>>>>> features, both of which require different kinds of extensions:
>>>>>
>>>>> 1. AVX-512. (Yeah, you thought this story was over a few years ago,
>>>>> but no. And AMX makes it worse.) To make a long story short, we
>>>>> promised user code many years ago that a signal frame fit in 2048
>>>>> bytes with some room to spare. With AVX-512 this is false. With AMX
>>>>> it's so wrong it's not even funny. The only way out of the mess
>>>>> anyone has come up with involves making the length of the FPU state
>>>>> vary depending on which features are INIT, i.e. making it more compact
>>>>> than "compact" mode is. This has a side effect: it's no longer
>>>>> possible to modify the state in place, because enabling a feature with
>>>>> no space allocated will make the structure bigger, and the stack won't
>>>>> have room. Fortunately, one can relocate the entire FPU state, update
>>>>> the pointer in mcontext, and the kernel will happily follow the
>>>>> pointer. So new code on a new kernel using a super-compact state
>>>>> could expand the state by allocating new memory (on the heap? very
>>>>> awkwardly on the stack?) and changing the pointer. For all we know,
>>>>> some code already fiddles with the pointer. This is great, except
>>>>> that your patch sticks more data at the end of the FPU block that no
>>>>> one is expecting, and your sigreturn code follows that pointer, and
>>>>> will read off into lala land.
>>>>>
>>>>
>>>> Then, what about we don't do that at all. Is it possible from now on we
>>>> don't stick more data at the end, and take the relocating-fpu approach?
>>>>
>>>>> 2. CET. CET wants us to find a few more bytes somewhere, and those
>>>>> bytes logically belong in ucontext, and here we are.
>>>>>
>>>>
>>>> Fortunately, we can spare CET the need of ucontext extension. When the
>>>> kernel handles sigreturn, the user-mode shadow stack pointer is right at
>>>> the restore token. There is no need to put that in ucontext.
>>>
>>> That seems entirely reasonable. This might also avoid needing to
>>> teach CRIU about CET at all.
>> Wait, what's the actual shadow stack token format? And is the token
>> on the new stack or the old stack when sigaltstack is in use? For
>> that matter, is there any support for an alternate shadow stack for
>> signals?
>
> The restore token is a pointer pointing directly above itself and bit[0] indicates 64-bit mode.
>
> Because the shadow stack stores only return addresses, there is no alternate shadow stack. However, the application can allocate and switch to a new shadow stack.

I think we should make the ABI support an alternate shadow stack even if we
don't implement it initially. After all, some day someone might want to
register a handler for shadow stack overflow.

>
> Yu-cheng
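
For reference, here is a rough sketch of the restore token format as it is
described above: the token holds the address of the slot directly above
itself, with bit[0] set for 64-bit mode. This is only an illustration. The
function names are made up for this sketch and are not kernel code, the
8-byte slot size and alignment are assumptions, and only bit[0] is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only; not taken from the patch set. */
#define SHSTK_SLOT_SIZE   8u           /* assumed: one 64-bit shadow stack slot */
#define SHSTK_TOKEN_MODE  UINT64_C(1)  /* bit[0]: set when the token is for 64-bit mode */

/* Build the value stored at token_addr: the address directly above, plus the mode bit. */
static uint64_t make_restore_token(uint64_t token_addr, bool mode64)
{
    /* Assumed: slots are 8-byte aligned, so the low bits are free for flags. */
    assert((token_addr & (SHSTK_SLOT_SIZE - 1)) == 0);
    return (token_addr + SHSTK_SLOT_SIZE) | (mode64 ? SHSTK_TOKEN_MODE : 0);
}

/* Report whether the token's mode bit marks it as a 64-bit token. */
static bool restore_token_is_64bit(uint64_t token)
{
    return (token & SHSTK_TOKEN_MODE) != 0;
}

/* Check that the value read from token_addr really points directly above itself. */
static bool restore_token_valid(uint64_t token_addr, uint64_t token)
{
    return (token & ~SHSTK_TOKEN_MODE) == token_addr + SHSTK_SLOT_SIZE;
}
```

Because the token encodes its own location, a check like
restore_token_valid() only needs the shadow stack pointer and the value read
from it, which is why nothing extra has to be stored in ucontext for this.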