From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andy Lutomirski <luto@kernel.org>
Date: Thu, 23 Jul 2020 14:30:06 -0700
Subject: Re: [PATCH RFC V2 17/17] x86/entry: Preserve PKRS MSR across exceptions
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Fenghua Yu, Dave Hansen, Andy Lutomirski, Weiny Ira, Ingo Molnar,
 Borislav Petkov, Peter Zijlstra, Dave Hansen, X86 ML, Dan Williams,
 Vishal Verma, Andrew Morton, "open list:DOCUMENTATION", LKML,
 linux-nvdimm, Linux FS Devel, Linux-MM,
 "open list:KERNEL SELFTEST FRAMEWORK"
References: <20200723165204.GB77434@romley-ivt3.sc.intel.com>
 <87imeevv6b.fsf@nanos.tec.linutronix.de>
In-Reply-To: <87imeevv6b.fsf@nanos.tec.linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

> On Jul 23, 2020, at 1:22 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Andy Lutomirski <luto@amacapital.net> writes:
>
>> Suppose some kernel code (a syscall or kernel thread) changes PKRS
>> then takes a page fault. The page fault handler needs a fresh
>> PKRS. Then the page fault handler (say a VMA’s .fault handler) changes
>> PKRS.  Then we get an interrupt. The interrupt *also* needs a fresh
>> PKRS and the page fault value needs to be saved somewhere.
>>
>> So we have more than one saved value per thread, and thread_struct
>> isn’t going to solve this problem.
>
> A stack of 7 entries and an index needs 32 bytes total which is a
> reasonable amount and solves the problem including scheduling from #PF
> nicely. Make it 15 and it's still only 64 bytes.
>
>> But idtentry_state is also not great for a couple of reasons.  Not all
>> entries have idtentry_state, and the unwinder can’t find it for
>> debugging. For that matter, the page fault logic probably wants to
>> know the previous PKRS, so it should either be stashed somewhere
>> findable or it should be explicitly passed around.
>>
>> My suggestion is to enlarge pt_regs.  The save and restore logic can
>> probably be in C, but pt_regs is the logical place to put a register
>> that is saved and restored across all entries.
>
> Kinda, but that still sucks because schedule from #PF will get it wrong
> unless you do extra nasties.

This seems like we’re reinventing the wheel.  PKRS is not
fundamentally different from, say, RSP.  If we want to save it across
exceptions, we save it on entry and context-switch-out and restore it
on exit and context-switch-in.

>
>> Whoever does this work will have the delightful job of figuring out
>> whether BPF thinks that the layout of pt_regs is ABI and, if so,
>> fixing the resulting mess.
>>
>> The fact that the new fields will go at the beginning of pt_regs will
>> make this an entertaining prospect.
>
> Good luck with all of that.

We can always cheat like this:

struct real_pt_regs {
  unsigned long pkrs;
  struct pt_regs regs;
};

and pass a pointer to regs around.  What BPF doesn't know about can't hurt it.
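
To make the real_pt_regs idea concrete, here is a minimal userspace sketch of how it could work. It is not kernel code: struct pt_regs is a toy stand-in, fake_pkrs_msr replaces the real MSR access, and PKRS_FRESH, entry_save_pkrs(), exit_restore_pkrs(), saved_pkrs() and demo_nested_entries() are hypothetical names invented for illustration. The sketch assumes each entry builds its own frame, so nested entries (page fault, then interrupt) each carry their own saved value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the kernel's struct pt_regs (the real one is larger). */
struct pt_regs {
	unsigned long ip;
	unsigned long sp;
};

/* The wrapper from the message: pkrs sits in front of the regs. */
struct real_pt_regs {
	unsigned long pkrs;
	struct pt_regs regs;
};

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for the PKRS MSR (a kernel would use rdmsr/wrmsr). */
static uint32_t fake_pkrs_msr;

/* Assumed "fresh" value loaded on every entry; illustrative only. */
#define PKRS_FRESH 0u

/* Entry: stash the interrupted context's PKRS in this frame, load a fresh one. */
static void entry_save_pkrs(struct real_pt_regs *frame)
{
	frame->pkrs = fake_pkrs_msr;
	fake_pkrs_msr = PKRS_FRESH;
}

/* Exit: put back whatever the interrupted context was running with. */
static void exit_restore_pkrs(struct real_pt_regs *frame)
{
	fake_pkrs_msr = (uint32_t)frame->pkrs;
}

/*
 * Handlers are only handed a struct pt_regs *. Code that knows about the
 * wrapper can still reach the saved PKRS; code that does not (e.g. BPF)
 * sees an unchanged pt_regs layout.
 */
static unsigned long saved_pkrs(struct pt_regs *regs)
{
	assert(regs != NULL);
	return container_of(regs, struct real_pt_regs, regs)->pkrs;
}

/* The nesting from the thread: kernel code -> page fault -> interrupt. */
int demo_nested_entries(void)
{
	struct real_pt_regs pf_frame = {0}, irq_frame = {0};

	fake_pkrs_msr = 0xA;              /* kernel code changed PKRS */

	entry_save_pkrs(&pf_frame);       /* page fault entry */
	if (saved_pkrs(&pf_frame.regs) != 0xA || fake_pkrs_msr != PKRS_FRESH)
		return -1;
	fake_pkrs_msr = 0xB;              /* .fault handler changes PKRS */

	entry_save_pkrs(&irq_frame);      /* interrupt entry */
	if (saved_pkrs(&irq_frame.regs) != 0xB || fake_pkrs_msr != PKRS_FRESH)
		return -2;

	exit_restore_pkrs(&irq_frame);    /* interrupt exit */
	if (fake_pkrs_msr != 0xB)
		return -3;

	exit_restore_pkrs(&pf_frame);     /* page fault exit */
	return fake_pkrs_msr == 0xA ? 0 : -4;
}
```

Calling demo_nested_entries() from a main() returns 0 when every level gets its own value back. Because each saved value lives in the frame for its own entry, no fixed-depth per-thread array is needed in this sketch; the context-switch save and restore mentioned above is not modeled here.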