From: Andy Lutomirski
Date: Thu, 23 Jul 2020 14:30:06 -0700
Subject: Re: [PATCH RFC V2 17/17] x86/entry: Preserve PKRS MSR across exceptions
To: Thomas Gleixner
Cc: Fenghua Yu, Dave Hansen, Andy Lutomirski, Weiny Ira, Ingo Molnar,
    Borislav Petkov, Peter Zijlstra, Dave Hansen, X86 ML, Dan Williams,
    Vishal Verma, Andrew Morton, "open list:DOCUMENTATION", LKML,
    linux-nvdimm, Linux FS Devel, Linux-MM,
    "open list:KERNEL SELFTEST FRAMEWORK"
References: <20200723165204.GB77434@romley-ivt3.sc.intel.com> <87imeevv6b.fsf@nanos.tec.linutronix.de>
In-Reply-To: <87imeevv6b.fsf@nanos.tec.linutronix.de>

> On Jul 23, 2020, at 1:22 PM, Thomas Gleixner wrote:
>
> Andy Lutomirski writes:
>
>> Suppose some kernel code (a syscall or kernel thread) changes PKRS
>> then takes a page fault. The page fault handler needs a fresh
>> PKRS. Then the page fault handler (say a VMA’s .fault handler) changes
>> PKRS. Then we get an interrupt. The interrupt *also* needs a fresh
>> PKRS and the page fault value needs to be saved somewhere.
>>
>> So we have more than one saved value per thread, and thread_struct
>> isn’t going to solve this problem.
>
> A stack of 7 entries and an index needs 32bytes total which is a
> reasonable amount and solves the problem including scheduling from #PF
> nicely. Make it 15 and it's still only 64 bytes.
>
>> But idtentry_state is also not great for a couple reasons. Not all
>> entries have idtentry_state, and the unwinder can’t find it for
>> debugging. For that matter, the page fault logic probably wants to
>> know the previous PKRS, so it should either be stashed somewhere
>> findable or it should be explicitly passed around.
>>
>> My suggestion is to enlarge pt_regs. The save and restore logic can
>> probably be in C, but pt_regs is the logical place to put a register
>> that is saved and restored across all entries.
>
> Kinda, but that still sucks because schedule from #PF will get it wrong
> unless you do extra nasties.

This seems like we’re reinventing the wheel. PKRS is not fundamentally
different from, say, RSP. If we want to save it across exceptions, we
save it on entry and context-switch-out and restore it on exit and
context-switch-in.

>
>> Whoever does this work will have the delightful job of figuring out
>> whether BPF thinks that the layout of pt_regs is ABI and, if so,
>> fixing the resulting mess.
>>
>> The fact the new fields will go at the beginning of pt_regs will make
>> this an entertaining prospect.
>
> Good luck with all of that.

We can always cheat like this:

struct real_pt_regs {
        unsigned long pkrs;
        struct pt_regs regs;
};

and pass a pointer to regs around. What BPF doesn't know about can't hurt it.
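
For illustration only, the wrapper above could be paired with a small
helper that walks back from the regs pointer to the saved PKRS. This is a
userspace sketch, not kernel code: the two-field struct pt_regs is a
stand-in for the real arch-specific layout, and to_real_pt_regs() and
regs_saved_pkrs() are hypothetical names that do not appear in the patch
series.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct pt_regs (assumption: the real layout
 * is arch-specific and much larger). */
struct pt_regs {
        unsigned long ip;
        unsigned long sp;
};

/* Wrapper from the message: pkrs sits in front of the regs that existing
 * consumers such as BPF already know about. */
struct real_pt_regs {
        unsigned long pkrs;
        struct pt_regs regs;
};

/* pkrs must live entirely before regs, so the layout seen through a
 * struct pt_regs pointer is unchanged. */
static_assert(offsetof(struct real_pt_regs, regs) >= sizeof(unsigned long),
              "pkrs must precede regs");

/* Hypothetical helper: recover the wrapper from a pointer to its regs
 * member, in the style of the kernel's container_of(). Only valid when
 * regs really is embedded in a struct real_pt_regs. */
static inline struct real_pt_regs *to_real_pt_regs(struct pt_regs *regs)
{
        return (struct real_pt_regs *)
                ((char *)regs - offsetof(struct real_pt_regs, regs));
}

/* Hypothetical accessor: the PKRS value saved at entry for this frame. */
static inline unsigned long regs_saved_pkrs(struct pt_regs *regs)
{
        return to_real_pt_regs(regs)->pkrs;
}
```

Code that only ever receives a struct pt_regs pointer sees the same layout
as before, while entry code that knows about the wrapper can reach pkrs
through the helper.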