From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 004FFC433E0 for ; Sat, 25 Jul 2020 00:09:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CCC4E20759 for ; Sat, 25 Jul 2020 00:09:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1595635755; bh=btOIZU1MW+6PlkkqDJUz7DcCUStR/8qT0xAu53GlQ6s=; h=References:In-Reply-To:From:Date:Subject:To:Cc:List-ID:From; b=UdzLI4pfjnqtvi2hmaeG6shbLo/hMHFQoLry8/sbIt1/2v2wWeSH28d/VWd3XUuJ8 VJtstfYPwMsFSTZU86Wqitm9ec3dLxRfHUZFTV5DrS4DuWHMcvnBYYkEC3FDGEtp6B 8I2GzXKqPMLNgjcqOX+uR/Eee6uSqA9EC0IUdHOM= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726977AbgGYAJP (ORCPT ); Fri, 24 Jul 2020 20:09:15 -0400 Received: from mail.kernel.org ([198.145.29.99]:58374 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726782AbgGYAJP (ORCPT ); Fri, 24 Jul 2020 20:09:15 -0400 Received: from mail-wm1-f49.google.com (mail-wm1-f49.google.com [209.85.128.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 0DB58207DF for ; Sat, 25 Jul 2020 00:09:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1595635754; bh=btOIZU1MW+6PlkkqDJUz7DcCUStR/8qT0xAu53GlQ6s=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=sxS88DZLD1Ub3RW0YPgyikY7BDShzwcfepCzbZBjkaqBJP77hyvnz9aXpp551L9Pb zU2WTGc5yM4bV7AnuKssI6OXXS3oITRb+uD4VVa0gWA1nxKyleZhPlLV48Zgna5QtC kYQUX5BdvbpUC+Yv/uFXlOSxeVLIjzbZ7mfuQHwg= Received: by mail-wm1-f49.google.com with SMTP id c80so9302636wme.0 for ; Fri, 24 Jul 2020 17:09:13 -0700 (PDT) X-Gm-Message-State: AOAM530zHISeTD+2OMuMYB29RiLgdq1g2bqAtlK35WnbQ7ANCzChq9uE 2g7sz3lfd4miIJUGkYze5bE32xQgX04tMAZ8ID9xTQ== X-Google-Smtp-Source: ABdhPJzODXWRSR4bEqjZdQmjvnAm94COixhWx/ZHwpxi601Bsr0u6vIlMnxzH7evHcvtBmUNUEDn46VTXFtuOAxUs7I= X-Received: by 2002:a1c:56c3:: with SMTP id k186mr10711218wmb.21.1595635752371; Fri, 24 Jul 2020 17:09:12 -0700 (PDT) MIME-Version: 1.0 References: <20200717072056.73134-1-ira.weiny@intel.com> <20200717072056.73134-18-ira.weiny@intel.com> <87r1t2vwi7.fsf@nanos.tec.linutronix.de> <20200723220435.GI844235@iweiny-DESK2.sc.intel.com> <87mu3pvly7.fsf@nanos.tec.linutronix.de> <874kpwtxlh.fsf@nanos.tec.linutronix.de> In-Reply-To: <874kpwtxlh.fsf@nanos.tec.linutronix.de> From: Andy Lutomirski Date: Fri, 24 Jul 2020 17:09:00 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH RFC V2 17/17] x86/entry: Preserve PKRS MSR across exceptions To: Thomas Gleixner , Josh Poimboeuf Cc: Ira Weiny , Ingo Molnar , Borislav Petkov , Andy Lutomirski , Peter Zijlstra , Dave Hansen , X86 ML , Dan Williams , Vishal Verma , Andrew Morton , Fenghua Yu , "open list:DOCUMENTATION" , LKML , linux-nvdimm , Linux FS Devel , Linux-MM , "open list:KERNEL SELFTEST FRAMEWORK" Content-Type: text/plain; charset="UTF-8" Sender: linux-kselftest-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org On Fri, Jul 24, 2020 at 2:25 PM Thomas Gleixner wrote: > > Ira, > > Thomas Gleixner writes: > > Ira Weiny writes: > >> On Thu, Jul 23, 2020 at 09:53:20PM +0200, Thomas Gleixner wrote: > >> I think, after fixing my code (see below), using idtentry_state could still > >> work. If the per-cpu cache and the MSR is updated in idtentry_exit() that > >> should carry the state to the new cpu, correct? > > > > I'm way too tired to think about that now. Will have a look tomorrow > > with brain awake. > > Not that I'm way more awake now, but at least I have the feeling that my > brain is not completely useless. > > Let me summarize what I understood: > > 1) A per CPU cache which shadows the current state of the MSR, i.e. the > current valid key. You use that to avoid costly MSR writes if the > key does not change. > > 2) On idtentry you store the key on entry in idtentry_state, clear it > in the MSR and shadow state if necessary and restore it on exit. > > 3) On context switch out you save the per CPU cache value in the task > and on context switch in you restore it from there. > > Yes, that works (see below for #2) and sorry for my confusion yesterday > about storing this in task state. > > #2 requires to handle the exceptions which do not go through > idtentry_enter/exit() seperately, but that's a manageable amount. It's > the ones which use IDTENTRY_RAW or a variant of it. > > #BP, #MC, #NMI, #DB, #DF need extra local storage as all the kernel > entries for those use nmi_enter()/exit(). So you just can create > wrappers around those. Somehting like this > > static __always_inline idtentry_state_t idtentry_nmi_enter(void) > { > idtentry_state_t state = {}; > > nmi_enter(); > instrumentation_begin(); > state.key = save_and_clear_key(); > instrumentation_end(); > } > > static __always_inline void idtentry_nmi_exit(idtentry_state_t state) > { > instrumentation_begin(); > restore_key(state.key); > instrumentation_end(); > nmi_exit(); > } > > #UD and #PF are using the raw entry variant as well but still invoke > idtentry_enter()/exit(). #PF does not need any work. #UD handles > WARN/BUG without going through idtentry_enter() first, but I don't think > that's an issue unless a not 0 key would prevent writing to the console > device. You surely can figure that out. Putting on my mm maintainer hat for a moment, I really think that we want oopses to print PKRS along with all the other registers when we inevitably oops due to a page fault. And we probably also want it in the nasty nested case where we get infinite page faults and eventually double fault. I'm sure it's *possible* to wire this up if we stick it in idtentry_state, but it's trivial if we stick it in pt_regs. I'm okay with doing the save/restore in C (in fact, I prefer that), but I think that either the value should be stuck in pt_regs or we should find a way to teach the unwinder to locate idtentry_state. And, if we go with idtentry_state, we should make a signature change to nmi_enter() to also provide idtentry_state or some equivalent object. --Andy