From: ira.weiny@intel.com
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Andy Lutomirski, Peter Zijlstra
Cc: Ira Weiny, x86@kernel.org, Dave Hansen, Dan Williams, Andrew Morton, Fenghua Yu, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org
Subject: [PATCH RFC V3 8/9] x86/fault: Report the PKRS state on fault
Date: Fri, 9 Oct 2020 12:42:57 -0700
Message-Id: <20201009194258.3207172-9-ira.weiny@intel.com>
In-Reply-To: <20201009194258.3207172-1-ira.weiny@intel.com>
References: <20201009194258.3207172-1-ira.weiny@intel.com>

From: Ira Weiny

When only user space pkeys
are enabled, faulting within the kernel was an unexpected condition
which should never happen. Therefore a WARN_ON_ONCE() was added to the
kernel fault handler to detect if it ever did. Now that PKS can be
enabled, this is no longer the case.

Report a Pkey fault with a normal splat and add the PKRS state to the
fault splat text. Note that the PKS register is reset during an
exception; therefore the saved PKRS value from before the beginning of
the exception is passed down.

If PKS is not enabled, or not active, maintain the WARN_ON_ONCE() from
before.

Because each fault has its own state, the pkrs information will be
correctly reported even if a fault 'faults'.

Suggested-by: Andy Lutomirski
Signed-off-by: Ira Weiny
---
 arch/x86/mm/fault.c | 59 ++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 25 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e55bc4bff389..ee761c993f58 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -504,7 +504,8 @@ static void show_ldttss(const struct desc_ptr *gdt, const char *name, u16 index)
 }
 
 static void
-show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long address)
+show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long address,
+		irqentry_state_t *irq_state)
 {
 	if (!oops_may_print())
 		return;
@@ -548,6 +549,11 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad
 		 (error_code & X86_PF_PK)    ? "protection keys violation" :
 					       "permissions violation");
 
+#ifdef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
+	if (irq_state && (error_code & X86_PF_PK))
+		pr_alert("PKRS: 0x%x\n", irq_state->pkrs);
+#endif
+
 	if (!(error_code & X86_PF_USER) && user_mode(regs)) {
 		struct desc_ptr idt, gdt;
 		u16 ldtr, tr;
@@ -626,7 +632,8 @@ static void set_signal_archinfo(unsigned long address,
 
 static noinline void
 no_context(struct pt_regs *regs, unsigned long error_code,
-	   unsigned long address, int signal, int si_code)
+	   unsigned long address, int signal, int si_code,
+	   irqentry_state_t *irq_state)
 {
 	struct task_struct *tsk = current;
 	unsigned long flags;
@@ -732,7 +739,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	 */
 	flags = oops_begin();
 
-	show_fault_oops(regs, error_code, address);
+	show_fault_oops(regs, error_code, address, irq_state);
 
 	if (task_stack_end_corrupted(tsk))
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
@@ -785,7 +792,8 @@ static bool is_vsyscall_vaddr(unsigned long vaddr)
 
 static void
 __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
-		       unsigned long address, u32 pkey, int si_code)
+		       unsigned long address, u32 pkey, int si_code,
+		       irqentry_state_t *state)
 {
 	struct task_struct *tsk = current;
 
@@ -832,14 +840,14 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 	if (is_f00f_bug(regs, address))
 		return;
 
-	no_context(regs, error_code, address, SIGSEGV, si_code);
+	no_context(regs, error_code, address, SIGSEGV, si_code, state);
 }
 
 static noinline void
 bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
-		     unsigned long address)
+		     unsigned long address, irqentry_state_t *state)
 {
-	__bad_area_nosemaphore(regs, error_code, address, 0, SEGV_MAPERR);
+	__bad_area_nosemaphore(regs, error_code, address, 0, SEGV_MAPERR, state);
 }
 
 static void
@@ -853,7 +861,7 @@ __bad_area(struct pt_regs *regs, unsigned long error_code,
 	 */
 	mmap_read_unlock(mm);
 
-	__bad_area_nosemaphore(regs, error_code, address, pkey, si_code);
+	__bad_area_nosemaphore(regs, error_code, address, pkey, si_code, NULL);
 }
 
 static noinline void
@@ -923,7 +931,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
 {
 	/* Kernel mode? Handle exceptions or die: */
 	if (!(error_code & X86_PF_USER)) {
-		no_context(regs, error_code, address, SIGBUS, BUS_ADRERR);
+		no_context(regs, error_code, address, SIGBUS, BUS_ADRERR, NULL);
 		return;
 	}
 
@@ -957,7 +965,7 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 	       unsigned long address, vm_fault_t fault)
 {
 	if (fatal_signal_pending(current) && !(error_code & X86_PF_USER)) {
-		no_context(regs, error_code, address, 0, 0);
+		no_context(regs, error_code, address, 0, 0, NULL);
 		return;
 	}
 
@@ -965,7 +973,7 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 		/* Kernel mode? Handle exceptions or die: */
 		if (!(error_code & X86_PF_USER)) {
 			no_context(regs, error_code, address,
-				   SIGSEGV, SEGV_MAPERR);
+				   SIGSEGV, SEGV_MAPERR, NULL);
 			return;
 		}
 
@@ -980,7 +988,7 @@ mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 			     VM_FAULT_HWPOISON_LARGE))
 			do_sigbus(regs, error_code, address, fault);
 		else if (fault & VM_FAULT_SIGSEGV)
-			bad_area_nosemaphore(regs, error_code, address);
+			bad_area_nosemaphore(regs, error_code, address, NULL);
 		else
 			BUG();
 	}
@@ -1148,14 +1156,15 @@ static int fault_in_kernel_space(unsigned long address)
  */
 static void
 do_kern_addr_fault(struct pt_regs *regs, unsigned long hw_error_code,
-		   unsigned long address)
+		   unsigned long address, irqentry_state_t *irq_state)
 {
 	/*
-	 * Protection keys exceptions only happen on user pages. We
-	 * have no user pages in the kernel portion of the address
-	 * space, so do not expect them here.
+	 * If protection keys are not enabled for kernel space
+	 * do not expect Pkey errors here.
 	 */
-	WARN_ON_ONCE(hw_error_code & X86_PF_PK);
+	if (!IS_ENABLED(CONFIG_ARCH_HAS_SUPERVISOR_PKEYS) ||
+	    !cpu_feature_enabled(X86_FEATURE_PKS))
+		WARN_ON_ONCE(hw_error_code & X86_PF_PK);
 
 #ifdef CONFIG_X86_32
 	/*
@@ -1204,7 +1213,7 @@ do_kern_addr_fault(struct pt_regs *regs, unsigned long hw_error_code,
 	 * Don't take the mm semaphore here. If we fixup a prefetch
 	 * fault we could otherwise deadlock:
 	 */
-	bad_area_nosemaphore(regs, hw_error_code, address);
+	bad_area_nosemaphore(regs, hw_error_code, address, irq_state);
 }
 NOKPROBE_SYMBOL(do_kern_addr_fault);
 
@@ -1245,7 +1254,7 @@ void do_user_addr_fault(struct pt_regs *regs,
 		     !(hw_error_code & X86_PF_USER) &&
 		     !(regs->flags & X86_EFLAGS_AC)))
 	{
-		bad_area_nosemaphore(regs, hw_error_code, address);
+		bad_area_nosemaphore(regs, hw_error_code, address, NULL);
 		return;
 	}
 
@@ -1254,7 +1263,7 @@ void do_user_addr_fault(struct pt_regs *regs,
 	 * in a region with pagefaults disabled then we must not take the fault
 	 */
 	if (unlikely(faulthandler_disabled() || !mm)) {
-		bad_area_nosemaphore(regs, hw_error_code, address);
+		bad_area_nosemaphore(regs, hw_error_code, address, NULL);
 		return;
 	}
 
@@ -1316,7 +1325,7 @@ void do_user_addr_fault(struct pt_regs *regs,
 			 * Fault from code in kernel from
 			 * which we do not expect faults.
 			 */
-			bad_area_nosemaphore(regs, hw_error_code, address);
+			bad_area_nosemaphore(regs, hw_error_code, address, NULL);
 			return;
 		}
 retry:
@@ -1375,7 +1384,7 @@ void do_user_addr_fault(struct pt_regs *regs,
 	if (fault_signal_pending(fault, regs)) {
 		if (!user_mode(regs))
 			no_context(regs, hw_error_code, address, SIGBUS,
-				   BUS_ADRERR);
+				   BUS_ADRERR, NULL);
 		return;
 	}
 
@@ -1415,7 +1424,7 @@ trace_page_fault_entries(struct pt_regs *regs, unsigned long error_code,
 
 static __always_inline void
 handle_page_fault(struct pt_regs *regs, unsigned long error_code,
-			      unsigned long address)
+			      unsigned long address, irqentry_state_t *irq_state)
 {
 	trace_page_fault_entries(regs, error_code, address);
 
@@ -1424,7 +1433,7 @@ handle_page_fault(struct pt_regs *regs, unsigned long error_code,
 
 	/* Was the fault on kernel-controlled part of the address space? */
 	if (unlikely(fault_in_kernel_space(address))) {
-		do_kern_addr_fault(regs, error_code, address);
+		do_kern_addr_fault(regs, error_code, address, irq_state);
 	} else {
 		do_user_addr_fault(regs, error_code, address);
 		/*
@@ -1479,7 +1488,7 @@ DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
 	irqentry_enter(regs, &state);
 
 	instrumentation_begin();
-	handle_page_fault(regs, error_code, address);
+	handle_page_fault(regs, error_code, address, &state);
 	instrumentation_end();
 
 	irqentry_exit(regs, &state);
-- 
2.28.0.rc0.12.gb6a658bd00c9
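
For readers who want to poke at the two behavioural changes outside the kernel, here is a rough user-space sketch. It is not part of the patch and not kernel code. Every sketch_* name and SKETCH_PF_PK are made up for illustration; the two bool parameters stand in for IS_ENABLED(CONFIG_ARCH_HAS_SUPERVISOR_PKEYS) and cpu_feature_enabled(X86_FEATURE_PKS), and the struct stands in for the pkrs field of irqentry_state_t that this series relies on. It only models (a) when a pkey fault on a kernel address is still treated as unexpected and (b) when the PKRS line is added to the splat.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: stand-in for X86_PF_PK, bit 5 of the page-fault error code. */
#define SKETCH_PF_PK (1UL << 5)

/* Hypothetical per-fault state, modelled on the pkrs field used in the patch. */
struct sketch_fault_state {
	uint32_t pkrs;	/* PKRS value saved before the exception reset it */
};

/*
 * Models the check in do_kern_addr_fault(): a pkey fault is only
 * unexpected when supervisor pkeys are compiled out or the CPU feature
 * is not enabled. In the kernel this condition feeds WARN_ON_ONCE().
 */
static bool sketch_pk_fault_is_unexpected(unsigned long error_code,
					  bool pks_configured,
					  bool pks_cpu_enabled)
{
	if (pks_configured && pks_cpu_enabled)
		return false;

	return (error_code & SKETCH_PF_PK) != 0;
}

/*
 * Models the show_fault_oops() addition: the PKRS line is produced only
 * when per-fault state was passed down and the fault is a pkey fault.
 * Returns the number of characters written, or 0 if nothing is reported.
 */
static int sketch_format_pkrs(char *buf, size_t len, unsigned long error_code,
			      const struct sketch_fault_state *state)
{
	assert(buf != NULL && len > 0);

	buf[0] = '\0';
	if (!state || !(error_code & SKETCH_PF_PK))
		return 0;

	return snprintf(buf, len, "PKRS: 0x%x\n", (unsigned int)state->pkrs);
}
```

Passing a NULL state models the callers in the diff that hand NULL down (do_sigbus(), mm_fault_error(), the do_user_addr_fault() paths): no PKRS line is emitted for them, which matches the irq_state check in show_fault_oops().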