Message-ID: <20220716230952.787452088@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf,
 Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn,
 "H.J. Lu", Joao Moreira, Joseph Nuzman, Steven Rostedt
Subject: [patch 02/38] x86/cpu: Use native_wrmsrl() in load_percpu_segment()
References: <20220716230344.239749011@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Date: Sun, 17 Jul 2022 01:17:12 +0200 (CEST)
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

load_percpu_segment() is using wrmsrl() which is paravirtualized. That's an
issue because the code sequence is:

        __loadsegment_simple(gs, 0);
        wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));

So anything which uses a per CPU variable between setting GS to 0 and
writing GSBASE is going to end up in a NULL pointer dereference. That can
be triggered with instrumentation and is guaranteed to be triggered with
callthunks for call depth tracking.

Use native_wrmsrl() instead. XEN_PV will trap and emulate, but that's not a
hot path.

Also make it static and mark it noinstr so that neither kprobes, sanitizers
nor any other instrumentation can touch it.

Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/processor.h |    1 -
 arch/x86/kernel/cpu/common.c     |   12 ++++++++++--
 2 files changed, 10 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -673,7 +673,6 @@ extern struct desc_ptr early_gdt_descr;
 extern void switch_to_new_gdt(int);
 extern void load_direct_gdt(int);
 extern void load_fixmap_gdt(int);
-extern void load_percpu_segment(int);
 extern void cpu_init(void);
 extern void cpu_init_secondary(void);
 extern void cpu_init_exception_handling(void);
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -701,13 +701,21 @@ static const char *table_lookup_model(st
 __u32 cpu_caps_cleared[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
 __u32 cpu_caps_set[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
 
-void load_percpu_segment(int cpu)
+static noinstr void load_percpu_segment(int cpu)
 {
 #ifdef CONFIG_X86_32
 	loadsegment(fs, __KERNEL_PERCPU);
 #else
 	__loadsegment_simple(gs, 0);
-	wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));
+	/*
+	 * Because of the __loadsegment_simple(gs, 0) above, any GS-prefixed
+	 * instruction will explode right about here. As such, we must not have
+	 * any CALL-thunks using per-cpu data.
+	 *
+	 * Therefore, use native_wrmsrl() and have XenPV take the fault and
+	 * emulate.
+	 */
+	native_wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));
 #endif
 }
 
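
For illustration only, the ordering hazard described in the changelog can be
modelled in plain user-space C. This is a rough sketch, not kernel code: every
name below (percpu_area, gs_base, touch_percpu, the model_* helpers and
run_sequence) is hypothetical, and the "fault" is merely counted instead of
being a real NULL pointer dereference. It assumes that an instrumented or
paravirtualized write path touches per-CPU data before the base is written,
which is the case the changelog describes for callthunks.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a per-CPU area; not a kernel structure. */
struct percpu_area {
	unsigned long call_depth;
};

static struct percpu_area cpu0_area;

/* Models GSBASE: NULL after GS is set to 0, valid after the MSR write. */
static struct percpu_area *gs_base;

/* Counts per-CPU accesses that happened while the modelled base was NULL. */
static int faults;

/* Models a GS-relative access, e.g. from a call thunk or a tracer. */
static void touch_percpu(void)
{
	if (gs_base == NULL) {
		faults++;
		return;
	}
	gs_base->call_depth++;
}

/* Models __loadsegment_simple(gs, 0) as the changelog describes its effect. */
static void model_load_gs_zero(void)
{
	gs_base = NULL;
}

/* Models native_wrmsrl(): the base is written with nothing in between. */
static void model_native_write(struct percpu_area *base)
{
	gs_base = base;
}

/* Models an instrumented write: per-CPU data is touched before the write. */
static void model_instrumented_write(struct percpu_area *base)
{
	touch_percpu();
	gs_base = base;
}

/* Runs the two-step sequence and returns how many accesses hit a NULL base. */
static int run_sequence(void (*write_base)(struct percpu_area *))
{
	faults = 0;
	model_load_gs_zero();
	write_base(&cpu0_area);
	assert(gs_base == &cpu0_area);
	touch_percpu();		/* fine: the base is valid again */
	return faults;
}
```

In this model run_sequence(model_instrumented_write) returns 1 and
run_sequence(model_native_write) returns 0. The only difference between the
two runs is whether anything touches per-CPU data inside the window, which is
why the patch both switches to native_wrmsrl() and marks the function noinstr.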