From mboxrd@z Thu Jan 1 00:00:00 1970
From: Feng Tang
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H .
 Peter Anvin", Peter Zijlstra, x86@kernel.org, linux-kernel@vger.kernel.org
Cc: paulmck@kernel.org, rui.zhang@intel.com, andi.kleen@intel.com,
 len.brown@intel.com, tim.c.chen@intel.com, Feng Tang
Subject: [PATCH v3 2/2] x86/tsc: skip tsc watchdog checking for qualified platforms
Date: Wed, 17 Nov 2021 10:37:51 +0800
Message-Id: <20211117023751.24190-2-feng.tang@intel.com>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20211117023751.24190-1-feng.tang@intel.com>
References: <20211117023751.24190-1-feng.tang@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

There are cases where tsc clocksources are wrongly judged as unstable by
clocksource watchdogs like hpet, acpi_pm or 'refined-jiffies'. As there
is hardly a generally reliable way to check the validity of a watchdog,
Thomas Gleixner proposed [1] the following to protect the innocent tsc:

  "I'm inclined to lift that requirement when the CPU has:

    1) X86_FEATURE_CONSTANT_TSC
    2) X86_FEATURE_NONSTOP_TSC
    3) X86_FEATURE_NONSTOP_TSC_S3
    4) X86_FEATURE_TSC_ADJUST
    5) At max. 4 sockets

   After two decades of horrors we're finally at a point where TSC seems
   to be halfway reliable and less abused by BIOS tinkerers. TSC_ADJUST
   was really key as we can now detect even small modifications reliably
   and the important point is that we can cure them as well (not pretty
   but better than all other options)."

As feature #3 X86_FEATURE_NONSTOP_TSC_S3 only exists on several
generations of Atom processors, and is always coupled with
X86_FEATURE_CONSTANT_TSC and X86_FEATURE_NONSTOP_TSC, skip checking it.
Also be more defensive and use a maximum of 2 sockets.

The check is done inside tsc_init() before registering the 'tsc-early'
and 'tsc' clocksources, as there were cases where both of them had been
wrongly judged as unreliable.

For more background on tsc/watchdog, there is a good summary in [2].

[1]. https://lore.kernel.org/lkml/87eekfk8bd.fsf@nanos.tec.linutronix.de/
[2]. https://lore.kernel.org/lkml/87a6pimt1f.ffs@nanos.tec.linutronix.de/

Suggested-by: Thomas Gleixner
Signed-off-by: Feng Tang
---
Change log:

  v3:
    * rebased against 5.16-rc1
    * refine commit log

  v2:
    * Directly skip watchdog check without messing flag
      'tsc_clocksource_reliable' (Thomas)

 arch/x86/kernel/tsc.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 2e076a459a0c..389511f59101 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1180,6 +1180,12 @@ void mark_tsc_unstable(char *reason)

 EXPORT_SYMBOL_GPL(mark_tsc_unstable);

+static void __init tsc_skip_watchdog_verify(void)
+{
+	clocksource_tsc_early.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
+	clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
+}
+
 static void __init check_system_tsc_reliable(void)
 {
 #if defined(CONFIG_MGEODEGX1) || defined(CONFIG_MGEODE_LX) || defined(CONFIG_X86_GENERIC)
@@ -1196,6 +1202,17 @@ static void __init check_system_tsc_reliable(void)
 #endif
 	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
 		tsc_clocksource_reliable = 1;
+
+	/*
+	 * Ideally the socket number should be checked, but this is called
+	 * by tsc_init() which is in early boot phase and the socket numbers
+	 * may not be available. Use 'nr_online_nodes' as a fallback solution
+	 */
+	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
+	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
+	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
+	    nr_online_nodes <= 2)
+		tsc_skip_watchdog_verify();
 }

 /*
@@ -1387,9 +1404,6 @@ static int __init init_tsc_clocksource(void)
 	if (tsc_unstable)
 		goto unreg;

-	if (tsc_clocksource_reliable || no_tsc_watchdog)
-		clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
-
 	if (boot_cpu_has(X86_FEATURE_NONSTOP_TSC_S3))
 		clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;

@@ -1527,7 +1541,7 @@ void __init tsc_init(void)
 	}

 	if (tsc_clocksource_reliable || no_tsc_watchdog)
-		clocksource_tsc_early.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
+		tsc_skip_watchdog_verify();

 	clocksource_register_khz(&clocksource_tsc_early, tsc_khz);
 	detect_art();
-- 
2.27.0
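
For readers who want to reason about the qualifying condition outside the
kernel tree, here is a minimal user-space sketch of the decision the patch
adds to check_system_tsc_reliable(). It is only a model: every model_* name
and the MODEL_MUST_VERIFY bit value are hypothetical stand-ins for
boot_cpu_has(), nr_online_nodes and CLOCK_SOURCE_MUST_VERIFY, not kernel
API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CLOCK_SOURCE_MUST_VERIFY; the bit is arbitrary here. */
#define MODEL_MUST_VERIFY 0x02UL

/* Models the inputs the patch consults; not a kernel structure. */
struct model_cpu {
	bool constant_tsc;	/* models X86_FEATURE_CONSTANT_TSC */
	bool nonstop_tsc;	/* models X86_FEATURE_NONSTOP_TSC */
	bool tsc_adjust;	/* models X86_FEATURE_TSC_ADJUST */
	int online_nodes;	/* models nr_online_nodes */
};

/* Models only the flags word of a clocksource. */
struct model_clocksource {
	unsigned long flags;
};

/* True when all three features are present and at most 2 nodes are online. */
bool model_tsc_qualifies(const struct model_cpu *cpu)
{
	assert(cpu != NULL);
	return cpu->constant_tsc && cpu->nonstop_tsc &&
	       cpu->tsc_adjust && cpu->online_nodes <= 2;
}

/* Mirrors tsc_skip_watchdog_verify(): clear the flag on both clocksources. */
void model_skip_watchdog_verify(struct model_clocksource *early,
				struct model_clocksource *tsc)
{
	assert(early != NULL && tsc != NULL);
	early->flags &= ~MODEL_MUST_VERIFY;
	tsc->flags &= ~MODEL_MUST_VERIFY;
}

/* Models the new tail of check_system_tsc_reliable(). */
void model_check_tsc(const struct model_cpu *cpu,
		     struct model_clocksource *early,
		     struct model_clocksource *tsc)
{
	if (model_tsc_qualifies(cpu))
		model_skip_watchdog_verify(early, tsc);
}
```

In this model a 2-node machine with all three features ends up with the
verify bit cleared on both clocksources, while a 4-node machine keeps it
set and so stays under watchdog supervision.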