From: "Leizhen (ThunderTown)"
Date: Mon, 7 Nov 2022 11:20:46 +0800
Subject: Re: [PATCH v4 4/4] rcu: Add RCU stall diagnosis information
CC: "Elliott, Robert (Servers)", Frederic Weisbecker, Neeraj Upadhyay,
    Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
    Joel Fernandes, "rcu@vger.kernel.org", "linux-kernel@vger.kernel.org"
In-Reply-To: <20221105203220.GD28461@paulmck-ThinkPad-P17-Gen-1>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/11/6 4:32, Paul E. McKenney wrote:
> On Sat, Nov 05, 2022 at 03:03:14PM +0800, Leizhen (ThunderTown) wrote:
>> On 2022/11/5 9:58, Elliott, Robert (Servers) wrote:
>
> [ . . . ]
>
>>>> +int rcu_cpu_stall_cputime __read_mostly =
>>>> IS_ENABLED(CONFIG_RCU_CPU_STALL_CPUTIME);
>>>
>>> As a config option and module parameter, adding some more
>>> instrumentation overhead might be worthwhile for other
>>> likely causes of rcu stalls.
>>>
>>> For example, if enabled, have these functions (if available
>>> on the architecture) maintain a per-CPU running count of
>>> their invocations, which also cause the CPU to be unavailable
>>> for rcu:
>>> - kernel_fpu_begin() calls - FPU/SIMD context preservation,
>>>   which also calls preempt_disable()
>>> - preempt_disable() calls - scheduler context switches disabled
>>> - local_irq_save() calls - interrupts disabled
>>> - cond_resched() calls - lack of these is a problem
>>>
>>> For kernel_fpu_begin and preempt_disable, knowing if it is
>>> currently blocked for those reasons is probably the most
>>> helpful.
>>
>> These instructions are already in Documentation/RCU/stallwarn.rst
>
> Excellent point -- this document also needs to be updated with this
> new information.  I have pulled in your four patches as noted in my
> previous email.  They are on the -rcu tree's "dev" branch.

OK, thanks.

>
> Could you please send a patch containing an initial update to
> stallwarn.rst?  The main thing I need is your perspective on how each
> field is used.

Okay, I'll add some descriptions to illustrate how to use this function to
identify each RCU stall case.

>
> 							Thanx, Paul
> .
>

-- 
Regards,
  Zhen Lei