From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Nicholas Piggin"
To: "Doug Anderson"
Cc: "Petr Mladek", "Andrew Morton", "Sumit Garg", "Mark Rutland", "Matthias Kaehlcke", "Stephane Eranian", "Stephen Boyd", "Tzung-Bi Shih", "Lecopzer Chen", "Masayoshi Mizuma", "Guenter Roeck", "Pingfan Liu", "Andi Kleen", "Ian Rogers", "Randy Dunlap", "Chen-Yu Tsai", "Will Deacon", "Marc Zyngier", "Catalin Marinas", "Daniel Thompson", "Colin Cross"
Subject: Re: [PATCH v4 13/17] watchdog/hardlockup: detect hard lockups using secondary (buddy) CPUs
Date: Mon, 08 May 2023 11:04:40 +1000
References: <20230504221349.1535669-1-dianders@chromium.org> <20230504151100.v4.13.I6bf789d21d0c3d75d382e7e51a804a7a51315f2c@changeid>
X-Mailer: aerc 0.14.0
X-Mailing-List: linux-perf-users@vger.kernel.org

On Sat May 6, 2023 at 2:35 AM AEST, Doug Anderson wrote:
> Hi,
>
> On Thu, May 4, 2023 at 7:36 PM Nicholas Piggin wrote:
> >
> > On Fri May 5, 2023 at 8:13 AM AEST, Douglas Anderson wrote:
> > > From: Colin Cross
> > >
> > > Implement a hardlockup detector that doesn't need any extra
> > > arch-specific support code to detect lockups. Instead of using
> > > something arch-specific we will use the buddy system, where each CPU
> > > watches out for another one. Specifically, each CPU will use its
> > > softlockup hrtimer to check that the next CPU is processing hrtimer
> > > interrupts by verifying that a counter is increasing.
> >
> > Powerpc's watchdog has an SMP checker, did you see it?
>
> No, I wasn't aware of it. Interesting, it seems to basically enable
> both types of hardlockup detectors together. If that really catches
> more lockups, it seems like we could do the same thing for the buddy
> system.

It doesn't catch more lockups. On powerpc we don't have a reliable
periodic NMI, hence the SMP checker. But it is preferable that a CPU
detects its own lockup because NMI IPIs can result in crashes if
they are taken in certain critical sections.

> If people want, I don't think it would be very hard to make
> the buddy system _not_ exclusive of the perf system. Instead of having
> the buddy system implement the "weak" functions I could just call the
> buddy functions in the right places directly and leave the "weak"
> functions for a more traditional hardlockup detector to implement.
> Opinions?
>
> Maybe after all this lands, the powerpc watchdog could move to use the
> common code? As evidenced by this patch series, there's not really a
> reason for the SMP detection to be platform specific.

The powerpc SMP checker could certainly move to common code if
others wanted to use it.

> > It's all to
> > all rather than buddy which makes it more complicated but arguably
> > a bit better functionality.
>
> Can you come up with an example crash where the "all to all" would
> work better than the simple buddy system provided by this patch?

CPU2                     CPU3
spin_lock_irqsave(A)     spin_lock_irqsave(B)
spin_lock_irqsave(B)     spin_lock_irqsave(A)

CPU1 will detect the lockup on CPU2, but CPU3's lockup won't be
detected so we don't get the trace that can diagnose the bug.

Another thing I actually found it useful for is you can easily
see if a core (i.e., all threads in the core) or a chip has
died. Maybe more useful when doing presilicon and bring up work
or firmware hacking, but still useful.

Thanks,
Nick

> It
> seems like they would be equivalent, but I could be missing something.
> Specifically they both need at least one non-locked-up CPU to detect a
> problem. If one or more CPUs is locked up then we'll always detect it.
> I suppose maybe you could provide a better error message at lockup
> time saying that several CPUs were locked up and that could be
> helpful. For now, I'd keep the current buddy system the way it is and
> if you want to provide a patch improving things to be "all-to-all" in
> the future that would be interesting to review.
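
For readers following the thread, the two-CPU deadlock example above can be
sketched as a small user-space simulation. This is only an illustration under
stated assumptions, not the code from the patch or from the powerpc watchdog:
NR_CPUS, cpu_is_stuck(), buddy_detect() and all_to_all_detect() are
hypothetical names, and a CPU is treated as hard locked up when its
timer-interrupt counter did not advance during one sample period.

```c
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical system size, chosen for the illustration */

/*
 * now[i] is CPU i's timer-interrupt counter at this sample, prev[i] the
 * value seen one sample period earlier. A counter that did not advance
 * means the CPU is no longer taking timer interrupts.
 */
static bool cpu_is_stuck(const unsigned long *now, const unsigned long *prev,
                         int cpu)
{
    return now[cpu] == prev[cpu];
}

/*
 * Buddy scheme: each CPU checks only the next CPU (wrapping around).
 * Returns a bitmask of the CPUs that get reported as locked up.
 */
static unsigned int buddy_detect(const unsigned long *now,
                                 const unsigned long *prev)
{
    unsigned int reported = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        int buddy = (cpu + 1) % NR_CPUS;

        if (cpu_is_stuck(now, prev, cpu))
            continue; /* a stuck CPU cannot run its own check */
        if (cpu_is_stuck(now, prev, buddy))
            reported |= 1u << buddy;
    }
    return reported;
}

/*
 * All-to-all scheme: every live CPU checks every other CPU.
 * Returns a bitmask of the CPUs that get reported as locked up.
 */
static unsigned int all_to_all_detect(const unsigned long *now,
                                      const unsigned long *prev)
{
    unsigned int reported = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu_is_stuck(now, prev, cpu))
            continue; /* a stuck CPU cannot run its own check */
        for (int other = 0; other < NR_CPUS; other++) {
            if (other != cpu && cpu_is_stuck(now, prev, other))
                reported |= 1u << other;
        }
    }
    return reported;
}
```

With the counters of CPU2 and CPU3 frozen and those of CPU0 and CPU1 still
advancing, buddy_detect() reports only CPU2, because the CPU responsible for
watching CPU3 is CPU2, which is itself stuck. all_to_all_detect() reports both
CPU2 and CPU3, which is the case where the second backtrace becomes available.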