Date: Wed, 16 Feb 2022 13:29:51 +0000
Message-ID: <877d9v3po0.wl-maz@kernel.org>
From: Marc Zyngier
To: Marcin Wojtas
Cc: Linux Kernel Mailing List, netdev, Greg Kroah-Hartman, Russell King, "David S. Miller", Jakub Kicinski, Thomas Gleixner, John Garry, kernel-team@android.com
Subject: Re: [PATCH 0/2] net: mvpp2: Survive CPU hotplug events
References: <20220216090845.1278114-1-maz@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 16 Feb 2022 13:19:30 +0000, Marcin Wojtas wrote:
>
> Hi Marc,
>
> On Wed, 16 Feb 2022 at 10:08, Marc Zyngier wrote:
> >
> > I recently realised that playing with CPU hotplug on a system equipped
> > with a set of MVPP2 devices (Marvell 8040) was fraught with danger and
> > would result in a rapid lockup or panic.
> >
> > As it turns out, the per-CPU nature of the MVPP2 interrupts is
> > getting in the way. A good solution for this seems to rely on the
> > kernel's managed interrupt approach, where the core kernel will not
> > move interrupts around as the CPUs go down, but will simply disable
> > the corresponding interrupt.
> >
> > Converting the driver to this requires a bit of refactoring in the IRQ
> > subsystem to expose the required primitive, as well as a bit of
> > surgery in the driver itself.
> >
> > Note that although the system now survives such an event, the driver
> > seems to assume that all queues are always active and doesn't inform
> > the device that a CPU has gone away. Someone who actually understands
> > this driver should have a look at it.
> >
> > Patches on top of 5.17-rc3, lightly tested on a McBin.
> >
>
> Thank you for the patches. Can you, please, share the commands you
> used? I'd like to test it more.

Offline CPU3:
# echo 0 > /sys/devices/system/cpu/cpu3/online

Online CPU3:
# echo 1 > /sys/devices/system/cpu/cpu3/online

Put that in a loop, using different CPUs. On my HW, turning off CPU0
leads to odd behaviours (I wouldn't be surprised if the firmware was
broken in that respect, and also the fact that the device keeps trying
to send stuff to that CPU...).

	M.

-- 
Without deviation from the norm, progress is not possible.
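
For anyone wanting to reproduce the test, the "put that in a loop" step could look roughly like the sketch below. It only wraps the two echo commands from the message above; the function names, the CPU_SYSFS override and the round count are illustrative assumptions, not something taken from the thread. It needs to run as root against the real sysfs.

```shell
#!/bin/sh
# Hypothetical stress loop around the two echo commands quoted above.
# CPU_SYSFS, set_cpu_online and hotplug_cycle are illustrative names.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# Write 0 (offline) or 1 (online) to the given CPU's "online" control file.
set_cpu_online() {
    target="$1"
    state="$2"
    echo "$state" > "$CPU_SYSFS/cpu$target/online"
}

# Offline, then online, each listed CPU; repeat for the given number of rounds.
# Usage: hotplug_cycle ROUNDS CPU [CPU...]
hotplug_cycle() {
    rounds="$1"
    shift
    i=0
    while [ "$i" -lt "$rounds" ]; do
        for cpu in "$@"; do
            set_cpu_online "$cpu" 0 || return 1
            set_cpu_online "$cpu" 1 || return 1
        done
        i=$((i + 1))
    done
}

# Example (assumes CPUs 1 to 3 exist; CPU0 is left alone because of the
# odd behaviour reported above):
# hotplug_cycle 100 1 2 3
```

Leaving CPU0 out of the list is a choice made here because of the behaviour described above; add it back if that is the case you want to investigate.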