From: "Guilherme G. Piccoli"
Date: Tue, 17 May 2022 13:37:58 -0300
Subject: Re: [PATCH 19/30] panic: Add the panic hypervisor notifier list
To: Petr Mladek, Evan Green, David Gow, Julius Werner
Cc: Scott Branden, bcm-kernel-feedback-list@broadcom.com,
 Sebastian Reichel, Linux PM, Florian Fainelli, Andrew Morton,
 bhe@redhat.com, kexec@lists.infradead.org, LKML,
 linuxppc-dev@lists.ozlabs.org, linux-alpha@vger.kernel.org,
 linux-arm Mailing List, linux-edac@vger.kernel.org,
 linux-hyperv@vger.kernel.org, linux-leds@vger.kernel.org,
 linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
 linux-remoteproc@vger.kernel.org, linux-s390@vger.kernel.org,
 linux-tegra@vger.kernel.org, linux-um@lists.infradead.org,
 linux-xtensa@linux-xtensa.org, netdev@vger.kernel.org,
 openipmi-developer@lists.sourceforge.net, rcu@vger.kernel.org,
 sparclinux@vger.kernel.org, xen-devel@lists.xenproject.org,
 x86@kernel.org, kernel-dev@igalia.com, kernel@gpiccoli.net,
 halves@canonical.com, fabiomirmar@gmail.com,
 alejandro.j.jimenez@oracle.com, Andy Shevchenko, Arnd Bergmann,
 Borislav Petkov, Jonathan Corbet, d.hatayama@jp.fujitsu.com,
 dave.hansen@linux.intel.com, dyoung@redhat.com, feng.tang@intel.com,
 Greg Kroah-Hartman, mikelley@microsoft.com,
 hidehiro.kawai.ez@hitachi.com, jgross@suse.com,
 john.ogness@linutronix.de, Kees Cook, luto@kernel.org,
 mhiramat@kernel.org, mingo@redhat.com, paulmck@kernel.org,
 peterz@infradead.org, rostedt@goodmis.org, senozhatsky@chromium.org,
 Alan Stern, Thomas Gleixner, vgoyal@redhat.com, vkuznets@redhat.com,
 Will Deacon, Alexander Gordeev, Andrea Parri, Ard Biesheuvel,
 Benjamin Herrenschmidt, Brian Norris, Christian Borntraeger,
 Christophe JAILLET, "David S. Miller", Dexuan Cui, Doug Berger,
 Haiyang Zhang, Hari Bathini, Heiko Carstens, Justin Chen,
 "K. Y. Srinivasan", Lee Jones, Markus Mayer, Michael Ellerman,
 Mihai Carabas, Nicholas Piggin, Paul Mackerras, Pavel Machek,
 Shile Zhang, Stephen Hemminger, Sven Schnelle, Thomas Bogendoerfer,
 Tianyu Lan, Vasily Gorbik, Wang ShaoBo, Wei Liu, zhenwei pi,
 Stephen Boyd
References: <20220427224924.592546-1-gpiccoli@igalia.com>
 <20220427224924.592546-20-gpiccoli@igalia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 17/05/2022 10:28, Petr Mladek wrote:
> [...]
>>> Disagree here. I'm looping Google maintainers, so they can comment.
>>> (CCed Evan, David, Julius)
>>>
>>> This notifier is clearly a hypervisor notification mechanism. I've fixed
>>> a locking stuff there (in previous patch), I feel it's low-risk but even
>>> if it's mid-risk, the class of such callback remains a perfect fit with
>>> the hypervisor list IMHO.
>>
>> This logs a panic to our "eventlog", a tiny logging area in SPI flash
>> for critical and power-related events. In some cases this ends up
>> being the only clue we get in a Chromebook feedback report that a
>> panic occurred, so from my perspective moving it to the front of the
>> line seems like a good idea.
>
> IMHO, this would really better fit into the pre-reboot notifier list:
>
>   + the callback stores the log so it is similar to kmsg_dump()
>     or console_flush_on_panic()
>
>   + the callback should be proceed after "info" notifiers
>     that might add some other useful information.
>
> Honestly, I am not sure what exactly hypervisor callbacks do. But I
> think that they do not try to extract the kernel log because they
> would need to handle the internal format.
>

I guess the main point in your response is: "I am not sure what exactly
hypervisor callbacks do". We need to be sure about the semantics of such
a list, and agree on that.

So, my opinion about this first list, which we call the "hypervisor
list", is that it contains callbacks that:

(1) should run early, preferably before kdump (or, even if kdump isn't
set, should run ASAP);

(2) perform some communication with an abstraction that runs "below"
the kernel, like firmware or a hypervisor. Classic example: pvpanic,
which communicates with the VMM (usually qemu) and allows that VMM to
snapshot the full guest memory;

(3) should be low-risk. What defines risk is the level of reliability
of subsequent operations: if a callback has, say, a 50% chance of
totally "bricking" the system and preventing kdump / kmsg_dump() /
reboot, it is a high-risk one.

Some good fits IMO: pvpanic, sstate_panic_event() [sparc], fadump in
powerpc, etc.

So, this is a good case for the Google notifier as well - it's not
collecting data like the dmesg (hence your second bullet seems not to
apply here; info notifiers won't add info to be collected by gsmi). It
is a firmware/hypervisor/whatever-gsmi-is notification mechanism, one
that tells this "lower" abstraction that a panic occurred. It seems low
risk and we want it to run ASAP, if possible.

So, I'd like to keep it here, unless the gsmi maintainers disagree or
I'm perhaps misunderstanding the meaning of this first list.

Cheers,


Guilherme
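
For illustration only, the three criteria for the first list can be
sketched in plain userspace C. This is not kernel code: struct
panic_cb_traits, fits_hypervisor_list() and both example entries are
hypothetical names made up for this sketch, and the trait values given
for gsmi simply restate the opinion argued in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a panic callback; these are not kernel structures. */
struct panic_cb_traits {
	const char *name;
	bool wants_early_run;      /* (1) should run ASAP, preferably before kdump */
	bool notifies_lower_layer; /* (2) talks to firmware or a hypervisor */
	bool low_risk;             /* (3) unlikely to break kdump / kmsg_dump() / reboot */
};

/* A callback fits the first ("hypervisor") list only if all three criteria hold. */
static bool fits_hypervisor_list(const struct panic_cb_traits *cb)
{
	assert(cb != NULL);
	return cb->wants_early_run && cb->notifies_lower_layer && cb->low_risk;
}

/* Traits as argued in this message for the gsmi notifier (an opinion, not a measurement). */
static const struct panic_cb_traits gsmi_as_argued = {
	.name = "gsmi",
	.wants_early_run = true,
	.notifies_lower_layer = true,
	.low_risk = true,
};

/* Made-up counter-example: a callback that may prevent the later steps from running. */
static const struct panic_cb_traits risky_example = {
	.name = "risky_example",
	.wants_early_run = true,
	.notifies_lower_layer = true,
	.low_risk = false,
};
```

With these assumed traits, fits_hypervisor_list(&gsmi_as_argued) is true
and fits_hypervisor_list(&risky_example) is false. If the gsmi
maintainers judge the risk differently, only the low_risk value changes,
not the rule.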