From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 995C1C433F5 for ; Tue, 17 May 2022 16:43:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1351124AbiEQQnC (ORCPT ); Tue, 17 May 2022 12:43:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58684 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1350063AbiEQQnB (ORCPT ); Tue, 17 May 2022 12:43:01 -0400 Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C073B3FDAD; Tue, 17 May 2022 09:42:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:From: References:Cc:To:Subject:MIME-Version:Date:Message-ID:Sender:Reply-To: Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender: Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=scD/9Z9PP3JtS/QbR0S82HRrb/qzzKD7mmsgRlJKv48=; b=A0Wed33QPDatmI/wU05BORZ9n1 0hhJk6U25fx6bbbZAsMcq0PkUqYkw6doFXVO3lE2gRlx4mNP2Gg+o2qX5D1iIed/eOGnpU07MiS7v johrL8ZXBr9bRs4m6YspjNjUrp3CRZWOwFXObkbK34aTXOqWVxjbnj+S21xz3SW3iyyzLS8il/lBE YgblTLXs216LyG4XvwA74bc/PHj8Q4o7WtYFZIjzCamKnOfAXbs0CZLburXLM75F/4BH78OE9Feno 1CTSSOT/QCKGLXFRsMSKXpMjqSuNKxrbhetupOiCZ9xDYHRTskdd6CLiaYJY53xEUzZgCM24bzp0B WqYqJCVA==; Received: from 200-161-159-120.dsl.telesp.net.br ([200.161.159.120] helo=[192.168.1.60]) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_128_GCM:128) (Exim) id 1nr0Hb-008gzd-4h; Tue, 17 May 2022 18:42:51 +0200 Message-ID: Date: Tue, 17 May 2022 13:42:06 -0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0 Subject: Re: [PATCH 19/30] panic: Add the panic hypervisor notifier list Content-Language: en-US To: Petr Mladek , Scott Branden , Sebastian Reichel , Florian Fainelli Cc: David Gow , Evan Green , Julius Werner , bcm-kernel-feedback-list@broadcom.com, linux-pm@vger.kernel.org, akpm@linux-foundation.org, bhe@redhat.com, kexec@lists.infradead.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-edac@vger.kernel.org, linux-hyperv@vger.kernel.org, linux-leds@vger.kernel.org, linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org, linux-remoteproc@vger.kernel.org, linux-s390@vger.kernel.org, linux-tegra@vger.kernel.org, linux-um@lists.infradead.org, linux-xtensa@linux-xtensa.org, netdev@vger.kernel.org, openipmi-developer@lists.sourceforge.net, rcu@vger.kernel.org, sparclinux@vger.kernel.org, xen-devel@lists.xenproject.org, x86@kernel.org, kernel-dev@igalia.com, kernel@gpiccoli.net, halves@canonical.com, fabiomirmar@gmail.com, alejandro.j.jimenez@oracle.com, andriy.shevchenko@linux.intel.com, arnd@arndb.de, bp@alien8.de, corbet@lwn.net, d.hatayama@jp.fujitsu.com, dave.hansen@linux.intel.com, dyoung@redhat.com, feng.tang@intel.com, gregkh@linuxfoundation.org, mikelley@microsoft.com, hidehiro.kawai.ez@hitachi.com, jgross@suse.com, john.ogness@linutronix.de, keescook@chromium.org, luto@kernel.org, mhiramat@kernel.org, mingo@redhat.com, paulmck@kernel.org, peterz@infradead.org, rostedt@goodmis.org, senozhatsky@chromium.org, stern@rowland.harvard.edu, tglx@linutronix.de, vgoyal@redhat.com, vkuznets@redhat.com, will@kernel.org, Alexander Gordeev , Andrea Parri , Ard Biesheuvel , Benjamin Herrenschmidt , Brian Norris , Christian Borntraeger , Christophe JAILLET , "David S. Miller" , Dexuan Cui , Doug Berger , Haiyang Zhang , Hari Bathini , Heiko Carstens , Justin Chen , "K. Y. Srinivasan" , Lee Jones , Markus Mayer , Michael Ellerman , Mihai Carabas , Nicholas Piggin , Paul Mackerras , Pavel Machek , Shile Zhang , Stephen Hemminger , Sven Schnelle , Thomas Bogendoerfer , Tianyu Lan , Vasily Gorbik , Wang ShaoBo , Wei Liu , zhenwei pi References: <20220427224924.592546-1-gpiccoli@igalia.com> <20220427224924.592546-20-gpiccoli@igalia.com> From: "Guilherme G. Piccoli" In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-remoteproc@vger.kernel.org On 17/05/2022 10:57, Petr Mladek wrote: > [...] >>>> --- a/drivers/misc/bcm-vk/bcm_vk_dev.c >>>> +++ b/drivers/misc/bcm-vk/bcm_vk_dev.c >>>> @@ -1446,7 +1446,7 @@ static int bcm_vk_probe(struct pci_dev *pdev, const struct pci_device_id *ent) >>> [... snip ...] >>> It seems to reset some hardware or so. IMHO, it should go into the >>> pre-reboot list. >> >> Mixed feelings here, I'm looping Broadcom maintainers to comment. >> (CC Scott and Broadcom list) >> >> I'm afraid it breaks kdump if this device is not reset beforehand - it's >> a doorbell write, so not high risk I think... >> >> But in case the not-reset device can be probed normally in kdump kernel, >> then I'm fine in moving this to the reboot list! I don't have the HW to >> test myself. > > Good question. Well, it if has to be called before kdump then > even "hypervisor" list is a wrong place because is not always > called before kdump. Agreed! I'll defer that to Scott and Broadcom folks to comment. If it's not strictly necessary, I'll happily move it to the reboot list. If necessary, we could use the machine_crash_kexec() approach, but we'll fall into the case arm64 doesn't support it and I'm not sure if this device is available for arm - again a question for the maintainers. > [...] >>>> --- a/drivers/power/reset/ltc2952-poweroff.c >>>> +++ b/drivers/power/reset/ltc2952-poweroff.c >> [...] >> This is setting a variable only, and once it's set (data->kernel_panic >> is the bool's name), it just bails out the IRQ handler and a timer >> setting - this timer seems kinda tricky, so bailing out ASAP makes sense >> IMHO. > > IMHO, the timer informs the hardware that the system is still alive > in the middle of panic(). If the timer is not working then the > hardware (chip) will think that the system frozen in panic() > and will power off the system. See the comments in > drivers/power/reset/ltc2952-poweroff.c: > [.... snip ...] > IMHO, we really have to keep it alive until we reach the reboot stage. > > Another question is how it actually works when the interrupts are > disabled during panic() and the timer callbacks are not handled. Agreed here! Guess I can move this one the reboot list, fine by me. Unless PM folks think otherwise. > [...] >> Disagree here, I'm CCing Florian for information. >> >> This notifier preserves RAM so it's *very interesting* if we have >> kmsg_dump() for example, but maybe might be also relevant in case kdump >> kernel is configured to store something in a persistent RAM (then, >> without this notifier, after kdump reboots the system data would be lost). > > I see. It is actually similar problem as with > drivers/firmware/google/gsmi.c. > > I does similar things like kmsg_dump() so it should be called in > the same location (after info notifier list and before kdump). > > A solution might be to put it at these notifiers at the very > end of the "info" list or make extra "dump" notifier list. Here I still disagree. I've commented in the other response thread (about Google gsmi) about the semantics of the hypervisor list, but again: this list should contain callbacks that (a) Should run early, _by default_ before a kdump; (b) Communicate with the firmware/hypervisor in a "low-risk" way; Imagine a scenario where users configure kdump kernel to save something in a persistent form in DRAM - it'd be like a late pstore, in the next kernel. This callback enables that, it's meant to inform FW "hey, panic happened, please from now on don't clear the RAM in the next FW-reboot". I don't see a reason to postpone that - let's see if the maintainers have an opinion. Cheers, Guilherme From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 637B6C433EF for ; Tue, 17 May 2022 23:27:21 +0000 (UTC) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4L2sj352Ncz3cJr for ; Wed, 18 May 2022 09:27:19 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.a=rsa-sha256 header.s=20170329 header.b=A0Wed33Q; dkim-atps=neutral Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=igalia.com (client-ip=178.60.130.6; helo=fanzine2.igalia.com; envelope-from=gpiccoli@igalia.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.a=rsa-sha256 header.s=20170329 header.b=A0Wed33Q; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4L2hkj1Xhzz3bq9 for ; Wed, 18 May 2022 02:43:09 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:From: References:Cc:To:Subject:MIME-Version:Date:Message-ID:Sender:Reply-To: Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender: Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=scD/9Z9PP3JtS/QbR0S82HRrb/qzzKD7mmsgRlJKv48=; b=A0Wed33QPDatmI/wU05BORZ9n1 0hhJk6U25fx6bbbZAsMcq0PkUqYkw6doFXVO3lE2gRlx4mNP2Gg+o2qX5D1iIed/eOGnpU07MiS7v johrL8ZXBr9bRs4m6YspjNjUrp3CRZWOwFXObkbK34aTXOqWVxjbnj+S21xz3SW3iyyzLS8il/lBE YgblTLXs216LyG4XvwA74bc/PHj8Q4o7WtYFZIjzCamKnOfAXbs0CZLburXLM75F/4BH78OE9Feno 1CTSSOT/QCKGLXFRsMSKXpMjqSuNKxrbhetupOiCZ9xDYHRTskdd6CLiaYJY53xEUzZgCM24bzp0B WqYqJCVA==; Received: from 200-161-159-120.dsl.telesp.net.br ([200.161.159.120] helo=[192.168.1.60]) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_128_GCM:128) (Exim) id 1nr0Hb-008gzd-4h; Tue, 17 May 2022 18:42:51 +0200 Message-ID: Date: Tue, 17 May 2022 13:42:06 -0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0 Subject: Re: [PATCH 19/30] panic: Add the panic hypervisor notifier list Content-Language: en-US To: Petr Mladek , Scott Branden , Sebastian Reichel , Florian Fainelli References: <20220427224924.592546-1-gpiccoli@igalia.com> <20220427224924.592546-20-gpiccoli@igalia.com> From: "Guilherme G. Piccoli" In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Mailman-Approved-At: Wed, 18 May 2022 09:23:23 +1000 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Paul Mackerras , Justin Chen , Pavel Machek , Alexander Gordeev , "K. Y. Srinivasan" , Wei Liu , stern@rowland.harvard.edu, xen-devel@lists.xenproject.org, Christian Borntraeger , linux-pm@vger.kernel.org, linux-um@lists.infradead.org, Nicholas Piggin , luto@kernel.org, Mihai Carabas , tglx@linutronix.de, gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org, senozhatsky@chromium.org, d.hatayama@jp.fujitsu.com, Sven Schnelle , akpm@linux-foundation.org, linux-hyperv@vger.kernel.org, dave.hansen@linux.intel.com, linux-s390@vger.kernel.org, Stephen Hemminger , Vasily Gorbik , vgoyal@redhat.com, mhiramat@kernel.org, Andrea Parri , linux-xtensa@linux-xtensa.org, john.ogness@linutronix.de, Markus Mayer , hidehiro.kawai.ez@hitachi.com, linux-arm-kernel@lists.infradead.org, kernel-dev@igalia.com, fabiomirmar@gmail.com, halves@canonical.com, alejandro.j.jimenez@oracle.com, feng.tang@intel.com, zhenwei pi , will@kernel.org, Doug Berger , bhe@redhat.com, corbet@lwn.net, Dexuan Cui , Evan Green , bcm-kernel-feedback-list@broadcom.com, Tianyu Lan , keescook@chromium.org, arnd@arndb.de, Haiyang Zhang , rostedt@goodmis.org, rcu@vger.kernel.org, bp@alien8.de, openipmi-developer@lists.sourceforge.net, Thomas Bogendoerfer , linux-parisc@vger.kernel.org, linux-alpha@vger.kernel.org, Brian Norris , "David S. Miller" , peterz@infradead.org, linux-remoteproc@vger.kernel.org, mikelley@microsoft.com, sparclinux@vger.kernel.org, Lee Jones , Ard Biesheuvel , linux-leds@vger.kernel.org, x86@kernel.org, mingo@redhat.com, dyoung@redhat.com, paulmck@kernel.org, Heiko Carstens , Shile Zhang , Wang ShaoBo , Christophe JAILLET , David Gow , linux-tegra@vger.kernel.org, andriy.shevchenko@linux.intel.com, Hari Bathini , linux-edac@vger.kernel.org, jgross@suse.com, netdev@vger.kernel.org, kernel@gpiccoli.net, kexec@lists.infradead.org, linux-mips@vger.kernel.org, Julius Werner , vkuznets@redhat.com, linuxppc-dev@lists.ozlabs.org Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" On 17/05/2022 10:57, Petr Mladek wrote: > [...] >>>> --- a/drivers/misc/bcm-vk/bcm_vk_dev.c >>>> +++ b/drivers/misc/bcm-vk/bcm_vk_dev.c >>>> @@ -1446,7 +1446,7 @@ static int bcm_vk_probe(struct pci_dev *pdev, const struct pci_device_id *ent) >>> [... snip ...] >>> It seems to reset some hardware or so. IMHO, it should go into the >>> pre-reboot list. >> >> Mixed feelings here, I'm looping Broadcom maintainers to comment. >> (CC Scott and Broadcom list) >> >> I'm afraid it breaks kdump if this device is not reset beforehand - it's >> a doorbell write, so not high risk I think... >> >> But in case the not-reset device can be probed normally in kdump kernel, >> then I'm fine in moving this to the reboot list! I don't have the HW to >> test myself. > > Good question. Well, it if has to be called before kdump then > even "hypervisor" list is a wrong place because is not always > called before kdump. Agreed! I'll defer that to Scott and Broadcom folks to comment. If it's not strictly necessary, I'll happily move it to the reboot list. If necessary, we could use the machine_crash_kexec() approach, but we'll fall into the case arm64 doesn't support it and I'm not sure if this device is available for arm - again a question for the maintainers. > [...] >>>> --- a/drivers/power/reset/ltc2952-poweroff.c >>>> +++ b/drivers/power/reset/ltc2952-poweroff.c >> [...] >> This is setting a variable only, and once it's set (data->kernel_panic >> is the bool's name), it just bails out the IRQ handler and a timer >> setting - this timer seems kinda tricky, so bailing out ASAP makes sense >> IMHO. > > IMHO, the timer informs the hardware that the system is still alive > in the middle of panic(). If the timer is not working then the > hardware (chip) will think that the system frozen in panic() > and will power off the system. See the comments in > drivers/power/reset/ltc2952-poweroff.c: > [.... snip ...] > IMHO, we really have to keep it alive until we reach the reboot stage. > > Another question is how it actually works when the interrupts are > disabled during panic() and the timer callbacks are not handled. Agreed here! Guess I can move this one the reboot list, fine by me. Unless PM folks think otherwise. > [...] >> Disagree here, I'm CCing Florian for information. >> >> This notifier preserves RAM so it's *very interesting* if we have >> kmsg_dump() for example, but maybe might be also relevant in case kdump >> kernel is configured to store something in a persistent RAM (then, >> without this notifier, after kdump reboots the system data would be lost). > > I see. It is actually similar problem as with > drivers/firmware/google/gsmi.c. > > I does similar things like kmsg_dump() so it should be called in > the same location (after info notifier list and before kdump). > > A solution might be to put it at these notifiers at the very > end of the "info" list or make extra "dump" notifier list. Here I still disagree. I've commented in the other response thread (about Google gsmi) about the semantics of the hypervisor list, but again: this list should contain callbacks that (a) Should run early, _by default_ before a kdump; (b) Communicate with the firmware/hypervisor in a "low-risk" way; Imagine a scenario where users configure kdump kernel to save something in a persistent form in DRAM - it'd be like a late pstore, in the next kernel. This callback enables that, it's meant to inform FW "hey, panic happened, please from now on don't clear the RAM in the next FW-reboot". I don't see a reason to postpone that - let's see if the maintainers have an opinion. Cheers, Guilherme From mboxrd@z Thu Jan 1 00:00:00 1970 From: Guilherme G. Piccoli Date: Tue, 17 May 2022 13:42:06 -0300 Subject: [PATCH 19/30] panic: Add the panic hypervisor notifier list In-Reply-To: References: <20220427224924.592546-1-gpiccoli@igalia.com> <20220427224924.592546-20-gpiccoli@igalia.com> Message-ID: List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: kexec@lists.infradead.org On 17/05/2022 10:57, Petr Mladek wrote: > [...] >>>> --- a/drivers/misc/bcm-vk/bcm_vk_dev.c >>>> +++ b/drivers/misc/bcm-vk/bcm_vk_dev.c >>>> @@ -1446,7 +1446,7 @@ static int bcm_vk_probe(struct pci_dev *pdev, const struct pci_device_id *ent) >>> [... snip ...] >>> It seems to reset some hardware or so. IMHO, it should go into the >>> pre-reboot list. >> >> Mixed feelings here, I'm looping Broadcom maintainers to comment. >> (CC Scott and Broadcom list) >> >> I'm afraid it breaks kdump if this device is not reset beforehand - it's >> a doorbell write, so not high risk I think... >> >> But in case the not-reset device can be probed normally in kdump kernel, >> then I'm fine in moving this to the reboot list! I don't have the HW to >> test myself. > > Good question. Well, it if has to be called before kdump then > even "hypervisor" list is a wrong place because is not always > called before kdump. Agreed! I'll defer that to Scott and Broadcom folks to comment. If it's not strictly necessary, I'll happily move it to the reboot list. If necessary, we could use the machine_crash_kexec() approach, but we'll fall into the case arm64 doesn't support it and I'm not sure if this device is available for arm - again a question for the maintainers. > [...] >>>> --- a/drivers/power/reset/ltc2952-poweroff.c >>>> +++ b/drivers/power/reset/ltc2952-poweroff.c >> [...] >> This is setting a variable only, and once it's set (data->kernel_panic >> is the bool's name), it just bails out the IRQ handler and a timer >> setting - this timer seems kinda tricky, so bailing out ASAP makes sense >> IMHO. > > IMHO, the timer informs the hardware that the system is still alive > in the middle of panic(). If the timer is not working then the > hardware (chip) will think that the system frozen in panic() > and will power off the system. See the comments in > drivers/power/reset/ltc2952-poweroff.c: > [.... snip ...] > IMHO, we really have to keep it alive until we reach the reboot stage. > > Another question is how it actually works when the interrupts are > disabled during panic() and the timer callbacks are not handled. Agreed here! Guess I can move this one the reboot list, fine by me. Unless PM folks think otherwise. > [...] >> Disagree here, I'm CCing Florian for information. >> >> This notifier preserves RAM so it's *very interesting* if we have >> kmsg_dump() for example, but maybe might be also relevant in case kdump >> kernel is configured to store something in a persistent RAM (then, >> without this notifier, after kdump reboots the system data would be lost). > > I see. It is actually similar problem as with > drivers/firmware/google/gsmi.c. > > I does similar things like kmsg_dump() so it should be called in > the same location (after info notifier list and before kdump). > > A solution might be to put it at these notifiers at the very > end of the "info" list or make extra "dump" notifier list. Here I still disagree. I've commented in the other response thread (about Google gsmi) about the semantics of the hypervisor list, but again: this list should contain callbacks that (a) Should run early, _by default_ before a kdump; (b) Communicate with the firmware/hypervisor in a "low-risk" way; Imagine a scenario where users configure kdump kernel to save something in a persistent form in DRAM - it'd be like a late pstore, in the next kernel. This callback enables that, it's meant to inform FW "hey, panic happened, please from now on don't clear the RAM in the next FW-reboot". I don't see a reason to postpone that - let's see if the maintainers have an opinion. Cheers, Guilherme From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Message-ID: Date: Tue, 17 May 2022 13:42:06 -0300 MIME-Version: 1.0 Subject: Re: [PATCH 19/30] panic: Add the panic hypervisor notifier list Content-Language: en-US References: <20220427224924.592546-1-gpiccoli@igalia.com> <20220427224924.592546-20-gpiccoli@igalia.com> From: "Guilherme G. Piccoli" In-Reply-To: List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-um" Errors-To: linux-um-bounces+geert=linux-m68k.org@lists.infradead.org To: Petr Mladek , Scott Branden , Sebastian Reichel , Florian Fainelli Cc: David Gow , Evan Green , Julius Werner , bcm-kernel-feedback-list@broadcom.com, linux-pm@vger.kernel.org, akpm@linux-foundation.org, bhe@redhat.com, kexec@lists.infradead.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-edac@vger.kernel.org, linux-hyperv@vger.kernel.org, linux-leds@vger.kernel.org, linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org, linux-remoteproc@vger.kernel.org, linux-s390@vger.kernel.org, linux-tegra@vger.kernel.org, linux-um@lists.infradead.org, linux-xtensa@linux-xtensa.org, netdev@vger.kernel.org, openipmi-developer@lists.sourceforge.net, rcu@vger.kernel.org, sparclinux@vger.kernel.org, xen-devel@lists.xenproject.org, x86@kernel.org, kernel-dev@igalia.com, kernel@gpiccoli.net, halves@canonical.com, fabiomirmar@gmail.com, alejandro.j.jimenez@oracle.com, andriy.shevchenko@linux.intel.com, arnd@arndb.de, bp@alien8.de, corbet@lwn.net, d.hatayama@jp.fujitsu.com, dave.hansen@linux.intel.com, dyoung@redhat.com, feng.tang@intel.com, gregkh@linuxfoundation.org, mikelley@microsoft.com, hidehiro.kawai.ez@hitachi.com, jgross@suse.com, john.ogness@linutronix.de, keescook@chromium.org, luto@kernel.org, mhiramat@kernel.org, mingo@redhat.com, paulmck@kernel.org, peterz@infradead.org, rostedt@goodmis.org, senozhatsky@chromium.org, stern@rowland.harvard.edu, tglx@linutronix.de, vgoyal@redhat.com, vkuznets@redhat.com, will@kernel.org, Alexander Gordeev , Andrea Parri , Ard Biesheuvel , Benjamin Herrenschmidt , Brian Norris , Christian Borntraeger , Christophe JAILLET , "David S. Miller" , Dexuan Cui , Doug Berger , Haiyang Zhang , Hari Bathini , Heiko Carstens , Justin Chen , "K. Y. Srinivasan" , Lee Jones , Markus Mayer , Michael Ellerman , Mihai Carabas , Nicholas Piggin , Paul Mackerras , Pavel Machek , Shile Zhang , Stephen Hemminger , Sven Schnelle , Thomas Bogendoerfer , Tianyu Lan , Vasily Gorbik , Wang ShaoBo , Wei Liu , zhenwei pi On 17/05/2022 10:57, Petr Mladek wrote: > [...] >>>> --- a/drivers/misc/bcm-vk/bcm_vk_dev.c >>>> +++ b/drivers/misc/bcm-vk/bcm_vk_dev.c >>>> @@ -1446,7 +1446,7 @@ static int bcm_vk_probe(struct pci_dev *pdev, const struct pci_device_id *ent) >>> [... snip ...] >>> It seems to reset some hardware or so. IMHO, it should go into the >>> pre-reboot list. >> >> Mixed feelings here, I'm looping Broadcom maintainers to comment. >> (CC Scott and Broadcom list) >> >> I'm afraid it breaks kdump if this device is not reset beforehand - it's >> a doorbell write, so not high risk I think... >> >> But in case the not-reset device can be probed normally in kdump kernel, >> then I'm fine in moving this to the reboot list! I don't have the HW to >> test myself. > > Good question. Well, it if has to be called before kdump then > even "hypervisor" list is a wrong place because is not always > called before kdump. Agreed! I'll defer that to Scott and Broadcom folks to comment. If it's not strictly necessary, I'll happily move it to the reboot list. If necessary, we could use the machine_crash_kexec() approach, but we'll fall into the case arm64 doesn't support it and I'm not sure if this device is available for arm - again a question for the maintainers. > [...] >>>> --- a/drivers/power/reset/ltc2952-poweroff.c >>>> +++ b/drivers/power/reset/ltc2952-poweroff.c >> [...] >> This is setting a variable only, and once it's set (data->kernel_panic >> is the bool's name), it just bails out the IRQ handler and a timer >> setting - this timer seems kinda tricky, so bailing out ASAP makes sense >> IMHO. > > IMHO, the timer informs the hardware that the system is still alive > in the middle of panic(). If the timer is not working then the > hardware (chip) will think that the system frozen in panic() > and will power off the system. See the comments in > drivers/power/reset/ltc2952-poweroff.c: > [.... snip ...] > IMHO, we really have to keep it alive until we reach the reboot stage. > > Another question is how it actually works when the interrupts are > disabled during panic() and the timer callbacks are not handled. Agreed here! Guess I can move this one the reboot list, fine by me. Unless PM folks think otherwise. > [...] >> Disagree here, I'm CCing Florian for information. >> >> This notifier preserves RAM so it's *very interesting* if we have >> kmsg_dump() for example, but maybe might be also relevant in case kdump >> kernel is configured to store something in a persistent RAM (then, >> without this notifier, after kdump reboots the system data would be lost). > > I see. It is actually similar problem as with > drivers/firmware/google/gsmi.c. > > I does similar things like kmsg_dump() so it should be called in > the same location (after info notifier list and before kdump). > > A solution might be to put it at these notifiers at the very > end of the "info" list or make extra "dump" notifier list. Here I still disagree. I've commented in the other response thread (about Google gsmi) about the semantics of the hypervisor list, but again: this list should contain callbacks that (a) Should run early, _by default_ before a kdump; (b) Communicate with the firmware/hypervisor in a "low-risk" way; Imagine a scenario where users configure kdump kernel to save something in a persistent form in DRAM - it'd be like a late pstore, in the next kernel. This callback enables that, it's meant to inform FW "hey, panic happened, please from now on don't clear the RAM in the next FW-reboot". I don't see a reason to postpone that - let's see if the maintainers have an opinion. Cheers, Guilherme _______________________________________________ linux-um mailing list linux-um@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-um From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Guilherme G. Piccoli" Subject: Re: [PATCH 19/30] panic: Add the panic hypervisor notifier list Date: Tue, 17 May 2022 13:42:06 -0300 Message-ID: References: <20220427224924.592546-1-gpiccoli@igalia.com> <20220427224924.592546-20-gpiccoli@igalia.com> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Return-path: DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:From: References:Cc:To:Subject:MIME-Version:Date:Message-ID:Sender:Reply-To: Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender: Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=scD/9Z9PP3JtS/QbR0S82HRrb/qzzKD7mmsgRlJKv48=; b=A0Wed33QPDatmI/wU05BORZ9n1 0hhJk6U25fx6bbbZAsMcq0PkUqYkw6doFXVO3lE2gRlx4mNP2Gg+o2qX5D1iIed/eOGnpU07MiS7v johrL8ZXBr9bRs4m6YspjNjUrp3CRZWOwFXObkbK34aTXOqWVxjbnj+S21xz3SW3iyyzLS8il/lBE YgblTLXs216LyG4XvwA74bc/PHj8Q4o7WtYFZIjzCamKnOfAXbs0CZLburXLM75F/4BH78OE9Feno 1CTSSOT/QCKGLXFRsMSKXpMjqSuNKxrbhetupOiCZ9xDYHRTskdd6CLiaYJY53xEUzZgCM24bzp0B WqYqJCV Content-Language: en-US In-Reply-To: List-ID: Content-Type: text/plain; charset="us-ascii" To: Petr Mladek , Scott Branden , Sebastian Reichel , Florian Fainelli Cc: David Gow , Evan Green , Julius Werner , bcm-kernel-feedback-list@broadcom.com, linux-pm@vger.kernel.org, akpm@linux-foundation.org, bhe@redhat.com, kexec@lists.infradead.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-edac@vger.kernel.org, linux-hyperv@vger.kernel.org, linux-leds@vger.kernel.org, linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org, linux-remoteproc@vger.kernel.org, linux-s390@vger.kernel.org, linux-tegra@vger.kernel.org, linux-um@lists.infradead.org, linux-xtensa@linux-xtensa.org, netdev@vger.kernel.org, openipmi-developer@lists.sourceforge.net, rcu@vger.kernel.org, sparclinux@vger.kernel.org, xen-devel@li On 17/05/2022 10:57, Petr Mladek wrote: > [...] >>>> --- a/drivers/misc/bcm-vk/bcm_vk_dev.c >>>> +++ b/drivers/misc/bcm-vk/bcm_vk_dev.c >>>> @@ -1446,7 +1446,7 @@ static int bcm_vk_probe(struct pci_dev *pdev, const struct pci_device_id *ent) >>> [... snip ...] >>> It seems to reset some hardware or so. IMHO, it should go into the >>> pre-reboot list. >> >> Mixed feelings here, I'm looping Broadcom maintainers to comment. >> (CC Scott and Broadcom list) >> >> I'm afraid it breaks kdump if this device is not reset beforehand - it's >> a doorbell write, so not high risk I think... >> >> But in case the not-reset device can be probed normally in kdump kernel, >> then I'm fine in moving this to the reboot list! I don't have the HW to >> test myself. > > Good question. Well, it if has to be called before kdump then > even "hypervisor" list is a wrong place because is not always > called before kdump. Agreed! I'll defer that to Scott and Broadcom folks to comment. If it's not strictly necessary, I'll happily move it to the reboot list. If necessary, we could use the machine_crash_kexec() approach, but we'll fall into the case arm64 doesn't support it and I'm not sure if this device is available for arm - again a question for the maintainers. > [...] >>>> --- a/drivers/power/reset/ltc2952-poweroff.c >>>> +++ b/drivers/power/reset/ltc2952-poweroff.c >> [...] >> This is setting a variable only, and once it's set (data->kernel_panic >> is the bool's name), it just bails out the IRQ handler and a timer >> setting - this timer seems kinda tricky, so bailing out ASAP makes sense >> IMHO. > > IMHO, the timer informs the hardware that the system is still alive > in the middle of panic(). If the timer is not working then the > hardware (chip) will think that the system frozen in panic() > and will power off the system. See the comments in > drivers/power/reset/ltc2952-poweroff.c: > [.... snip ...] > IMHO, we really have to keep it alive until we reach the reboot stage. > > Another question is how it actually works when the interrupts are > disabled during panic() and the timer callbacks are not handled. Agreed here! Guess I can move this one the reboot list, fine by me. Unless PM folks think otherwise. > [...] >> Disagree here, I'm CCing Florian for information. >> >> This notifier preserves RAM so it's *very interesting* if we have >> kmsg_dump() for example, but maybe might be also relevant in case kdump >> kernel is configured to store something in a persistent RAM (then, >> without this notifier, after kdump reboots the system data would be lost). > > I see. It is actually similar problem as with > drivers/firmware/google/gsmi.c. > > I does similar things like kmsg_dump() so it should be called in > the same location (after info notifier list and before kdump). > > A solution might be to put it at these notifiers at the very > end of the "info" list or make extra "dump" notifier list. Here I still disagree. I've commented in the other response thread (about Google gsmi) about the semantics of the hypervisor list, but again: this list should contain callbacks that (a) Should run early, _by default_ before a kdump; (b) Communicate with the firmware/hypervisor in a "low-risk" way; Imagine a scenario where users configure kdump kernel to save something in a persistent form in DRAM - it'd be like a late pstore, in the next kernel. This callback enables that, it's meant to inform FW "hey, panic happened, please from now on don't clear the RAM in the next FW-reboot". I don't see a reason to postpone that - let's see if the maintainers have an opinion. Cheers, Guilherme