From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1759423AbbKSXuL (ORCPT ); Thu, 19 Nov 2015 18:50:11 -0500
Received: from mail-io0-f178.google.com ([209.85.223.178]:34799 "EHLO
	mail-io0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754690AbbKSXuH (ORCPT ); Thu, 19 Nov 2015 18:50:07 -0500
From: Al Stone
Subject: Re: [Linaro-acpi] [PATCH v8 5/5] Watchdog: introduce ARM SBSA watchdog driver
To: Timur Tabi , Guenter Roeck , Fu Wei
References: <1445961999-9506-1-git-send-email-fu.wei@linaro.org>
	<1445961999-9506-6-git-send-email-fu.wei@linaro.org>
	<563AE588.1080009@roeck-us.net>
	<563B5DF9.6080102@codeaurora.org>
	<563B62F7.3050307@codeaurora.org>
	<563B6A4B.7090400@codeaurora.org>
	<563B869F.2010004@roeck-us.net>
	<56452970.4070209@linaro.org>
	<56452D98.80704@codeaurora.org>
Cc: Pratyush Anand , devicetree@vger.kernel.org,
	linux-watchdog@vger.kernel.org, Arnd Bergmann ,
	linux-doc@vger.kernel.org, Jon Masters ,
	Linaro ACPI Mailman List , "Rafael J. Wysocki" , lkml ,
	Will Deacon , Wim Van Sebroeck , Rob Herring ,
	Catalin Marinas , Wei Fu , Jonathan Corbet ,
	Dave Young , Vipul Gandhi
X-Enigmail-Draft-Status: N1110
Message-ID: <564E602B.6060606@linaro.org>
Date: Thu, 19 Nov 2015 16:50:03 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
MIME-Version: 1.0
In-Reply-To: <56452D98.80704@codeaurora.org>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Sorry for the delayed response...I've got some difficult family things
to work on IRL that are taking priority...

On 11/12/2015 05:23 PM, Timur Tabi wrote:
> On 11/12/2015 06:06 PM, Al Stone wrote:
>> If it is a NAK, that's fine, but I also want to be sure I understand what the
>> objections are.
>> Based on my understanding of the discussion so far over the
>> multiple versions, I think the primary objection is that the use of pretimeout
>> makes this driver too complex, and indeed complex enough that there is some
>> concern that it could destabilize a running system. Do I have that right?
>
> I don't have a problem with the concept of pre-timeout per se. My primary
> objection is this code:
>
>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>> +{
>> +	struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>> +	struct watchdog_device *wdd = &gwdt->wdd;
>> +
>> +	/* We don't use pretimeout, trigger WS1 now */
>> +	if (!wdd->pretimeout)
>> +		sbsa_gwdt_set_wcv(wdd, 0);
>
> This driver depends on an interrupt handler in order to properly program the
> hardware. Unlike some other devices, the SBSA watchdog does not need assistance
> to reset on a timeout -- it is a "fire and forget" device. What happens if
> there is a hard lockup, and interrupts no longer work?

Aha. I see now. That helps clarify a lot. Thanks.

> The reason why Fu does this is because he wants to support a pre-timeout value
> that's independent of the timeout value. The SBSA watchdog is normally
> programmed where real timeout equals twice the pre-timeout. I would prefer that
> the driver adhere to this limitation. That would eliminate the need to
> pre-program the hardware in the interrupt handler.

The "normally programmed" limitation described is interesting; forgive my
ignorance, but where is that specified? I couldn't find anything that
specific in the SBSA, or the ARM ARM, but I could have missed it.

That being said, keeping them independent at least seems like a good idea;
if I think about kdump/kexec or some other recovery mechanism wanting to
perhaps copy part of RAM or flush a filesystem/database, or maybe do some
other magic to recover enough to be able to reset the timer, that may be a
really long interval on a large server.
I could easily see that being very different from a watchdog timer that's
meant to just make sure the platform is still making progress. Conversely,
I could see that recovery interval being very small or zero on a guest OS,
for example, and the watchdog still different.

>> And finally, a simpler, single stage timeout watchdog driver would be a
>> reasonable thing to accept, yes? I can see where that would make sense.
>
> I would be okay with merging such a driver, and then enhancing it later to add
> pre-timeout support.
>
>> The issue for me in that case is that the SBSA requires a two stage timeout,
>> so a single stage driver has no real value for me.
>
> There are plenty of existing watchdog devices that have a two-stage timeout but
> the driver treats it as a single stage. The PowerPC watchdog driver is like
> that. The hardware is programmed for the second stage to cause a hardware
> reset, and the interrupt handler is typically a no-op or just a printk().
>

Hrm. Thanks for the pointer. I _think_ I see a way to do that with arm64,
and perhaps combine this driver's functionality with what Timur did
originally, but still have it reasonably straightforward. I need to do the
experiments, though, and see if it actually works first.

-- 
ciao,
al
-----------------------------------
Al Stone
Software Engineer
Linaro Enterprise Group
al.stone@linaro.org
-----------------------------------
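
A rough illustration of the arithmetic behind the single-stage approach
discussed above, assuming the behavior Timur describes: the reset stage
(WS1) fires at twice the interval of the first stage (WS0), so a driver can
program one offset and never touch the hardware from the interrupt handler.
This is a userspace sketch, not code from the patch. The helper names, the
32-bit width assumed for the offset register, and the clock rate in the note
below are all hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: offset value (in counter ticks) to program so that
 * the second stage (WS1, reset) fires timeout_s seconds after the last
 * refresh. Each stage lasts one offset period, so the offset is half the
 * total timeout. Assumes a 32-bit offset register and saturates at its
 * maximum rather than wrapping.
 */
static uint32_t sbsa_wor_for_timeout(uint32_t clk_hz, uint32_t timeout_s)
{
	uint64_t ticks = (uint64_t)clk_hz * timeout_s / 2;

	return ticks > UINT32_MAX ? UINT32_MAX : (uint32_t)ticks;
}

/*
 * Hypothetical helper: longest whole-second timeout that fits in the
 * assumed 32-bit offset register (two offset periods). clk_hz must be
 * non-zero.
 */
static uint32_t sbsa_max_timeout_s(uint32_t clk_hz)
{
	return (uint32_t)((uint64_t)UINT32_MAX * 2 / clk_hz);
}

/*
 * Under this scheme the pretimeout is not independently configurable:
 * the first stage (WS0) always lands at the halfway point.
 */
static uint32_t sbsa_implied_pretimeout_s(uint32_t timeout_s)
{
	return timeout_s / 2;
}
```

With a hypothetical 100 MHz counter, a 10 second timeout would program an
offset of 500,000,000 ticks with WS0 implied at 5 seconds, and the longest
representable timeout would be 85 seconds. The cost is the one raised in the
thread: a long, independent recovery interval for something like kdump
cannot be expressed this way.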