From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from phobos.denx.de (phobos.denx.de [85.214.62.61]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B5E51C433FE for ; Tue, 11 Oct 2022 07:25:56 +0000 (UTC) Received: from h2850616.stratoserver.net (localhost [IPv6:::1]) by phobos.denx.de (Postfix) with ESMTP id 06C6A84C0C; Tue, 11 Oct 2022 09:25:55 +0200 (CEST) Authentication-Results: phobos.denx.de; dmarc=pass (p=none dis=none) header.from=kernel.org Authentication-Results: phobos.denx.de; spf=pass smtp.mailfrom=u-boot-bounces@lists.denx.de Authentication-Results: phobos.denx.de; dkim=pass (2048-bit key; unprotected) header.d=kernel.org header.i=@kernel.org header.b="Mjui/boH"; dkim-atps=neutral Received: by phobos.denx.de (Postfix, from userid 109) id D35A784F06; Tue, 11 Oct 2022 09:25:52 +0200 (CEST) Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by phobos.denx.de (Postfix) with ESMTPS id D7BAF84A20 for ; Tue, 11 Oct 2022 09:25:49 +0200 (CEST) Authentication-Results: phobos.denx.de; dmarc=pass (p=none dis=none) header.from=kernel.org Authentication-Results: phobos.denx.de; spf=pass smtp.mailfrom=pali@kernel.org Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id C9C49B810B2; Tue, 11 Oct 2022 07:25:47 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5A7BAC433D6; Tue, 11 Oct 2022 07:25:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1665473146; bh=9j9ZnWUYzDJNBUQhF0yf0fwBw/OHow8mjxvUFiJaRkk=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=Mjui/boHdioP8PsSOuykJkdlp9dBTDzz+JANqOEncGEDsyGtjEF+iTn1ynR2Nca1a YkuwN9tSw+XumryYYssMtMZFMXDKKU/Oef7/UlvZ6hEsvbJ5svzfaYGC5+czAhkTNI 0cZHfVlCcqeBTVSmnTe5GuMNmilBk5sSjRV8Br7A5Heztb5NClrFdVZ4yyFgmerBpu a3M6g50/lCyC20yvp9idfvb21wK3C/Iei8kXjr454i/iI9jiQv5myNfYxkJ01IzEb6 hMm0eOsApeKeXFVZil68faFMaYo6/Bhvv7tJoXNHjTl9pV0u1UJ/N9pJMrRVnDfkNY lO/jfKLLJV1ew== Received: by pali.im (Postfix) id 365947B3; Tue, 11 Oct 2022 09:25:43 +0200 (CEST) Date: Tue, 11 Oct 2022 09:25:43 +0200 From: Pali =?utf-8?B?Um9ow6Fy?= To: Rasmus Villemoes Cc: Tom Rini , Stefan Roese , u-boot@lists.denx.de Subject: Re: Broken watchdog in u-boot master branch Message-ID: <20221011072543.wp666tzuf7hdeny2@pali> References: <20221009191225.65jwebefhqng3qbi@pali> <20221010135512.GF2020586@bill-the-cat> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: NeoMutt/20180716 X-BeenThere: u-boot@lists.denx.de X-Mailman-Version: 2.1.39 Precedence: list List-Id: U-Boot discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: u-boot-bounces@lists.denx.de Sender: "U-Boot" X-Virus-Scanned: clamav-milter 0.103.6 at phobos.denx.de X-Virus-Status: Clean On Tuesday 11 October 2022 09:18:56 Rasmus Villemoes wrote: > On 10/10/2022 15.55, Tom Rini wrote: > > On Sun, Oct 09, 2022 at 09:12:25PM +0200, Pali Rohár wrote: > > > >> Hello! Watchdog code seems to be broken in u-boot master branch. > >> On Nokia N900 I'm getting following message in qemu: > >> > >> cyclic function rx51_watchdog took too long: 10000us vs 1000us max, disabling > >> > >> Seems that watchdog core code is not prepared for "slower" watchdogs > >> which communicate over slower i2c bus, like it is the case for N900. > >> > >> Disabling slower watchdog is a bad idea as it would result in reboot > >> loop instead of slower - but working code. > > So, a few thoughts. > > First, I assume that that board has a very coarse-grained tick, probably > just 1000Hz. IIRC i2c is running at 100kHz. But u-boot i2c driver contains more udelay() calls and more busy loops. And I was told that there are also some hw erratas. So at the end if some other driver calls u-boot dm function for i2c transfer, it can spend more time than expected... Also there is another issue is that udelay() itself calls watchdog reset function. So if watchdog is on i2c which driver calls udelay() it means that watchdog reset function may be called in infinite loop. For this reason n900 watchdog driver has lock which prevents and eliminate this loop (if somebody calls reset during active lock then reset does nothing). > Otherwise it would be pretty amazing for cpu_time to come > out as 10ms exactly. That's not the board's fault, of course, just an > observation, but it is something we need to bear in mind. If the > resolution is merely 100Hz, so 10ms is simply the granularity, we cannot > really meaningfully compare the cpu_time to anything less than that, > because every once in a while it _will_ happen that we sample "now" just > before the tick, run the function, then sample again just after, and it > may only have taken 17us, yet the diff comes out as 10ms. > > Second, perhaps the threshold should not be a compile-time constant, but > instead a fraction of the requested call frequency (say 1.5%, 1/64). > I.e., if we've registered a function to be called every 10 seconds, we'd > check if its runtime exceeded (10000000 >> 6) us. Preferably per above > that bound is rounded up to a multiple of the timer's granularity (we > can get that, right?) > > Third, perhaps we shouldn't disable it, but just print a (one-time) > warning. Adding a "already-warned" field to struct cyclic_info is > certainly simple enough. > > Rasmus