Message-ID: <6c03d748-358f-7638-0003-68328a16973e@denx.de>
Date: Mon, 17 Oct 2022 08:52:59 +0200
Subject: Re: Broken watchdog in u-boot master branch
To: Rasmus Villemoes, Tom Rini, Pali Rohár
Cc: u-boot@lists.denx.de
References: <20221009191225.65jwebefhqng3qbi@pali> <20221010135512.GF2020586@bill-the-cat>
From: Stefan Roese

On 11.10.22 09:18, Rasmus Villemoes wrote:
> On 10/10/2022 15.55, Tom Rini wrote:
>> On Sun, Oct 09, 2022 at 09:12:25PM +0200, Pali Rohár wrote:
>>
>>> Hello! Watchdog code seems to be broken in u-boot master branch.
>>> On Nokia N900 I'm getting following message in qemu:
>>>
>>> cyclic function rx51_watchdog took too long: 10000us vs 1000us max, disabling
>>>
>>> Seems that watchdog core code is not prepared for "slower" watchdogs
>>> which communicate over slower i2c bus, like it is the case for N900.
>>>
>>> Disabling slower watchdog is a bad idea as it would result in reboot
>>> loop instead of slower - but working code.
>
> So, a few thoughts.
>
> First, I assume that that board has a very coarse-grained tick, probably
> just 100Hz. Otherwise it would be pretty amazing for cpu_time to come
> out as 10ms exactly.
> That's not the board's fault, of course, just an
> observation, but it is something we need to bear in mind. If the
> resolution is merely 100Hz, so 10ms is simply the granularity, we cannot
> really meaningfully compare the cpu_time to anything less than that,
> because every once in a while it _will_ happen that we sample "now" just
> before the tick, run the function, then sample again just after, and it
> may only have taken 17us, yet the diff comes out as 10ms.
>
> Second, perhaps the threshold should not be a compile-time constant, but
> instead a fraction of the requested call frequency (say 1.5%, 1/64).
> I.e., if we've registered a function to be called every 10 seconds, we'd
> check if its runtime exceeded (10000000 >> 6) us.

In general I think this is a good idea. Now that we have multiple users
of the cyclic interface on one board, it shows that this threshold might
better be defined per cyclic function. Or ...

> Preferably per above
> that bound is rounded up to a multiple of the timer's granularity (we
> can get that, right?)
>
> Third, perhaps we shouldn't disable it, but just print a (one-time)
> warning. Adding an "already-warned" field to struct cyclic_info is
> certainly simple enough.

... just print this warning once. As it's pretty simple, I'll send a
patch implementing this change later today. We can work on other
improvements later, once we've fixed this watchdog breakage.

Thanks,
Stefan
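
To make the granularity point from the quoted text concrete, here is a minimal model of a timer that only advances once per tick. It is a sketch for illustration only: coarse_time_us() and measured_runtime_us() are hypothetical names, not U-Boot functions, and the real timer on the N900 may behave differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a timer whose value only changes once per tick. */
static uint64_t coarse_time_us(uint64_t real_us, uint64_t tick_us)
{
	assert(tick_us > 0);
	return real_us - (real_us % tick_us);
}

/*
 * Runtime as it would be measured by sampling the coarse timer before
 * and after a function that really ran for runtime_us.
 */
static uint64_t measured_runtime_us(uint64_t start_us, uint64_t runtime_us,
				    uint64_t tick_us)
{
	return coarse_time_us(start_us + runtime_us, tick_us) -
	       coarse_time_us(start_us, tick_us);
}
```

With a 10000us tick, a 17us function that starts at 9990us crosses one tick boundary, so the measured difference is 10000us. The same function started at 100us measures as 0us.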
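
The second suggestion (a threshold of 1/64 of the period, rounded up to the timer granularity) could look roughly like the following. This is only a sketch under assumptions: cyclic_threshold_us() is a hypothetical helper, and how the granularity would actually be obtained is left open, as in the quoted question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the existing cyclic code. */
static uint64_t cyclic_threshold_us(uint64_t period_us, uint64_t granularity_us)
{
	uint64_t limit;

	assert(granularity_us > 0);

	/* 1/64 of the requested period, i.e. roughly 1.5% */
	limit = period_us >> 6;

	/* Anything below one tick cannot be measured meaningfully */
	if (limit < granularity_us)
		limit = granularity_us;

	/* Round up to a whole number of ticks */
	return (limit + granularity_us - 1) / granularity_us * granularity_us;
}
```

For a function registered to run every 10 seconds on a timer with 10ms granularity, this gives 10000000 >> 6 = 156250us, rounded up to 160000us. For a 1ms period on the same timer the limit is clamped to one tick, 10000us.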
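
The warn-once idea might be sketched as below. Note that struct cyclic_sketch is a hypothetical stand-in and not the real struct cyclic_info; the field and function names are assumptions for illustration, and this is not the patch mentioned above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct cyclic_info; names are assumptions. */
struct cyclic_sketch {
	const char *name;
	bool already_warned;
};

/*
 * Warn the first time the function overruns its limit, but keep it
 * registered. Returns true only when a warning was printed.
 */
static bool cyclic_check_runtime(struct cyclic_sketch *c, uint64_t cpu_time_us,
				 uint64_t limit_us)
{
	assert(c != NULL);

	if (cpu_time_us <= limit_us || c->already_warned)
		return false;

	printf("cyclic function %s took too long: %" PRIu64 "us vs %" PRIu64 "us max\n",
	       c->name, cpu_time_us, limit_us);
	c->already_warned = true;
	return true;
}
```

The point of the design is that a slow watchdog keeps being serviced after the first overrun, so the board does not end up in a reboot loop.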