From: Thomas Gleixner <tglx@linutronix.de>
To: Waiman Long <longman@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, Mike Rapoport <rppt@linux.ibm.com>,
	Kees Cook <keescook@chromium.org>,
	Waiman Long <longman@redhat.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Robert Richter <rrichter@marvell.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH v2] watchdog: Fix possible soft lockup warning at bootup
Date: Thu, 16 Jan 2020 12:44:07 +0100
Message-ID: <87blr3wrqw.fsf@nanos.tec.linutronix.de>
In-Reply-To: <87sgkgw3xq.fsf@nanos.tec.linutronix.de>

Thomas Gleixner <tglx@linutronix.de> writes:

Added ARM64 and ThunderX folks.

> Waiman Long <longman@redhat.com> writes:
>> By adding some instrumentation code, it was found that for cpu 14,
>> watchdog_enable() was called early with a timestamp of 1. That activates
>> the watchdog time checking logic. It was also found that the monotonic
>> time measured during the smp_init() phase runs much slower than the
>> real elapsed time as shown by the below debug printf output:
>>
>>   [    1.138522] run_queues, watchdog_timer_fn: now =  170000000
>>   [   25.519391] run_queues, watchdog_timer_fn: now = 4170000000
>>
>> In this particular case, it took about 24.4s of elapsed time for the
>> clock to advance 4s which is the soft expiration time that is required
>> to trigger the calling of watchdog_timer_fn(). That clock slowdown
>> stopped once the smp_init() call was done and the clock time ran at
>> the same rate as the elapsed time afterward.

And looking at this with a more awake brain, the root cause is pretty
obvious.

sched_clock() advances by 24 seconds, but CLOCK_MONOTONIC, on which the
watchdog timer is based, does not. As the timestamps you printed have 7
trailing zeros (10ms granularity), it's pretty clear that timekeeping is
still jiffies based at this point and HZ is set to 100.
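
The granularity argument can be checked with a few lines of Python. This is
only an illustrative sketch, not kernel code: the two "now" values are taken
from the quoted debug output, while the variable names and the use of a
greatest common divisor are assumptions made for the example. Two samples
can only show that the values are consistent with a 10ms tick; they do not
prove HZ on their own.

```python
from math import gcd

NSEC_PER_SEC = 1_000_000_000

# "now" values from the quoted debug output, in nanoseconds.
samples_ns = [170_000_000, 4_170_000_000]

# Largest step size that evenly divides every sample. If timekeeping is
# tick (jiffies) based, every reading is a multiple of the tick period.
granularity_ns = 0
for sample in samples_ns:
    granularity_ns = gcd(granularity_ns, sample)

# Tick rate implied by that step size (an inference, not a measurement).
implied_hz = NSEC_PER_SEC // granularity_ns

print(f"granularity: {granularity_ns} ns, implied HZ: {implied_hz}")
```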

So while bringing up the non-boot CPUs, the boot CPU loses ~2000 timer
interrupts, i.e. roughly 20 seconds worth of ticks at 10ms each. That
needs to be fixed and not papered over.
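
For reference, the ~2000 figure follows from the quoted numbers. The sketch
below is plain arithmetic, not kernel code. It assumes the printk timestamps
in brackets reflect real elapsed time, that the "now" values are
CLOCK_MONOTONIC readings in nanoseconds, and that HZ is 100 as inferred
above; the variable names are made up for the example.

```python
HZ = 100                      # assumed tick rate, per the analysis above
USEC_PER_SEC = 1_000_000

# printk timestamps from the quoted output, in microseconds.
printk_start_us = 1_138_522
printk_end_us = 25_519_391

# "now" values from the same two lines, converted from ns to us.
mono_start_us = 170_000_000 // 1_000
mono_end_us = 4_170_000_000 // 1_000

elapsed_us = printk_end_us - printk_start_us    # real time that passed
advanced_us = mono_end_us - mono_start_us       # time the tick clock saw
missing_us = elapsed_us - advanced_us           # time that went unaccounted

# Each lost tick accounts for 1/HZ seconds of missing time.
lost_ticks = missing_us * HZ // USEC_PER_SEC

print(f"elapsed {elapsed_us / USEC_PER_SEC:.1f}s, "
      f"advanced {advanced_us / USEC_PER_SEC:.1f}s, "
      f"lost about {lost_ticks} ticks")
```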

Thanks,

        tglx

Thread overview: 11+ messages
2020-01-03 15:10 [PATCH v2] watchdog: Fix possible soft lockup warning at bootup Waiman Long
2020-01-16  2:06 ` Thomas Gleixner
2020-01-16 11:44   ` Thomas Gleixner [this message]
2020-01-16 15:11     ` Robert Richter
2020-01-16 16:57       ` Thomas Gleixner
2020-01-16 17:34         ` Waiman Long
2020-01-16 19:10           ` Thomas Gleixner
2020-01-16 19:13             ` Waiman Long
2020-01-16 18:17       ` [PATCH] watchdog/softlockup: Enforce that timestamp is valid on boot Thomas Gleixner
2020-01-17 10:25         ` [tip: core/core] " tip-bot2 for Thomas Gleixner
2020-01-16 15:34     ` [PATCH v2] watchdog: Fix possible soft lockup warning at bootup Waiman Long
