linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
To: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
	Andi Kleen <andi.kleen@intel.com>,
	Ashok Raj <ashok.raj@intel.com>, Borislav Petkov <bp@suse.de>,
	Tony Luck <tony.luck@intel.com>,
	"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
	x86@kernel.org, sparclinux@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Jacob Pan <jacob.jun.pan@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Don Zickus <dzickus@redhat.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Frederic Weisbecker <frederic@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Babu Moger <babu.moger@oracle.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Philippe Ombredanne <pombredanne@nexb.com>,
	Colin Ian King <colin.king@canonical.com>,
	Byungchul Park <byungchul.park@lge.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	"Luis R. Rodriguez" <mcgrof@kernel.org>,
	Waiman Long <longman@redhat.com>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Christoffer Dall <cdall@linaro.org>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	David Rientjes <rientjes@google.com>,
	iommu@lists.linux-foundation.org
Subject: Re: [RFC PATCH 17/23] watchdog/hardlockup/hpet: Convert the timer's interrupt to NMI
Date: Wed, 20 Jun 2018 17:25:18 -0700	[thread overview]
Message-ID: <20180621002518.GA16266@voyager> (raw)
In-Reply-To: <c040d4e7-f33b-9c93-6f5a-6bf960943e36@infradead.org>

On Tue, Jun 19, 2018 at 05:25:09PM -0700, Randy Dunlap wrote:
> On 06/19/2018 05:15 PM, Ricardo Neri wrote:
> > On Sat, Jun 16, 2018 at 03:24:49PM +0200, Thomas Gleixner wrote:
> >> On Fri, 15 Jun 2018, Ricardo Neri wrote:
> >>> On Fri, Jun 15, 2018 at 11:19:09AM +0200, Thomas Gleixner wrote:
> >>>> On Thu, 14 Jun 2018, Ricardo Neri wrote:
> >>>>> Alternatively, there could be a counter that skips reading the HPET status
> >>>>> register (and the detection of hardlockups) for every X NMIs. This would
> >>>>> reduce the overall frequency of HPET register reads.
> >>>>
> >>>> Great plan. So if the watchdog is the only NMI (because perf is off) then
> >>>> you delay the watchdog detection by that count.
> >>>
> >>> OK. This was a bad idea. Then, is it acceptable to have an read to an HPET
> >>> register per NMI just to check in the status register if the HPET timer
> >>> caused the NMI?
> >>
> >> The status register is useless in case of MSI. MSI is edge triggered ....
> >>
> >> The only register which gives you proper information is the counter
> >> register itself. That adds an massive overhead to each NMI, because the
> >> counter register access is synchronized to the HPET clock with hardware
> >> magic. Plus on larger systems, the HPET access is cross node and even
> >> slower.
> > 
> > It starts to sound that the HPET is too slow to drive the hardlockup detector.
> > 
> > Would it be possible to envision a variant of this implementation? In this
> > variant, the HPET only targets a single CPU. The actual hardlockup detector
> > is implemented by this single CPU sending interprocessor interrupts to the
> > rest of the CPUs.
> > 
> > In this manner only one CPU has to deal with the slowness of the HPET; the
> > rest of the CPUs don't have to read or write any HPET registers. A sysfs
> > entry could be added to configure which CPU will have to deal with the HPET
> > timer. However, profiling could not be done accurately on such CPU.
> 
> Please forgive my simple question:
> 
> What happens when this one CPU is the one that locks up?

I think that in this particular case this one CPU would check for hardlockups
on itself when it receives the NMI from the HPET timer. It would also issue
NMIs to the other monitored processors.

Thanks and BR,
Ricardo

  reply	other threads:[~2018-06-21  0:29 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-13  0:57 [RFC PATCH 00/23] Implement an HPET-based hardlockup detector Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 01/23] x86/apic: Add a parameter for the APIC delivery mode Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 02/23] genirq: Introduce IRQD_DELIVER_AS_NMI Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI Ricardo Neri
2018-06-13  8:34   ` Peter Zijlstra
2018-06-13  8:59     ` Julien Thierry
2018-06-13  9:20       ` Thomas Gleixner
2018-06-13  9:36         ` Julien Thierry
2018-06-13  9:49           ` Julien Thierry
2018-06-13  9:57           ` Thomas Gleixner
2018-06-13 10:25             ` Julien Thierry
2018-06-13 10:06         ` Marc Zyngier
2018-06-15  2:12           ` Ricardo Neri
2018-06-15  8:01             ` Julien Thierry
2018-06-16  0:39               ` Ricardo Neri
2018-06-16 13:36                 ` Thomas Gleixner
2018-06-13  0:57 ` [RFC PATCH 04/23] iommu/vt-d/irq_remapping: Add support for IRQCHIP_CAN_DELIVER_AS_NMI Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 05/23] x86/msi: " Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 06/23] x86/ioapic: Add support for IRQCHIP_CAN_DELIVER_AS_NMI with interrupt remapping Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 07/23] x86/hpet: Expose more functions to read and write registers Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 08/23] x86/hpet: Calculate ticks-per-second in a separate function Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 09/23] x86/hpet: Reserve timer for the HPET hardlockup detector Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 10/23] x86/hpet: Relocate flag definitions to a header file Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 11/23] x86/hpet: Configure the timer used by the hardlockup detector Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 12/23] kernel/watchdog: Introduce a struct for NMI watchdog operations Ricardo Neri
2018-06-13  7:41   ` Nicholas Piggin
2018-06-13  8:42     ` Peter Zijlstra
2018-06-13  9:26       ` Thomas Gleixner
2018-06-13 11:52         ` Nicholas Piggin
2018-06-14  1:31           ` Ricardo Neri
2018-06-14  2:32             ` Nicholas Piggin
2018-06-14  8:32               ` Thomas Gleixner
2018-06-15  2:21               ` Ricardo Neri
2018-06-14  1:26       ` Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 13/23] watchdog/hardlockup: Define a generic function to detect hardlockups Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 14/23] watchdog/hardlockup: Decouple the hardlockup detector from perf Ricardo Neri
2018-06-13  8:43   ` Peter Zijlstra
2018-06-14  1:19     ` Ricardo Neri
2018-06-14  1:41       ` Nicholas Piggin
2018-06-15  2:23         ` Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 15/23] kernel/watchdog: Add a function to obtain the watchdog_allowed_mask Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 16/23] watchdog/hardlockup: Add an HPET-based hardlockup detector Ricardo Neri
2018-06-13  5:23   ` Randy Dunlap
2018-06-14  1:00     ` Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 17/23] watchdog/hardlockup/hpet: Convert the timer's interrupt to NMI Ricardo Neri
2018-06-13  9:07   ` Peter Zijlstra
2018-06-15  2:07     ` Ricardo Neri
2018-06-13  9:40   ` Thomas Gleixner
2018-06-15  2:03     ` Ricardo Neri
2018-06-15  9:19       ` Thomas Gleixner
2018-06-16  0:51         ` Ricardo Neri
2018-06-16 13:24           ` Thomas Gleixner
2018-06-20  0:15             ` Ricardo Neri
2018-06-20  0:25               ` Randy Dunlap
2018-06-21  0:25                 ` Ricardo Neri [this message]
2018-06-20  7:47               ` Thomas Gleixner
2018-06-13  0:57 ` [RFC PATCH 18/23] watchdog/hardlockup/hpet: Add the NMI watchdog operations Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 19/23] watchdog/hardlockup: Make arch_touch_nmi_watchdog() to hpet-based implementation Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 20/23] watchdog/hardlockup/hpet: Rotate interrupt among all monitored CPUs Ricardo Neri
2018-06-13  9:48   ` Thomas Gleixner
2018-06-15  2:16     ` Ricardo Neri
2018-06-15 10:29       ` Thomas Gleixner
2018-06-16  0:46         ` Ricardo Neri
2018-06-16 13:27           ` Thomas Gleixner
2018-06-13  0:57 ` [RFC PATCH 21/23] watchdog/hardlockup/hpet: Adjust timer expiration on the number of " Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 22/23] watchdog/hardlockup/hpet: Only enable the HPET watchdog via a boot parameter Ricardo Neri
2018-06-13  5:26   ` Randy Dunlap
2018-06-14  0:58     ` Ricardo Neri
2018-06-14  3:30       ` Randy Dunlap
2018-06-13  0:57 ` [RFC PATCH 23/23] watchdog/hardlockup: Activate the HPET-based lockup detector Ricardo Neri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180621002518.GA16266@voyager \
    --to=ricardo.neri-calderon@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi.kleen@intel.com \
    --cc=ashok.raj@intel.com \
    --cc=ast@kernel.org \
    --cc=babu.moger@oracle.com \
    --cc=bp@suse.de \
    --cc=byungchul.park@lge.com \
    --cc=cdall@linaro.org \
    --cc=colin.king@canonical.com \
    --cc=dave@stgolabs.net \
    --cc=dzickus@redhat.com \
    --cc=frederic@kernel.org \
    --cc=hpa@zytor.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@intel.com \
    --cc=jpoimboe@redhat.com \
    --cc=kai.heng.feng@canonical.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=longman@redhat.com \
    --cc=marc.zyngier@arm.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mcgrof@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=pombredanne@nexb.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=ravi.v.shankar@intel.com \
    --cc=rdunlap@infradead.org \
    --cc=rientjes@google.com \
    --cc=sparclinux@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).