linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Randy Dunlap <rdunlap@infradead.org>
To: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
	Andi Kleen <andi.kleen@intel.com>,
	Ashok Raj <ashok.raj@intel.com>, Borislav Petkov <bp@suse.de>,
	Tony Luck <tony.luck@intel.com>,
	"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
	x86@kernel.org, sparclinux@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Jacob Pan <jacob.jun.pan@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Don Zickus <dzickus@redhat.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Frederic Weisbecker <frederic@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Babu Moger <babu.moger@oracle.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Philippe Ombredanne <pombredanne@nexb.com>,
	Colin Ian King <colin.king@canonical.com>,
	Byungchul Park <byungchul.park@lge.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	"Luis R. Rodriguez" <mcgrof@kernel.org>,
	Waiman Long <longman@redhat.com>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Christoffer Dall <cdall@linaro.org>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	David Rientjes <rientjes@google.com>,
	iommu@lists.linux-foundation.org
Subject: Re: [RFC PATCH 17/23] watchdog/hardlockup/hpet: Convert the timer's interrupt to NMI
Date: Tue, 19 Jun 2018 17:25:09 -0700	[thread overview]
Message-ID: <c040d4e7-f33b-9c93-6f5a-6bf960943e36@infradead.org> (raw)
In-Reply-To: <20180620001535.GA27531@voyager>

On 06/19/2018 05:15 PM, Ricardo Neri wrote:
> On Sat, Jun 16, 2018 at 03:24:49PM +0200, Thomas Gleixner wrote:
>> On Fri, 15 Jun 2018, Ricardo Neri wrote:
>>> On Fri, Jun 15, 2018 at 11:19:09AM +0200, Thomas Gleixner wrote:
>>>> On Thu, 14 Jun 2018, Ricardo Neri wrote:
>>>>> Alternatively, there could be a counter that skips reading the HPET status
>>>>> register (and the detection of hardlockups) for every X NMIs. This would
>>>>> reduce the overall frequency of HPET register reads.
>>>>
>>>> Great plan. So if the watchdog is the only NMI (because perf is off) then
>>>> you delay the watchdog detection by that count.
>>>
>>> OK. This was a bad idea. Then, is it acceptable to have an read to an HPET
>>> register per NMI just to check in the status register if the HPET timer
>>> caused the NMI?
>>
>> The status register is useless in case of MSI. MSI is edge triggered ....
>>
>> The only register which gives you proper information is the counter
>> register itself. That adds an massive overhead to each NMI, because the
>> counter register access is synchronized to the HPET clock with hardware
>> magic. Plus on larger systems, the HPET access is cross node and even
>> slower.
> 
> It starts to sound that the HPET is too slow to drive the hardlockup detector.
> 
> Would it be possible to envision a variant of this implementation? In this
> variant, the HPET only targets a single CPU. The actual hardlockup detector
> is implemented by this single CPU sending interprocessor interrupts to the
> rest of the CPUs.
> 
> In this manner only one CPU has to deal with the slowness of the HPET; the
> rest of the CPUs don't have to read or write any HPET registers. A sysfs
> entry could be added to configure which CPU will have to deal with the HPET
> timer. However, profiling could not be done accurately on such CPU.

Please forgive my simple question:

What happens when this one CPU is the one that locks up?

thnx,
-- 
~Randy

  reply	other threads:[~2018-06-20  0:26 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-13  0:57 [RFC PATCH 00/23] Implement an HPET-based hardlockup detector Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 01/23] x86/apic: Add a parameter for the APIC delivery mode Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 02/23] genirq: Introduce IRQD_DELIVER_AS_NMI Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI Ricardo Neri
2018-06-13  8:34   ` Peter Zijlstra
2018-06-13  8:59     ` Julien Thierry
2018-06-13  9:20       ` Thomas Gleixner
2018-06-13  9:36         ` Julien Thierry
2018-06-13  9:49           ` Julien Thierry
2018-06-13  9:57           ` Thomas Gleixner
2018-06-13 10:25             ` Julien Thierry
2018-06-13 10:06         ` Marc Zyngier
2018-06-15  2:12           ` Ricardo Neri
2018-06-15  8:01             ` Julien Thierry
2018-06-16  0:39               ` Ricardo Neri
2018-06-16 13:36                 ` Thomas Gleixner
2018-06-13  0:57 ` [RFC PATCH 04/23] iommu/vt-d/irq_remapping: Add support for IRQCHIP_CAN_DELIVER_AS_NMI Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 05/23] x86/msi: " Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 06/23] x86/ioapic: Add support for IRQCHIP_CAN_DELIVER_AS_NMI with interrupt remapping Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 07/23] x86/hpet: Expose more functions to read and write registers Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 08/23] x86/hpet: Calculate ticks-per-second in a separate function Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 09/23] x86/hpet: Reserve timer for the HPET hardlockup detector Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 10/23] x86/hpet: Relocate flag definitions to a header file Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 11/23] x86/hpet: Configure the timer used by the hardlockup detector Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 12/23] kernel/watchdog: Introduce a struct for NMI watchdog operations Ricardo Neri
2018-06-13  7:41   ` Nicholas Piggin
2018-06-13  8:42     ` Peter Zijlstra
2018-06-13  9:26       ` Thomas Gleixner
2018-06-13 11:52         ` Nicholas Piggin
2018-06-14  1:31           ` Ricardo Neri
2018-06-14  2:32             ` Nicholas Piggin
2018-06-14  8:32               ` Thomas Gleixner
2018-06-15  2:21               ` Ricardo Neri
2018-06-14  1:26       ` Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 13/23] watchdog/hardlockup: Define a generic function to detect hardlockups Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 14/23] watchdog/hardlockup: Decouple the hardlockup detector from perf Ricardo Neri
2018-06-13  8:43   ` Peter Zijlstra
2018-06-14  1:19     ` Ricardo Neri
2018-06-14  1:41       ` Nicholas Piggin
2018-06-15  2:23         ` Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 15/23] kernel/watchdog: Add a function to obtain the watchdog_allowed_mask Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 16/23] watchdog/hardlockup: Add an HPET-based hardlockup detector Ricardo Neri
2018-06-13  5:23   ` Randy Dunlap
2018-06-14  1:00     ` Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 17/23] watchdog/hardlockup/hpet: Convert the timer's interrupt to NMI Ricardo Neri
2018-06-13  9:07   ` Peter Zijlstra
2018-06-15  2:07     ` Ricardo Neri
2018-06-13  9:40   ` Thomas Gleixner
2018-06-15  2:03     ` Ricardo Neri
2018-06-15  9:19       ` Thomas Gleixner
2018-06-16  0:51         ` Ricardo Neri
2018-06-16 13:24           ` Thomas Gleixner
2018-06-20  0:15             ` Ricardo Neri
2018-06-20  0:25               ` Randy Dunlap [this message]
2018-06-21  0:25                 ` Ricardo Neri
2018-06-20  7:47               ` Thomas Gleixner
2018-06-13  0:57 ` [RFC PATCH 18/23] watchdog/hardlockup/hpet: Add the NMI watchdog operations Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 19/23] watchdog/hardlockup: Make arch_touch_nmi_watchdog() to hpet-based implementation Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 20/23] watchdog/hardlockup/hpet: Rotate interrupt among all monitored CPUs Ricardo Neri
2018-06-13  9:48   ` Thomas Gleixner
2018-06-15  2:16     ` Ricardo Neri
2018-06-15 10:29       ` Thomas Gleixner
2018-06-16  0:46         ` Ricardo Neri
2018-06-16 13:27           ` Thomas Gleixner
2018-06-13  0:57 ` [RFC PATCH 21/23] watchdog/hardlockup/hpet: Adjust timer expiration on the number of " Ricardo Neri
2018-06-13  0:57 ` [RFC PATCH 22/23] watchdog/hardlockup/hpet: Only enable the HPET watchdog via a boot parameter Ricardo Neri
2018-06-13  5:26   ` Randy Dunlap
2018-06-14  0:58     ` Ricardo Neri
2018-06-14  3:30       ` Randy Dunlap
2018-06-13  0:57 ` [RFC PATCH 23/23] watchdog/hardlockup: Activate the HPET-based lockup detector Ricardo Neri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c040d4e7-f33b-9c93-6f5a-6bf960943e36@infradead.org \
    --to=rdunlap@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=andi.kleen@intel.com \
    --cc=ashok.raj@intel.com \
    --cc=ast@kernel.org \
    --cc=babu.moger@oracle.com \
    --cc=bp@suse.de \
    --cc=byungchul.park@lge.com \
    --cc=cdall@linaro.org \
    --cc=colin.king@canonical.com \
    --cc=dave@stgolabs.net \
    --cc=dzickus@redhat.com \
    --cc=frederic@kernel.org \
    --cc=hpa@zytor.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@intel.com \
    --cc=jpoimboe@redhat.com \
    --cc=kai.heng.feng@canonical.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=longman@redhat.com \
    --cc=marc.zyngier@arm.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mcgrof@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=pombredanne@nexb.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=ravi.v.shankar@intel.com \
    --cc=ricardo.neri-calderon@linux.intel.com \
    --cc=rientjes@google.com \
    --cc=sparclinux@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).