From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
x86@kernel.org, Andi Kleen <ak@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Lu Baolu <baolu.lu@linux.intel.com>,
David Woodhouse <dwmw2@infradead.org>,
Stephane Eranian <eranian@google.com>,
iommu@lists.linux-foundation.org, Joerg Roedel <joro@8bytes.org>,
linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
Ricardo Neri <ricardo.neri@intel.com>,
Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>,
Tony Luck <tony.luck@intel.com>
Subject: Re: [PATCH v6 28/29] x86/tsc: Restart NMI watchdog after refining tsc_khz
Date: Tue, 17 May 2022 15:08:10 -0700
Message-ID: <20220517220810.GB6711@ranerica-svr.sc.intel.com>
In-Reply-To: <1652180070.1r874kr0tg.astroid@bobo.none>

On Tue, May 10, 2022 at 09:16:21PM +1000, Nicholas Piggin wrote:
> Excerpts from Ricardo Neri's message of May 6, 2022 10:00 am:
> > The HPET hardlockup detector relies on tsc_khz to estimate the value
> > that the TSC will have when its HPET channel fires. A refined tsc_khz
> > helps to better estimate the expected TSC value.
> >
> > Using the early value of tsc_khz may lead to a large error in the expected
> > TSC value. Restarting the NMI watchdog detector has the effect of kicking
> > its HPET channel and making use of the refined tsc_khz.
> >
> > When the HPET hardlockup is not in use, restarting the NMI watchdog is
> > a noop.
> >
> > Cc: Andi Kleen <ak@linux.intel.com>
> > Cc: Stephane Eranian <eranian@google.com>
> > Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
> > Cc: iommu@lists.linux-foundation.org
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Cc: x86@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > ---
> > Changes since v5:
> > * Introduced this patch
> >
> > Changes since v4
> > * N/A
> >
> > Changes since v3
> > * N/A
> >
> > Changes since v2:
> > * N/A
> >
> > Changes since v1:
> > * N/A
> > ---
> > arch/x86/kernel/tsc.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > index cafacb2e58cc..cc1843044d88 100644
> > --- a/arch/x86/kernel/tsc.c
> > +++ b/arch/x86/kernel/tsc.c
> > @@ -1386,6 +1386,12 @@ static void tsc_refine_calibration_work(struct work_struct *work)
> >  	/* Inform the TSC deadline clockevent devices about the recalibration */
> >  	lapic_update_tsc_freq();
> >
> > +	/*
> > +	 * If in use, the HPET hardlockup detector relies on tsc_khz.
> > +	 * Reconfigure it to make use of the refined tsc_khz.
> > +	 */
> > +	lockup_detector_reconfigure();
>
> I don't know if the API is conceptually good.
>
> You change something that the lockup detector is currently using,
> *while* the detector is running asynchronously, and then reconfigure
> it.

Yes, this is what happens.

> What happens in the window? If this code is only used for small
> adjustments maybe it does not really matter

Please see my comment

> but in principle it's a bad API to export.
>
> lockup_detector_reconfigure as an internal API is okay because it
> reconfigures things while the watchdog is stopped

I see.

> [actually that looks untrue for soft dog which uses watchdog_thresh in
> is_softlockup(), but that should be fixed].

Perhaps there should be a watchdog_thresh_user. When the user updates it,
the detector is stopped, watchdog_thresh is updated, and then the detector
is resumed.
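
Roughly, and only as a user-space model of the idea (struct watchdog_model
and the model_* helpers are made-up names for illustration, not existing
kernel code), the update could be ordered like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: not the kernel's data structures. */
struct watchdog_model {
	bool running;
	int watchdog_thresh;		/* value the detector reads */
	int watchdog_thresh_user;	/* value the user writes */
};

static void model_stop(struct watchdog_model *w)
{
	w->running = false;
}

static void model_start(struct watchdog_model *w)
{
	w->running = true;
}

/* Copy the user's value into the live threshold only while stopped. */
static void model_apply_user_thresh(struct watchdog_model *w)
{
	bool was_running = w->running;

	if (was_running)
		model_stop(w);

	/* The detector must not be running while its threshold changes. */
	assert(!w->running);
	w->watchdog_thresh = w->watchdog_thresh_user;

	if (was_running)
		model_start(w);
}
```

The only point of the sketch is the ordering: the value the detector reads
changes only while the detector is stopped.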
>
> You're the arch so you're allowed to stop the watchdog and configure
> it, e.g., hardlockup_detector_perf_stop() is called in arch/.

I had it like this, but it did not look right to me. You are right, however:
I can stop/restart the watchdog when needed.
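
As a sketch of that stop/restart ordering, here is another user-space model.
The names model_tsc_khz, struct hpet_wd_model, MODEL_PERIOD_MS and the wd_*
helpers are hypothetical, and it assumes the detector computes its expected
TSC delta from tsc_khz at the moment its HPET channel is kicked:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PERIOD_MS 1000u	/* hypothetical watchdog period */

static uint32_t model_tsc_khz;	/* stand-in for the kernel's tsc_khz */

struct hpet_wd_model {
	bool running;
	uint64_t expected_tsc_delta;	/* TSC ticks expected per period */
};

/* Kicking the channel samples the current frequency (kHz == ticks per ms). */
static void wd_start(struct hpet_wd_model *wd)
{
	wd->expected_tsc_delta = (uint64_t)model_tsc_khz * MODEL_PERIOD_MS;
	wd->running = true;
}

static void wd_stop(struct hpet_wd_model *wd)
{
	wd->running = false;
}

/* Stop the detector, update the frequency, then restart if it was in use. */
static void refine_tsc_khz(struct hpet_wd_model *wd, uint32_t refined_khz)
{
	bool was_running = wd->running;

	if (was_running)
		wd_stop(wd);

	/* No window in which a running detector sees the frequency change. */
	assert(!wd->running);
	model_tsc_khz = refined_khz;

	if (was_running)
		wd_start(wd);
}
```

If the detector is not in use, refine_tsc_khz() only updates the frequency,
which matches the noop case described in the changelog.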
Thanks and BR,
Ricardo