From: Sumit Garg <sumit.garg@linaro.org>
To: Will Deacon <will@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	 Jian-Lin Chen <lecopzer.chen@mediatek.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Douglas Anderson <dianders@chromium.org>,
	Stephen Boyd <swboyd@chromium.org>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v5] arm64: Enable perf events based hard lockup detector
Date: Fri, 19 Feb 2021 15:07:57 +0530
Message-ID: <CAFA6WYOL3m6UspT1QG8_DAEFpGxtX=7aT_zTAdntmuUCcBvg5A@mail.gmail.com>
In-Reply-To: <1610712101-14929-1-git-send-email-sumit.garg@linaro.org>

Hi Will, Mark,

On Fri, 15 Jan 2021 at 17:32, Sumit Garg <sumit.garg@linaro.org> wrote:
>
> With the recently added support for perf events to use pseudo NMIs as
> interrupts on platforms which support GICv3 or later, it is now
> possible to enable the hard lockup detector (or NMI watchdog) on arm64
> platforms. So enable the corresponding support.
>
> One thing to note here is that the lockup detector is normally
> initialized just after the early initcalls, but the PMU on arm64 comes
> up much later, as a device_initcall(). So we need to re-initialize
> lockup detection once the PMU has been initialized.
>
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> ---
>
> Changes in v5:
> - Fix lockup_detector_init() invocation so that it is invoked from a
>   CPU-bound context, as it makes heavy use of per-cpu variables and
>   shouldn't be invoked from preemptible context.
>

Do you have any further comments on this?

Lecopzer,

Does this feature work fine for you now?

-Sumit

> Changes in v4:
> - Rebased to latest pmu v7 NMI patch-set [1] and in turn use "has_nmi"
>   hook to know if PMU IRQ has been requested as an NMI.
> - Add check for return value prior to initializing hard-lockup detector.
>
> [1] https://lkml.org/lkml/2020/9/24/458
>
> Changes in v3:
> - Rebased to latest pmu NMI patch-set [1].
> - Addressed misc. comments from Stephen.
>
> [1] https://lkml.org/lkml/2020/8/19/671
>
> Changes since RFC:
> - Rebased on top of Alex's WIP-pmu-nmi branch.
> - Add comment for safe max. CPU frequency.
> - Misc. cleanup.
>
>  arch/arm64/Kconfig             |  2 ++
>  arch/arm64/kernel/perf_event.c | 48 ++++++++++++++++++++++++++++++++++++++++--
>  drivers/perf/arm_pmu.c         |  5 +++++
>  include/linux/perf/arm_pmu.h   |  2 ++
>  4 files changed, 55 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index f39568b..05e1735 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -174,6 +174,8 @@ config ARM64
>         select HAVE_NMI
>         select HAVE_PATA_PLATFORM
>         select HAVE_PERF_EVENTS
> +       select HAVE_PERF_EVENTS_NMI if ARM64_PSEUDO_NMI && HW_PERF_EVENTS
> +       select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI
>         select HAVE_PERF_REGS
>         select HAVE_PERF_USER_STACK_DUMP
>         select HAVE_REGS_AND_STACK_ACCESS_API
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 3605f77a..bafb7c8 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -23,6 +23,8 @@
>  #include <linux/platform_device.h>
>  #include <linux/sched_clock.h>
>  #include <linux/smp.h>
> +#include <linux/nmi.h>
> +#include <linux/cpufreq.h>
>
>  /* ARMv8 Cortex-A53 specific event types. */
>  #define ARMV8_A53_PERFCTR_PREF_LINEFILL                                0xC2
> @@ -1246,12 +1248,30 @@ static struct platform_driver armv8_pmu_driver = {
>         .probe          = armv8_pmu_device_probe,
>  };
>
> +static int __init lockup_detector_init_fn(void *data)
> +{
> +       lockup_detector_init();
> +       return 0;
> +}
> +
>  static int __init armv8_pmu_driver_init(void)
>  {
> +       int ret;
> +
>         if (acpi_disabled)
> -               return platform_driver_register(&armv8_pmu_driver);
> +               ret = platform_driver_register(&armv8_pmu_driver);
>         else
> -               return arm_pmu_acpi_probe(armv8_pmuv3_init);
> +               ret = arm_pmu_acpi_probe(armv8_pmuv3_init);
> +
> +       /*
> +        * Try to re-initialize lockup detector after PMU init in
> +        * case PMU events are triggered via NMIs.
> +        */
> +       if (ret == 0 && arm_pmu_irq_is_nmi())
> +               smp_call_on_cpu(raw_smp_processor_id(), lockup_detector_init_fn,
> +                               NULL, false);
> +
> +       return ret;
>  }
>  device_initcall(armv8_pmu_driver_init)
>
> @@ -1309,3 +1329,27 @@ void arch_perf_update_userpage(struct perf_event *event,
>         userpg->cap_user_time_zero = 1;
>         userpg->cap_user_time_short = 1;
>  }
> +
> +#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
> +/*
> + * Safe maximum CPU frequency in case a particular platform doesn't implement
> + * a cpufreq driver. Although the architecture doesn't put any restrictions on
> + * maximum frequency, 5 GHz seems to be a safe maximum given the available
> + * Arm CPUs in the market which are clocked much lower than 5 GHz. On the other
> + * hand, we can't make it much higher as it would lead to a large hard-lockup
> + * detection timeout on parts which run slower (e.g. 1 GHz on
> + * Developerbox) and don't have a cpufreq driver.
> + */
> +#define SAFE_MAX_CPU_FREQ      5000000000UL // 5 GHz
> +u64 hw_nmi_get_sample_period(int watchdog_thresh)
> +{
> +       unsigned int cpu = smp_processor_id();
> +       unsigned long max_cpu_freq;
> +
> +       max_cpu_freq = cpufreq_get_hw_max_freq(cpu) * 1000UL;
> +       if (!max_cpu_freq)
> +               max_cpu_freq = SAFE_MAX_CPU_FREQ;
> +
> +       return (u64)max_cpu_freq * watchdog_thresh;
> +}
> +#endif
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index cb2f55f..794a37d 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -726,6 +726,11 @@ static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
>         return per_cpu(hw_events->irq, cpu);
>  }
>
> +bool arm_pmu_irq_is_nmi(void)
> +{
> +       return has_nmi;
> +}
> +
>  /*
>   * PMU hardware loses all context when a CPU goes offline.
>   * When a CPU is hotplugged back in, since some hardware registers are
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 5054802..bf79667 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -163,6 +163,8 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn);
>  static inline int arm_pmu_acpi_probe(armpmu_init_fn init_fn) { return 0; }
>  #endif
>
> +bool arm_pmu_irq_is_nmi(void);
> +
>  /* Internal functions only for core arm_pmu code */
>  struct arm_pmu *armpmu_alloc(void);
>  struct arm_pmu *armpmu_alloc_atomic(void);
> --
> 2.7.4
>
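
For reference, the arithmetic behind hw_nmi_get_sample_period() and the
Developerbox remark in its comment can be sketched in userspace. This is
only an illustration under stated assumptions, not part of the patch:
sample_period() and timeout_seconds() are hypothetical helper names,
max_freq_khz stands in for the value cpufreq_get_hw_max_freq() would
return (0 when no cpufreq driver is present), and a threshold of 10
assumes the default watchdog_thresh.

```c
#include <assert.h>
#include <stdint.h>

/* 5 GHz fallback, same value as SAFE_MAX_CPU_FREQ in the patch. */
#define SAFE_MAX_CPU_FREQ 5000000000ULL

/*
 * Hypothetical userspace stand-in for hw_nmi_get_sample_period().
 * max_freq_khz == 0 models a platform without a cpufreq driver.
 * Returns the number of CPU cycles the perf counter is armed with.
 */
static uint64_t sample_period(unsigned int max_freq_khz, int watchdog_thresh)
{
	uint64_t max_cpu_freq = (uint64_t)max_freq_khz * 1000ULL; /* kHz -> Hz */

	assert(watchdog_thresh > 0);

	if (!max_cpu_freq)
		max_cpu_freq = SAFE_MAX_CPU_FREQ;

	return max_cpu_freq * (uint64_t)watchdog_thresh;
}

/*
 * Hypothetical helper: seconds until the counter overflows on a core
 * that really runs at actual_hz cycles per second.
 */
static double timeout_seconds(uint64_t period_cycles, uint64_t actual_hz)
{
	return (double)period_cycles / (double)actual_hz;
}
```

With the 5 GHz fallback and a threshold of 10, the period is 5e10
cycles, so a part actually running at 1 GHz without a cpufreq driver
would take about 50 seconds to report a hard lockup instead of 10.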
