From: Stephane Eranian <eranian@google.com>
To: "Liang, Kan" <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Jiri Olsa <jolsa@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vince Weaver <vincent.weaver@maine.edu>,
	"ak@linux.intel.com" <ak@linux.intel.com>
Subject: Re: [PATCH 2/2] perf/x86/intel, watchdog: Switch NMI watchdog to ref cycles on x86
Date: Mon, 22 May 2017 11:20:27 -0700
Message-ID: <CABPqkBR2A99N+66bw8s6-xZbBtazdEfsWdiaVbL1aT4r0+wkJQ@mail.gmail.com>
In-Reply-To: <1495213582-3635-2-git-send-email-kan.liang@intel.com>

Andi,

On Fri, May 19, 2017 at 10:06 AM,  <kan.liang@intel.com> wrote:
> From: Kan Liang <Kan.liang@intel.com>
>
> The NMI watchdog uses either the fixed cycles or a generic cycles
> counter. This causes a lot of conflicts with users of the PMU who want
> to run a full group including the cycles fixed counter, for example the
> --topdown support recently added to perf stat. The code needs to fall
> back to not use groups, which can cause measurement inaccuracy due to
> multiplexing errors.
>
> This patch switches the NMI watchdog to use reference cycles on Intel
> systems. This is actually more accurate than cycles, because cycles can
> tick faster than the measured CPU Frequency due to Turbo mode.
>
You have not explained why you need that accuracy.
This is about detecting hard deadlocks, so you don't care about a few seconds
of accuracy. Instead of introducing all this complexity, why not simply extend the
period of the watchdog to be more tolerant of Turbo scaling, to avoid false
positives, and continue to use core-cycles, an event that is universally available?


> The ref cycles always tick at their frequency, or slower when the system
> is idling. That means the NMI watchdog can never expire too early,
> unlike with cycles.
>
Just make the period longer, say 30% longer. Take the max Turbo factor you can
get and use that. It is okay if it takes longer on machines with
smaller max Turbo ratios.
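
As a minimal sketch of this idea, assuming the period is still computed as
cpu_khz * 1000 * watchdog_thresh as in the quoted hw_nmi_get_sample_period():
the helper name scaled_sample_period() and the max_turbo_pct parameter are
hypothetical and do not come from the patch or the kernel. How the max Turbo
ratio would be obtained is left open here.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): stretch the core-cycles watchdog
 * period by an assumed maximum Turbo ratio so the counter cannot overflow
 * before the threshold has elapsed, even when running at full Turbo.
 *
 * max_turbo_pct is the max Turbo frequency as a percentage of the base
 * frequency, e.g. 130 means Turbo can run up to 30% faster than base.
 */
static uint64_t scaled_sample_period(uint64_t cpu_khz,
				     unsigned int watchdog_thresh,
				     unsigned int max_turbo_pct)
{
	/* Same base computation as the quoted hw_nmi_get_sample_period(). */
	uint64_t base_period = cpu_khz * 1000 * watchdog_thresh;

	/* Scale up by the Turbo ratio; integer math, multiply before divide. */
	return base_period * max_turbo_pct / 100;
}
```

With example values of a 2 GHz base clock (cpu_khz = 2000000), a 10 second
threshold and a ratio of 130, this gives 26,000,000,000 cycles. At full Turbo
(2.6 GHz) the counter overflows after 10 seconds; at the base frequency it
takes 13 seconds, which is the "takes longer" case above.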

What is the problem with this approach instead?

> The reference cycles tick roughly at the frequency of the TSC, so the
> same period computation can be used.
>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>
> This patch was once merged, but reverted later.
> Because ref-cycles can not be used anymore when watchdog is enabled.
> The commit is 44530d588e142a96cf0cd345a7cb8911c4f88720
>
> The patch 1/2 has extended the ref-cycles to GP counter. The concern
> should be gone.
>
> Rebased the patch and repost.
>
>
>  arch/x86/kernel/apic/hw_nmi.c | 8 ++++++++
>  include/linux/nmi.h           | 1 +
>  kernel/watchdog_hld.c         | 7 +++++++
>  3 files changed, 16 insertions(+)
>
> diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
> index c73c9fb..acd21dc 100644
> --- a/arch/x86/kernel/apic/hw_nmi.c
> +++ b/arch/x86/kernel/apic/hw_nmi.c
> @@ -18,8 +18,16 @@
>  #include <linux/nmi.h>
>  #include <linux/init.h>
>  #include <linux/delay.h>
> +#include <linux/perf_event.h>
>
>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
> +int hw_nmi_get_event(void)
> +{
> +       if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> +               return PERF_COUNT_HW_REF_CPU_CYCLES;
> +       return PERF_COUNT_HW_CPU_CYCLES;
> +}
> +
>  u64 hw_nmi_get_sample_period(int watchdog_thresh)
>  {
>         return (u64)(cpu_khz) * 1000 * watchdog_thresh;
> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> index aa3cd08..b2fa444 100644
> --- a/include/linux/nmi.h
> +++ b/include/linux/nmi.h
> @@ -141,6 +141,7 @@ static inline bool trigger_single_cpu_backtrace(int cpu)
>
>  #ifdef CONFIG_LOCKUP_DETECTOR
>  u64 hw_nmi_get_sample_period(int watchdog_thresh);
> +int hw_nmi_get_event(void);
>  extern int nmi_watchdog_enabled;
>  extern int soft_watchdog_enabled;
>  extern int watchdog_user_enabled;
> diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
> index 54a427d..f899766 100644
> --- a/kernel/watchdog_hld.c
> +++ b/kernel/watchdog_hld.c
> @@ -70,6 +70,12 @@ void touch_nmi_watchdog(void)
>  }
>  EXPORT_SYMBOL(touch_nmi_watchdog);
>
> +/* Can be overridden by architecture */
> +__weak int hw_nmi_get_event(void)
> +{
> +       return PERF_COUNT_HW_CPU_CYCLES;
> +}
> +
>  static struct perf_event_attr wd_hw_attr = {
>         .type           = PERF_TYPE_HARDWARE,
>         .config         = PERF_COUNT_HW_CPU_CYCLES,
> @@ -165,6 +171,7 @@ int watchdog_nmi_enable(unsigned int cpu)
>
>         wd_attr = &wd_hw_attr;
>         wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
> +       wd_attr->config = hw_nmi_get_event();
>
>         /* Try to register using hardware perf events */
>         event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
> --
> 2.7.4
>
