All of lore.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <fweisbec@gmail.com>
To: Don Zickus <dzickus@redhat.com>
Cc: Robert Richter <robert.richter@amd.com>,
	Cyrill Gorcunov <gorcunov@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Lin Ming <ming.m.lin@intel.com>, Ingo Molnar <mingo@elte.hu>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Huang, Ying" <ying.huang@intel.com>,
	Yinghai Lu <yinghai@kernel.org>, Andi Kleen <andi@firstfloor.org>
Subject: Re: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
Date: Wed, 11 Aug 2010 04:44:55 +0200	[thread overview]
Message-ID: <20100811024451.GA26835@nowhere> (raw)
In-Reply-To: <20100810204856.GA16571@redhat.com>

On Tue, Aug 10, 2010 at 04:48:56PM -0400, Don Zickus wrote:
> On Mon, Aug 09, 2010 at 09:48:29PM +0200, Robert Richter wrote:
> > On 06.08.10 10:21:31, Don Zickus wrote:
> > > On Fri, Aug 06, 2010 at 08:52:03AM +0200, Robert Richter wrote:
> > 
> > > > I was playing around with it yesterday trying to fix this. My idea is
> > > > to skip an unkown nmi if the privious nmi was a *handled* perfctr
> > > 
> > > You might want to add a little more logic that says *handled* _and_ had
> > > more than one perfctr trigger.  Most of the time only one perfctr is
> > > probably triggering, so you might be eating unknown_nmi's needlessly.
> > > 
> > > Just a thought.
> > 
> > Yes, that's true. It could be implemented on top of the patch below.
> 
> I did, but the changes basically revert the bulk of your patch.
> 
> > 
> > > 
> > > > nmi. I will probably post an rfc patch early next week.
> > 
> > Here it comes:
> > 
> > From d2739578199d881ae6a9537c1b96a0efd1cdea43 Mon Sep 17 00:00:00 2001
> > From: Robert Richter <robert.richter@amd.com>
> > Date: Thu, 5 Aug 2010 16:19:59 +0200
> > Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
> 
> On top of Robert's patch:
> (compiled tested only because I don't have a fancy button to trigger
> unknown nmis)
> 
> From 548cf5148f47618854a0eff22b1d55db71b6f8fc Mon Sep 17 00:00:00 2001
> From: Don Zickus <dzickus@redhat.com>
> Date: Tue, 10 Aug 2010 16:40:03 -0400
> Subject: [PATCH] perf, x86: only skip NMIs when multiple perfctrs trigger
> 
> A small optimization on top of Robert's patch that limits the
> skipping of NMI's to cases where we detect multiple perfctr events
> have happened.



Yeah, I think that's more reasonable. This lowers even more the chances of
losing important hardware errors.

One comment though:


> 
> Signed-off-by: Don Zickus <dzickus@redhat.com>
> 
> ---
>  arch/x86/kernel/cpu/perf_event.c |   34 ++++++++++++++++++++--------------
>  1 files changed, 20 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index c3cd159..066046d 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -1154,7 +1154,7 @@ static int x86_pmu_handle_irq(struct pt_regs *regs)
>  		/*
>  		 * event overflow
>  		 */
> -		handled		= 1;
> +		handled		+= 1;
>  		data.period	= event->hw.last_period;
>  
>  		if (!x86_perf_event_set_period(event))
> @@ -1200,7 +1200,7 @@ void perf_events_lapic_init(void)
>  	apic_write(APIC_LVTPC, APIC_DM_NMI);
>  }
>  
> -static DEFINE_PER_CPU(unsigned int, perfctr_handled);
> +static DEFINE_PER_CPU(unsigned int, perfctr_skip);
>  
>  static int __kprobes
>  perf_event_nmi_handler(struct notifier_block *self,
> @@ -1208,8 +1208,7 @@ perf_event_nmi_handler(struct notifier_block *self,
>  {
>  	struct die_args *args = __args;
>  	struct pt_regs *regs;
> -	unsigned int this_nmi;
> -	unsigned int prev_nmi;
> +	int handled = 0;
>  
>  	if (!atomic_read(&active_events))
>  		return NOTIFY_DONE;
> @@ -1229,14 +1228,11 @@ perf_event_nmi_handler(struct notifier_block *self,
>  		 * was handling a perfctr. Otherwise we pass it and
>  		 * let the kernel handle the unknown nmi.
>  		 *
> -		 * Note: this could be improved if we drop unknown
> -		 * NMIs only if we handled more than one perfctr in
> -		 * the previous NMI.
>  		 */
> -		this_nmi = percpu_read(irq_stat.__nmi_count);
> -		prev_nmi = __get_cpu_var(perfctr_handled);
> -		if (this_nmi == prev_nmi + 1)
> +		if (__get_cpu_var(perfctr_skip)){
> +			__get_cpu_var(perfctr_skip) -=1;
>  			return NOTIFY_STOP;
> +		}
>  		return NOTIFY_DONE;
>  	default:
>  		return NOTIFY_DONE;
> @@ -1246,11 +1242,21 @@ perf_event_nmi_handler(struct notifier_block *self,
>  
>  	apic_write(APIC_LVTPC, APIC_DM_NMI);
>  
> -	if (!x86_pmu.handle_irq(regs))
> +	handled = x86_pmu.handle_irq(regs);
> +	if (!handled)
> +		/* not our NMI */
>  		return NOTIFY_DONE;
> -
> -	/* handled */
> -	__get_cpu_var(perfctr_handled) = percpu_read(irq_stat.__nmi_count);
> +	else if (handled > 1)
> +		/*
> +		 * More than one perfctr triggered.  This could have
> +		 * caused a second NMI that we must now skip because
> +		 * we have already handled it.  Remember it.
> +		 *
> +		 * NOTE: We have no way of knowing if a second NMI was
> +		 * actually triggered, so we may accidentally skip a valid
> +		 * unknown nmi later.
> +		 */
> +		__get_cpu_var(perfctr_skip) +=1;



May be make it just a pending bit. I mean not something that can
go further 1, because you can't have more than 1 pending anyway. I don't
know how that could happen you get accidental perctr_skip > 1, may be
expected pending NMIs that don't happen somehow, but better be paranoid with
that, as it's about trying not to miss hardware errors.

Thanks.


  reply	other threads:[~2010-08-11  2:45 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-04  9:21 A question of perf NMI handler Lin Ming
2010-08-04  9:50 ` Peter Zijlstra
2010-08-04 10:01 ` Robert Richter
2010-08-04 10:24   ` Peter Zijlstra
2010-08-04 10:29     ` Robert Richter
2010-08-04 14:00   ` Don Zickus
2010-08-04 14:11     ` Peter Zijlstra
2010-08-04 14:52       ` Don Zickus
2010-08-04 15:02         ` Peter Zijlstra
2010-08-04 15:18           ` Cyrill Gorcunov
2010-08-04 15:50             ` Don Zickus
2010-08-04 16:10               ` Cyrill Gorcunov
2010-08-04 16:20                 ` Don Zickus
2010-08-04 16:39                   ` Cyrill Gorcunov
2010-08-04 18:48                     ` Robert Richter
2010-08-04 19:22                       ` Andi Kleen
2010-08-04 19:26                       ` Cyrill Gorcunov
2010-08-06  6:52                         ` Robert Richter
2010-08-06 14:21                           ` Don Zickus
2010-08-09 19:48                             ` [PATCH] perf, x86: try to handle unknown nmis with running perfctrs Robert Richter
2010-08-09 20:02                               ` Cyrill Gorcunov
2010-08-10  7:42                                 ` Robert Richter
2010-08-10 16:16                                   ` Cyrill Gorcunov
2010-08-10 16:41                                     ` Robert Richter
2010-08-10 17:24                                       ` Cyrill Gorcunov
2010-08-10 19:05                                         ` Robert Richter
2010-08-10 19:24                                           ` Cyrill Gorcunov
2010-08-12 13:24                                             ` Robert Richter
2010-08-12 14:31                                               ` Cyrill Gorcunov
2010-08-10 20:48                               ` Don Zickus
2010-08-11  2:44                                 ` Frederic Weisbecker [this message]
2010-08-11 11:10                                   ` Robert Richter
2010-08-11 12:44                                     ` Don Zickus
2010-08-11 14:03                                       ` Robert Richter
2010-08-11 14:32                                         ` Don Zickus
2010-08-13  4:37                                     ` Frederic Weisbecker
2010-08-13  8:22                                       ` Robert Richter
2010-08-14  1:28                                         ` Frederic Weisbecker
2010-08-14  2:29                                           ` Robert Richter
2010-08-11 12:39                                   ` Don Zickus
2010-08-11  3:19                                 ` Huang Ying
2010-08-11 12:36                                   ` Don Zickus
2010-08-16 14:37                                     ` Peter Zijlstra
2010-08-11 22:00                               ` [PATCH -v2] " Robert Richter
2010-08-12 13:10                                 ` Robert Richter
2010-08-12 18:21                                   ` Don Zickus
2010-08-16  7:37                                     ` Robert Richter
2010-08-12 13:52                                 ` Don Zickus
2010-08-13  4:25                                 ` Frederic Weisbecker
2010-08-16 14:48                                 ` Peter Zijlstra
2010-08-16 16:27                                   ` Cyrill Gorcunov
2010-08-16 17:16                                     ` Robert Richter
2010-08-16 19:06                                       ` Cyrill Gorcunov
2010-08-16 19:13                                         ` Peter Zijlstra
2010-08-16 19:18                                           ` Cyrill Gorcunov
2010-08-16 22:55                                         ` Robert Richter
2010-08-17 15:23                                           ` Cyrill Gorcunov
2010-08-17 15:22                               ` [PATCH -v3] " Robert Richter
2010-08-17 16:17                                 ` Cyrill Gorcunov
2010-08-19 10:45                                 ` Peter Zijlstra
2010-08-19 12:39                                   ` Robert Richter
2010-08-19 14:12                                   ` Don Zickus
2010-08-19 14:27                                     ` Peter Zijlstra
2010-08-19 15:20                                       ` Don Zickus
2010-08-19 17:43                                       ` Cyrill Gorcunov
2010-08-19 17:53                                         ` Peter Zijlstra
2010-08-19 21:58                                       ` Don Zickus
2010-08-20  8:50                                         ` Peter Zijlstra
2010-08-20  1:50                                       ` Don Zickus
2010-08-20  8:16                                         ` Ingo Molnar
2010-08-20 10:04                                           ` Peter Zijlstra
2010-08-20 10:30                                             ` Cyrill Gorcunov
2010-08-20 12:39                                             ` Don Zickus
2010-08-20 13:27                                               ` Ingo Molnar
2010-08-20 13:51                                                 ` Don Zickus
2010-08-20 14:17                                                   ` Ingo Molnar
2010-08-20 20:45                                                     ` Cyrill Gorcunov
2010-08-24 21:48                                                     ` Don Zickus
2010-08-20  8:36                                         ` Robert Richter
2010-08-20 14:17                                       ` [tip:perf/urgent] perf, x86: Fix handle_irq return values tip-bot for Peter Zijlstra
2010-08-20 14:17                                 ` [tip:perf/urgent] perf, x86: Try to handle unknown nmis with an enabled PMU tip-bot for Robert Richter
2010-08-06 15:35                   ` A question of perf NMI handler Andi Kleen
2010-08-04 15:45           ` Don Zickus
2010-08-06 15:37           ` Andi Kleen
2010-08-04 13:54 ` Don Zickus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100811024451.GA26835@nowhere \
    --to=fweisbec@gmail.com \
    --cc=andi@firstfloor.org \
    --cc=dzickus@redhat.com \
    --cc=gorcunov@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.m.lin@intel.com \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=robert.richter@amd.com \
    --cc=ying.huang@intel.com \
    --cc=yinghai@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.