From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S965184AbdACPMC (ORCPT <rfc822;w@1wt.eu>);
        Tue, 3 Jan 2017 10:12:02 -0500
Received: from merlin.infradead.org ([205.233.59.134]:35260 "EHLO
        merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S965118AbdACPJz (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 3 Jan 2017 10:09:55 -0500
Date: Tue, 3 Jan 2017 16:09:46 +0100
From: Peter Zijlstra <peterz@infradead.org>
To: Jiri Olsa <jolsa@kernel.org>
Cc: lkml <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@kernel.org>,
        Andi Kleen <andi@firstfloor.org>,
        Alexander Shishkin <alexander.shishkin@linux.intel.com>,
        Arnaldo Carvalho de Melo <acme@kernel.org>,
        Vince Weaver <vince@deater.net>, Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH 2/4] perf/x86: Fix period for non sampling events
Message-ID: <20170103150946.GX3107@twins.programming.kicks-ass.net>
References: <1482931866-6018-1-git-send-email-jolsa@kernel.org>
 <1482931866-6018-3-git-send-email-jolsa@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1482931866-6018-3-git-send-email-jolsa@kernel.org>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 28, 2016 at 02:31:04PM +0100, Jiri Olsa wrote:
> When in counting mode we setup the counter with the
> longest possible period and read the value with read
> syscall.
> 
> We also still setup the PMI to be triggered when such
> counter overflow to reconfigure it.
> 
> We also get PEBS interrupt if such counter has precise_ip
> set (which makes no sense, but it's possible).
> 
> Having such counter with:
>   - counting mode
>   - precise_ip set
> 
> I watched my server to get stuck serving PEBS interrupt
> again and again because of following (AFAICS):
> 
>   - PEBS interrupt is triggered before PMI

I'm slightly confused: the PEBS interrupt _is_ the PMI. And how can we
get an interrupt before the counter overflows?

>   - when PEBS handling path reconfigured counter it
>     had remaining value of -256

You're talking about the actual counter value, right, not @left?

>   - the x86_perf_event_set_period does not consider this
>     as an extreme value, so it's configured back as the
>     new counter value

Right, a counter value of -256 would result in @left being 256 which is
positive and not too large, so we 'retain' the value.
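
To make the 'retain' case concrete, here is a rough userspace sketch of
the clamping, loosely modeled on the checks in
x86_perf_event_set_period(). It is not the kernel code: the name
model_set_period() and the 47-bit MODEL_MAX_PERIOD are assumptions made
for illustration only.

```c
#include <stdint.h>

/* Assumed maximum period for illustration: 2^47 - 1. */
#define MODEL_MAX_PERIOD ((int64_t)((1ULL << 47) - 1))

/*
 * Hypothetical model: given the remaining count @left and the
 * configured @period, return the value that would be programmed
 * (negated) into the counter.
 */
static int64_t model_set_period(int64_t left, int64_t period,
				int64_t max_period)
{
	/* Far behind: start a fresh period. */
	if (left <= -period)
		left = period;

	/* Period elapsed: add one period to what is left. */
	if (left <= 0)
		left += period;

	/* Avoid programming a very small value. */
	if (left < 2)
		left = 2;

	/* Never exceed what the counter can hold. */
	if (left > max_period)
		left = max_period;

	/* Anything else, e.g. 256, is kept unchanged. */
	return left;
}
```

With left = 256 and period = MODEL_MAX_PERIOD none of the checks fire,
so the counter would be programmed with -256 again.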

>   - this makes the PEBS interrupt to be triggered right
>     away again

So I'm curious how this is even possible. The normal, documented
sequence is:

	- we program the counter with a negative value
	- each 'event' does a counter increment
	- if we 'overflow' (turn positive) we start to arm the PEBS
	  assist
	- once the assist is armed, the next 'event' triggers a PEBS
	  record.
	- if the number of PEBS records exceeds the DS threshold, we
	  set bit 62 in GLOBAL_STATUS and raise the PMI.

At which point the actual counter value should be at the very least 1
(for having counted the event that triggers the PEBS assist into
creating the record).
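
As a toy illustration of the sequence above, the following userspace
model walks the same steps. It is only a sketch: struct pebs_model and
both helpers are made up, the threshold comparison (>=) is an
assumption, and it does not reload the counter after a record as real
hardware may.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state for one PEBS-enabled counter. */
struct pebs_model {
	int64_t counter;	/* programmed negative, counts up */
	bool armed;		/* PEBS assist is armed */
	unsigned int records;	/* records written so far */
	unsigned int threshold;	/* records needed to raise the PMI */
	bool pmi;		/* stands in for GLOBAL_STATUS bit 62 */
};

/* Account for one 'event'. */
static void pebs_model_event(struct pebs_model *m)
{
	/* Every event increments the counter. */
	m->counter++;

	if (m->armed) {
		/* The event after arming produces a record. */
		m->records++;
		m->armed = false;
		if (m->records >= m->threshold)
			m->pmi = true;
		return;
	}

	/* Overflow (no longer negative): arm the assist. */
	if (m->counter >= 0)
		m->armed = true;
}

/* Feed events until the PMI is raised; return how many it took. */
static unsigned long pebs_model_run(int64_t start, unsigned int threshold,
				    int64_t *final_counter)
{
	struct pebs_model m = { .counter = start, .threshold = threshold };
	unsigned long events = 0;

	while (!m.pmi) {
		pebs_model_event(&m);
		events++;
	}

	*final_counter = m.counter;
	return events;
}
```

Starting from -256 with a threshold of 1, the model needs 257 events
and ends with a counter value of 1, not -256.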


Did your kernel include commit:

  daa864b8f8e3 ("perf/x86/pebs: Fix handling of PEBS buffer overflows")

?