From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751232AbdBWQR4 convert rfc822-to-8bit (ORCPT <rfc822;w@1wt.eu>);
        Thu, 23 Feb 2017 11:17:56 -0500
Received: from mga03.intel.com ([134.134.136.65]:43655 "EHLO mga03.intel.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751098AbdBWQRx (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 23 Feb 2017 11:17:53 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.35,198,1484035200"; 
   d="scan'208";a="69345572"
From: "Liang, Kan" <kan.liang@intel.com>
To: Vince Weaver <vincent.weaver@maine.edu>
CC: Peter Zijlstra <peterz@infradead.org>,
        "Odzioba, Lukasz" <lukasz.odzioba@intel.com>,
        Stephane Eranian <eranian@google.com>,
        "mingo@redhat.com" <mingo@redhat.com>,
        LKML <linux-kernel@vger.kernel.org>,
        Alexander Shishkin <alexander.shishkin@linux.intel.com>,
        "ak@linux.intel.com" <ak@linux.intel.com>
Subject: RE: [PATCH] perf/x86: fix event counter update issue
Thread-Topic: [PATCH] perf/x86: fix event counter update issue
Thread-Index: AQHSSa1zo+RK785pc06zvVqph8o/bKDvK4oAgACEqwCAAAMBgIAAnoVQ//+DUYCAAJTSIIADtCqAgASMNoCAfHIigIAAiBgQgAEPCICAAIhZQA==
Date: Thu, 23 Feb 2017 16:14:11 +0000
Message-ID: <37D7C6CF3E00A74B8858931C1DB2F077536A9F55@SHSMSX103.ccr.corp.intel.com>
References: <1480361206-1702-1-git-send-email-kan.liang@intel.com>
 <20161129092520.GB3092@twins.programming.kicks-ass.net>
 <CABPqkBQPRBZZB8wbhYGLoS9ww_0vQrt=nkqgDC_fFgG99cqdCg@mail.gmail.com>
 <20161129173055.GP3092@twins.programming.kicks-ass.net>
 <37D7C6CF3E00A74B8858931C1DB2F07750CA4225@SHSMSX103.ccr.corp.intel.com>
 <20161129193201.GE3045@worktop.programming.kicks-ass.net>
 <37D7C6CF3E00A74B8858931C1DB2F07750CA42A3@SHSMSX103.ccr.corp.intel.com>
 <D6EDEBF1F91015459DB866AC4EE162CC024DCFE5@IRSMSX103.ger.corp.intel.com>
 <20161205102509.GH3124@twins.programming.kicks-ass.net>
 <alpine.DEB.2.20.1702220941570.24020@macbook-air>
 <37D7C6CF3E00A74B8858931C1DB2F077536A9963@SHSMSX103.ccr.corp.intel.com>
 <alpine.DEB.2.20.1702231000230.15726@macbook-air>
In-Reply-To: <alpine.DEB.2.20.1702231000230.15726@macbook-air>
Accept-Language: zh-CN, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiZGNlY2Y1YjUtOTVjZS00MWFjLWIyNmItMWQ3OGU4Nzk3MzFhIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE1LjkuNi42IiwiVHJ1c3RlZExhYmVsSGFzaCI6IlN6SGRWdjJzU1FQVjJLbnFIU21nUldmNWVnZVI4bVoyV0FMeWNnb25ocEE9In0=
x-ctpclassification: CTP_IC
x-originating-ip: [10.239.127.40]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8BIT
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


> 
> On Wed, 22 Feb 2017, Liang, Kan wrote:
> 
> > > So from what I understand, the issue is if we have an architecture
> > > with full-width counters and we trigger a x86_perf_event_update()
> > > when bit 47 is set?
> >
> > No. It is related to the counter width. The number of bits we can use
> > should be 1 bit less than the total width. Otherwise, there will be a
> > problem.
> > For big cores such as Haswell, Broadwell and Skylake, the counter width
> > is 48 bits, so we can only use 47 bits.
> > For Silvermont and KNL, the counter width is only 32 bits, I think, so
> > we can only use 31 bits.
> 
> So on a machine with 48-bit counters I should just have a counting event
> that counts to somewhere above 0x8000 0000 0001 and it should show
> problems?
Yes
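
To illustrate why the top bit matters, here is a minimal sketch. It is not
the kernel code itself: counter_delta() is a hypothetical name, and the
arithmetic only mirrors, as far as I understand it, the shift-based sign
extension used when x86_perf_event_update() computes the delta between two
raw counter reads.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * Computes (now - prev) for a counter that is 'width' bits wide by
 * shifting both values to the top of a 64-bit word, subtracting, and
 * shifting back. The result is interpreted as a signed value, so any
 * advance of 2^(width-1) or more comes out negative.
 *
 * Note: right-shifting a negative signed value is implementation-defined
 * in ISO C; gcc and clang perform an arithmetic (sign-extending) shift,
 * which is what this sketch relies on.
 */
static int64_t counter_delta(uint64_t prev, uint64_t now, int width)
{
	int shift = 64 - width;
	int64_t delta = (int64_t)((now << shift) - (prev << shift));

	return delta >> shift;
}
```

With a 48-bit counter, counter_delta(0, 0x7fffffffffff, 48) is still
positive, but counter_delta(0, 0x800000000001, 48) comes out negative.
A normal wrap over a short distance, e.g. from 0xfffffffffff0 to 0x10,
still gives the expected small positive delta (0x20).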

> Because I am unable to trigger this.
> 
> But I guess if anywhere along the line x86_perf_event_update() is run then
> you start over?
> 

Probably. It depends on the value of "left", i.e. how much of the period
remains.

> I noticed your original reproducer bound the event to a core, is that needed
> to trigger this?

I don't think it's needed, but I haven't tried it without binding to a core.

> 
> Can it happen on a fixed event or only a general-purpose event?

I think it can happen on both, because the fixed counters and the GP
counters have the same counter width and code path.

> 
> > > So if I have a test that runs in a loop for 2^48 retired
> > > instructions (which takes ~12 hours on a recent machine) and then
> > > reads the results, they might be wrong?
> >
> > It only needs several minutes to reproduce the issue on SLM/KNL.
> 
> Yes, but I only have machines with 48-bit counters.  So it's going to take
> 256 times as long as on a machine with 40-bit counters.
> 
> I have an assembly loop that can consistently generate 2 instructions/cycle
> (I'd be glad to hear suggestions for events that count faster) and on a
> broadwell-ep machine it still takes at least 7 hours or so to get up to
> 0x800000000000.

I think you can use an MSR tool to write a large value into IA32_PMC0
during your test.
The full-width writable alias of IA32_PMC0 is 0x4C1.
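
If no MSR tool is at hand, the write can be sketched in a few lines of C.
This is only a sketch under assumptions: it relies on the Linux msr driver
(/dev/cpu/N/msr, needs root and the msr module loaded), write_msr() is a
hypothetical helper name, and it only helps if the event under test is
actually scheduled on PMC0 of that CPU and the CPU supports full-width
counter writes.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for pwrite() */
#endif
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define MSR_IA32_A_PMC0	0x4C1	/* full-width writable alias of IA32_PMC0 */

/*
 * Hypothetical helper, for illustration only.
 * Writes 'value' to MSR 'reg' through an msr device node such as
 * "/dev/cpu/0/msr". The msr driver uses the file offset as the MSR
 * address and expects 8-byte accesses.
 * Returns 0 on success, -1 on any failure.
 */
static int write_msr(const char *msr_path, uint32_t reg, uint64_t value)
{
	ssize_t n;
	int fd = open(msr_path, O_WRONLY);

	if (fd < 0)
		return -1;

	n = pwrite(fd, &value, sizeof(value), (off_t)reg);
	close(fd);

	return n == (ssize_t)sizeof(value) ? 0 : -1;
}
```

For example, write_msr("/dev/cpu/0/msr", MSR_IA32_A_PMC0, 0x7ffffff00000)
would place the counter just below bit 47 (the value is only an example),
so the test loop crosses 0x800000000000 within a short time instead of
hours.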


Thanks,
Kan