From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751363AbdBWPHM (ORCPT ); Thu, 23 Feb 2017 10:07:12 -0500
Received: from mail-it0-f44.google.com ([209.85.214.44]:37918 "EHLO mail-it0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750916AbdBWPHL (ORCPT ); Thu, 23 Feb 2017 10:07:11 -0500
From: Vince Weaver
X-Google-Original-From: Vince Weaver
Date: Thu, 23 Feb 2017 10:07:01 -0500 (EST)
X-X-Sender: vince@macbook-air
To: "Liang, Kan"
cc: Peter Zijlstra, "Odzioba, Lukasz", Stephane Eranian, "mingo@redhat.com", LKML, Alexander Shishkin, "ak@linux.intel.com"
Subject: RE: [PATCH] perf/x86: fix event counter update issue
In-Reply-To: <37D7C6CF3E00A74B8858931C1DB2F077536A9963@SHSMSX103.ccr.corp.intel.com>
Message-ID:
References: <1480361206-1702-1-git-send-email-kan.liang@intel.com> <20161129092520.GB3092@twins.programming.kicks-ass.net> <20161129173055.GP3092@twins.programming.kicks-ass.net> <37D7C6CF3E00A74B8858931C1DB2F07750CA4225@SHSMSX103.ccr.corp.intel.com> <20161129193201.GE3045@worktop.programming.kicks-ass.net> <37D7C6CF3E00A74B8858931C1DB2F07750CA42A3@SHSMSX103.ccr.corp.intel.com> <20161205102509.GH3124@twins.programming.kicks-ass.net> <37D7C6CF3E00A74B8858931C1DB2F077536A9963@SHSMSX103.ccr.corp.intel.com>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 22 Feb 2017, Liang, Kan wrote:

> > So from what I understand, the issue is if we have an architecture with
> > full-width counters and we trigger a x86_perf_event_update() when bit
> > 47 is set?
>
> No. It related to the counter width. The number of bits we can use should be
> 1 bit less than the total width. Otherwise, there will be problem.
> For big cores such as haswell, broadwell, skylake, the counter width is 48 bit.
> So we can only use 47 bits.
> For Silvermont and KNL, the counter width is only 32 bit I think. So we can only
> use 31 bits.

So on a machine with 48-bit counters I should just have a counting event
that counts to somewhere above 0x8000 0000 0001 and it should show
problems?  Because I am unable to trigger this.  But I guess if
x86_perf_event_update() is run anywhere along the line then you start over?

I noticed your original reproducer bound the event to a core. Is that
needed to trigger this?  Can it happen on a fixed event or only on a
general-purpose event?

> > So if I have a test that runs in a loop for 2^48 retired instructions (which
> > takes ~12 hours on a recent machine) and then reads the results, they
> > might be wrong?
>
> It only needs several minutes to reproduce the issue on SLM/KNL.

Yes, but I only have machines with 48-bit counters.  So it's going to take
256 times as long as on a machine with 40-bit counters.

I have an assembly loop that can consistently generate 2 instructions/cycle
(I'd be glad to hear suggestions for events that count faster), and on a
broadwell-ep machine it still takes at least 7 hours or so to get up to
0x800000000000.

Vince
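
For reference, here is a small user-space model of the two pieces of arithmetic discussed above. It is only a sketch under stated assumptions, not the kernel code: it assumes the update path computes the delta by shifting both raw values left by (64 - width) bits, subtracting them as signed 64-bit values, and arithmetic-shifting the result back. The function names and the 2.6 GHz clock are hypothetical values picked for illustration.

```python
def counter_delta(prev_raw, new_raw, width):
    """Model a shift-based delta between two raw counter reads.

    Assumption: both values are shifted left by (64 - width), subtracted
    as signed 64-bit integers, then arithmetic-shifted right again.
    """
    shift = 64 - width
    mask64 = (1 << 64) - 1
    diff = ((new_raw << shift) - (prev_raw << shift)) & mask64
    if diff >= 1 << 63:          # reinterpret the 64-bit pattern as signed
        diff -= 1 << 64
    return diff >> shift         # Python's >> on ints is an arithmetic shift


def hours_to_reach(target_count, instr_per_cycle, clock_hz):
    """Wall-clock hours for a loop to retire target_count instructions."""
    return target_count / (instr_per_cycle * clock_hz) / 3600.0


if __name__ == "__main__":
    # Advancing by less than 2^47 between updates gives a positive delta.
    print(counter_delta(0, 0x7FFFFFFFFFFF, 48))
    # Advancing past bit 47 between updates gives a negative delta.
    print(counter_delta(0, 0x800000000001, 48))
    # A normal wrap from near the top of a 48-bit counter is handled.
    print(counter_delta(0xFFFFFFFFFFFF, 0x000000000004, 48))
    # Hypothetical 2.6 GHz clock, 2 instructions/cycle, target 2^47.
    print(hours_to_reach(1 << 47, 2, 2.6e9))
    # Ratio between filling a 48-bit and a 40-bit counter.
    print((1 << 48) // (1 << 40))
```

In this model the delta only goes negative when the counter advances by 2^47 or more between two consecutive updates, which would be consistent with an intermediate update effectively restarting the window.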