Date: Tue, 9 Jun 2009 14:15:17 +0200
From: Ingo Molnar
To: Peter Zijlstra
Cc: mingo@redhat.com, hpa@zytor.com, paulus@samba.org, acme@redhat.com,
    linux-kernel@vger.kernel.org, efault@gmx.de, mtosatti@redhat.com,
    tglx@linutronix.de, cjashfor@linux.vnet.ibm.com,
    linux-tip-commits@vger.kernel.org
Subject: Re: [tip:perfcounters/core] perf_counter: Implement generalized cache event types
Message-ID: <20090609121517.GC25586@elte.hu>
References: <1244535326.13761.10021.camel@twins>
In-Reply-To: <1244535326.13761.10021.camel@twins>
User-Agent: Mutt/1.5.18 (2008-05-17)
X-Mailing-List: linux-kernel@vger.kernel.org

* Peter Zijlstra wrote:

> On Sat, 2009-06-06 at 11:16 +0000, tip-bot for Ingo Molnar wrote:
> > Commit-ID: 8326f44da090d6d304d29b9fdc7fb3e20889e329
> > Gitweb: http://git.kernel.org/tip/8326f44da090d6d304d29b9fdc7fb3e20889e329
> > Author: Ingo Molnar
> > AuthorDate: Fri, 5 Jun 2009 20:22:46 +0200
> > Committer: Ingo Molnar
> > CommitDate: Sat, 6 Jun 2009 13:14:47 +0200
> >
> > perf_counter: Implement generalized cache event types
> >
> > Extend generic event enumeration with the PERF_TYPE_HW_CACHE
> > method.
> >
> > This is a 3-dimensional space:
> >
> >   { L1-D, L1-I, L2, ITLB, DTLB, BPU } x
> >   { load, store, prefetch } x
> >   { accesses, misses }
> >
> > User-space passes in the 3 coordinates and the kernel provides
> > a counter. (if the hardware supports that type and if the
> > combination makes sense.)
> >
> > Combinations that make no sense produce a -EINVAL.
> > Combinations that are not supported by the hardware produce -ENOTSUP.
> >
> > Extend the tools to deal with this, and rewrite the event symbol
> > parsing code with various popular aliases for the units and
> > access methods above. So 'l1-cache-miss' and 'l1d-read-ops' are
> > both valid aliases.
> >
> > ( x86 is supported for now, with the Nehalem event table filled in,
> >   and with Core2 and Atom having placeholder tables. )
> >
>
> > +++ b/include/linux/perf_counter.h
> > @@ -28,6 +28,7 @@ enum perf_event_types {
> >  	PERF_TYPE_HARDWARE = 0,
> >  	PERF_TYPE_SOFTWARE = 1,
> >  	PERF_TYPE_TRACEPOINT = 2,
> > +	PERF_TYPE_HW_CACHE = 3,
> >
> >  	/*
> >  	 * available TYPE space, raw is the max value.
> > @@ -56,6 +57,39 @@ enum attr_ids {
> >  };
> >
> >  /*
> > + * Generalized hardware cache counters:
> > + *
> > + *       { L1-D, L1-I, L2, LLC, ITLB, DTLB, BPU } x
> > + *       { read, write, prefetch } x
> > + *       { accesses, misses }
> > + */
> > +enum hw_cache_id {
> > +	PERF_COUNT_HW_CACHE_L1D,
> > +	PERF_COUNT_HW_CACHE_L1I,
> > +	PERF_COUNT_HW_CACHE_L2,
> > +	PERF_COUNT_HW_CACHE_DTLB,
> > +	PERF_COUNT_HW_CACHE_ITLB,
> > +	PERF_COUNT_HW_CACHE_BPU,
> > +
> > +	PERF_COUNT_HW_CACHE_MAX,
> > +};
> > +
> > +enum hw_cache_op_id {
> > +	PERF_COUNT_HW_CACHE_OP_READ,
> > +	PERF_COUNT_HW_CACHE_OP_WRITE,
> > +	PERF_COUNT_HW_CACHE_OP_PREFETCH,
> > +
> > +	PERF_COUNT_HW_CACHE_OP_MAX,
> > +};
> > +
> > +enum hw_cache_op_result_id {
> > +	PERF_COUNT_HW_CACHE_RESULT_ACCESS,
> > +	PERF_COUNT_HW_CACHE_RESULT_MISS,
> > +
> > +	PERF_COUNT_HW_CACHE_RESULT_MAX,
> > +};
>
> May I suggest we do the below instead? Some hardware doesn't make the
> read/write distinction and would therefore have an utterly empty table.
>
> Furthermore, also splitting the hit/miss into a bitfield allows us to
> have hit/miss and the combined value.
>
> ---
> diff --git a/include/linux/perf_counter.h b/include/linux/perf_counter.h
> index 3586df8..1fb72fc 100644
> --- a/include/linux/perf_counter.h
> +++ b/include/linux/perf_counter.h
> @@ -64,29 +64,32 @@ enum attr_ids {
>   *       { accesses, misses }
>   */
>  enum hw_cache_id {
> -	PERF_COUNT_HW_CACHE_L1D,
> -	PERF_COUNT_HW_CACHE_L1I,
> -	PERF_COUNT_HW_CACHE_L2,
> -	PERF_COUNT_HW_CACHE_DTLB,
> -	PERF_COUNT_HW_CACHE_ITLB,
> -	PERF_COUNT_HW_CACHE_BPU,
> +	PERF_COUNT_HW_CACHE_L1D = 0,
> +	PERF_COUNT_HW_CACHE_L1I = 1,
> +	PERF_COUNT_HW_CACHE_L2 = 2,
> +	PERF_COUNT_HW_CACHE_DTLB = 3,
> +	PERF_COUNT_HW_CACHE_ITLB = 4,
> +	PERF_COUNT_HW_CACHE_BPU = 5,

Could you please also rename 'L2' to LLC (last level cache)? We want
to know about the fastest and the 'largest' caches. Intermediate
caches are a lot less interesting in practice, and we don't really
want to enumerate a variable number of cache levels.

>  	PERF_COUNT_HW_CACHE_MAX,
>  };
>
>  enum hw_cache_op_id {
> -	PERF_COUNT_HW_CACHE_OP_READ,
> -	PERF_COUNT_HW_CACHE_OP_WRITE,
> -	PERF_COUNT_HW_CACHE_OP_PREFETCH,
> +	PERF_COUNT_HW_CACHE_OP_READ = 0x1,
> +	PERF_COUNT_HW_CACHE_OP_WRITE = 0x2,
> +	PERF_COUNT_HW_CACHE_OP_ACCESS = 0x3, /* either READ or WRITE */
> +	PERF_COUNT_HW_CACHE_OP_PREFETCH = 0x4, /* XXX should we qualify this with either READ/WRITE? */

Btw., could you please also rename the constants to LOAD/STORE?
That's the proper PMU terminology.

Prefetches are basically almost always reads. That comes from the
physical fact that they can be done speculatively without modifying
memory state. A 'speculative write', while possible in theory, would
have so many side effects, and would complicate the SMP caching
algorithm and an in-order execution model enormously, so I doubt it
will be done in any widespread way anytime soon.

Nevertheless, turning it into a bit does make sense, from an ABI
cleanliness POV.

>
> -	PERF_COUNT_HW_CACHE_OP_MAX,
> +
> +	PERF_COUNT_HW_CACHE_OP_MAX = 0x8,
>  };
>
>  enum hw_cache_op_result_id {
> -	PERF_COUNT_HW_CACHE_RESULT_ACCESS,
> -	PERF_COUNT_HW_CACHE_RESULT_MISS,
> +	PERF_COUNT_HW_CACHE_RESULT_HIT = 0x1,
> +	PERF_COUNT_HW_CACHE_RESULT_MISS = 0x2,
> +	PERF_COUNT_HW_CACHE_RESULT_SUM = 0x3,

RESULT_SUM sounds a bit weird - perhaps RESULT_ANY or RESULT_ALL?

	Ingo
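
For illustration only, here is a minimal user-space sketch of how the
bitmask encoding proposed in the quoted diff could be consumed. The enum
values are copied from that diff (with its READ/WRITE and SUM names left
unchanged). The helpers check_cache_masks(), op_includes() and
self_check() are hypothetical and are not part of any posted patch.
Rejecting an empty or out-of-range mask with -EINVAL is an assumption
modelled on the "combinations that make no sense" rule in the commit
message.

```c
#include <assert.h>
#include <errno.h>

/* Values copied from the proposed diff quoted in this thread. */
enum hw_cache_op_id {
	PERF_COUNT_HW_CACHE_OP_READ     = 0x1,
	PERF_COUNT_HW_CACHE_OP_WRITE    = 0x2,
	PERF_COUNT_HW_CACHE_OP_ACCESS   = 0x3, /* either READ or WRITE */
	PERF_COUNT_HW_CACHE_OP_PREFETCH = 0x4,

	PERF_COUNT_HW_CACHE_OP_MAX      = 0x8,
};

enum hw_cache_op_result_id {
	PERF_COUNT_HW_CACHE_RESULT_HIT  = 0x1,
	PERF_COUNT_HW_CACHE_RESULT_MISS = 0x2,
	PERF_COUNT_HW_CACHE_RESULT_SUM  = 0x3,
};

/*
 * Hypothetical helper (assumption, not from the patch): accept any
 * non-empty combination of the defined bits, reject everything else.
 */
int check_cache_masks(unsigned int op, unsigned int result)
{
	if (op == 0 || op >= PERF_COUNT_HW_CACHE_OP_MAX)
		return -EINVAL;
	if (result == 0 || result > PERF_COUNT_HW_CACHE_RESULT_SUM)
		return -EINVAL;
	return 0;
}

/* Hypothetical helper: does the mask request all bits in 'bits'? */
int op_includes(unsigned int mask, unsigned int bits)
{
	return (mask & bits) == bits;
}

/* Small usage example exercising the two helpers above. */
void self_check(void)
{
	/* ACCESS is just READ | WRITE, so hardware that cannot tell
	 * them apart only needs to fill in the combined entry. */
	assert(PERF_COUNT_HW_CACHE_OP_ACCESS ==
	       (PERF_COUNT_HW_CACHE_OP_READ | PERF_COUNT_HW_CACHE_OP_WRITE));
	assert(op_includes(PERF_COUNT_HW_CACHE_OP_ACCESS,
			   PERF_COUNT_HW_CACHE_OP_READ));
	assert(!op_includes(PERF_COUNT_HW_CACHE_OP_READ,
			    PERF_COUNT_HW_CACHE_OP_PREFETCH));

	/* HIT | MISS gives the combined value discussed above. */
	assert(check_cache_masks(PERF_COUNT_HW_CACHE_OP_ACCESS,
				 PERF_COUNT_HW_CACHE_RESULT_SUM) == 0);
	assert(check_cache_masks(0, PERF_COUNT_HW_CACHE_RESULT_HIT)
	       == -EINVAL);
}
```

The point of the bitmask form is that a combined request such as
"reads or writes, hits or misses" becomes a single OR of the individual
bits, which a plain enumeration cannot express.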