linux-kernel.vger.kernel.org archive mirror
* RE: [perfmon] Re: quick overview of the perfmon2 interface
@ 2006-01-20 18:37 Truong, Dan
  2006-01-20 22:22 ` Andrew Morton
  2006-01-25 20:33 ` Bryan O'Sullivan
  0 siblings, 2 replies; 20+ messages in thread
From: Truong, Dan @ 2006-01-20 18:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eranian, Stephane, perfmon, linux-ia64, linux-kernel, perfctr-devel

Would you want Stephane to guard the extended
functionalities with tunables or something to
disable their regular use and herd enterprise
tools into a standard mold... yet allow R&D to
move on by enabling the extensions?



Just crippling flexibility/cutting functionality
is like removing words from a dictionary to
prevent people from thinking differently.

It would restrict the R&D mindset, and new ideas.
The field hasn't grown yet to a stable mature form.
It is just beginning: profiling, monitoring, tuning,
compilers, JIT...

Flexibility is/was needed because:
- Tools need to port to Perfmon at minimal cost.
- It supports novel R&D ideas.
- It supports growth beyond just PMU data.
- It allows early data aggregation.
- It allows OS data to be correlated with PMU data.

What standardization adds:
- Coordinated access to PMU resources from all tools.
- All tools, formats, etc. plug into the same OS framework.
- The interface gets ported across multiple platforms.
- The functionality is rich for all (fast data transfers,
  multiplexing, system vs. thread, etc.)

Dan-

> -----Original Message-----
> From: perfmon-bounces@napali.hpl.hp.com
> [mailto:perfmon-bounces@napali.hpl.hp.com] On Behalf Of Andrew Morton
> Sent: Thursday, December 22, 2005 5:47 AM
> To: Truong, Dan
> Cc: Eranian, Stephane; perfmon@napali.hpl.hp.com;
> linux-ia64@vger.kernel.org; linux-kernel@vger.kernel.org;
> perfctr-devel@lists.sourceforge.net
> Subject: Re: [perfmon] Re: quick overview of the perfmon2 interface
> 
> "Truong, Dan" <dan.truong@hp.com> wrote:
> >
> > The PMU is becoming a standard commodity. Once Perfmon is
> > "the" Linux interface, all the tools can align on it and
> > coexist, push their R&D forward, and more importantly become
> > fully productized for businesses usage.
> >
> 
> The apparently-extreme flexibility of the perfmon interfaces would tend to
> militate against that, actually.  It'd become better productised if it had
> one interface and stuck to it.
> 
> (I haven't processed Stephane's reply yet - will get there)
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [perfmon] Re: quick overview of the perfmon2 interface
  2006-01-20 18:37 [perfmon] Re: quick overview of the perfmon2 interface Truong, Dan
@ 2006-01-20 22:22 ` Andrew Morton
  2006-01-25 20:33 ` Bryan O'Sullivan
  1 sibling, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2006-01-20 22:22 UTC (permalink / raw)
  To: Truong, Dan
  Cc: stephane.eranian, perfmon, linux-ia64, linux-kernel, perfctr-devel

"Truong, Dan" <dan.truong@hp.com> wrote:
>
> Would you want Stephane to guard the extended
> functionalities with tunables or something to
> disable their regular use and herd enterprise
> tools into a standard mold... yet allow R&D to
> move on by enabling the extensions?

argh.  I'd prefer to avoid one-month gaps in the conversation, so we don't
all forget what we were talking about.

Look, we just need to get these patches on the wire so we can all look at
them, see what they do, understand what decisions were taken and why.

The conciseness and completeness of those patches' covering descriptions
will be key to helping this process along.


* RE: [perfmon] Re: quick overview of the perfmon2 interface
  2006-01-20 18:37 [perfmon] Re: quick overview of the perfmon2 interface Truong, Dan
  2006-01-20 22:22 ` Andrew Morton
@ 2006-01-25 20:33 ` Bryan O'Sullivan
  2006-01-25 22:28   ` [Perfctr-devel] " Stephane Eranian
  1 sibling, 1 reply; 20+ messages in thread
From: Bryan O'Sullivan @ 2006-01-25 20:33 UTC (permalink / raw)
  To: Truong, Dan
  Cc: Andrew Morton, Eranian, Stephane, perfmon, linux-ia64,
	linux-kernel, perfctr-devel

On Fri, 2006-01-20 at 10:37 -0800, Truong, Dan wrote:
> Would you want Stephane to guard the extended
> functionalities with tunables or something to
> disable their regular use and herd enterprise
> tools into a standard mold... yet allow R&D to
> move on by enabling the extensions?

I'd prefer to see all of the extended stuff left out entirely for now.
The mainline kernel has no PMU support for any popular architecture,
even though external patches have existed in stable form for years.
Filling that gap ought to be the priority; the interface can be extended
when actual users of new features show up and ask for them.

> It would restrict the R&D mindset, and new ideas.
> The field hasn't grown yet to a stable mature form.

The place for flailing around with uncooked ideas is arguably not the
mainline kernel.

> Flexibility is/was needed because:
> - Tools need to port to Perfmon with min cost.
> - Ability to support novel R&D ideas.
> - Ability to support growth beyond just PMU data
> - Allows early data aggregation
> - Allow OS data correlated to PMU

Speculatively adding complicated and unused interfaces to the kernel in
the hope that some wild-eyed visionary might eventually up and use them
helps nobody.

	<b



* Re: [Perfctr-devel] RE: [perfmon] Re: quick overview of the perfmon2 interface
  2006-01-25 20:33 ` Bryan O'Sullivan
@ 2006-01-25 22:28   ` Stephane Eranian
  2006-01-25 22:46     ` Bryan O'Sullivan
       [not found]     ` <1138649612.4077.50.camel@localhost.localdomain>
  0 siblings, 2 replies; 20+ messages in thread
From: Stephane Eranian @ 2006-01-25 22:28 UTC (permalink / raw)
  To: Bryan O'Sullivan
  Cc: Truong, Dan, Andrew Morton, Eranian, Stephane, perfmon,
	linux-ia64, linux-kernel, perfctr-devel

Bryan,

On Wed, Jan 25, 2006 at 12:33:32PM -0800, Bryan O'Sullivan wrote:
> On Fri, 2006-01-20 at 10:37 -0800, Truong, Dan wrote:
> > Would you want Stephane to guard the extended
> > functionalities with tunables or something to
> > disable their regular use and herd enterprise
> > tools into a standard mold... yet allow R&D to
> > move on by enabling the extensions?
> 
> I'd prefer to see all of the extended stuff left out entirely for now.

I usually don't add things to the interface just because they are cool
ideas but rather because there is a need expressed by some tool
developer or system person. So it would help if you could
name the extended features you are referring to.

The problem with an incremental approach is maintaining backward compatibility
for existing applications. I have had to deal with this on IA-64, for instance
when moving from a single syscall to multiple syscalls. Similarly, when passing
data structures, you have to provision some reserved fields for potential
extensions. You don't really want to add another system call each time you
need to add a feature.
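
A minimal sketch of the reserved-field idea, for illustration only. The
names (demo_arg_v1, demo_arg_v2, demo_check_reserved) are hypothetical and
are not part of the perfmon2 interface; the point is only that a later
revision can carve a new field out of the padding without changing the
structure size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical version 1 of an argument structure: fixed-size fields
 * plus padding reserved for future extensions. */
struct demo_arg_v1 {
	uint16_t reg_num;
	uint16_t reg_set;
	uint32_t reg_flags;
	uint64_t reg_value;
	uint32_t reserved[8];	/* must be zero in version 1 */
};

/* Hypothetical later revision: a new 64-bit field takes the place of
 * reserved[0..1]; size and offsets of the old fields are unchanged. */
struct demo_arg_v2 {
	uint16_t reg_num;
	uint16_t reg_set;
	uint32_t reg_flags;
	uint64_t reg_value;
	uint64_t reg_reset;	/* new field */
	uint32_t reserved[6];
};

static_assert(sizeof(struct demo_arg_v1) == sizeof(struct demo_arg_v2),
	      "an extension must not change the structure size");

/* Returns 0 if all reserved words are zero, -1 otherwise. Rejecting
 * non-zero padding today is what makes it safe to assign meaning to
 * those words later. */
static int demo_check_reserved(const struct demo_arg_v1 *arg)
{
	size_t i;

	for (i = 0; i < sizeof(arg->reserved) / sizeof(arg->reserved[0]); i++)
		if (arg->reserved[i] != 0)
			return -1;
	return 0;
}
```

An old binary that zero-fills the structure keeps working against the
newer layout, because a zero in the new field can be treated as "feature
not requested".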

> The mainline kernel has no PMU support for any popular architecture,
> even though external patches have existed in stable form for years.

You do not count Oprofile. I think this is a fine tool. And perfmon
does allow it to continue working using almost all of its kernel code.
This is leveraging the custom sampling buffer format support in perfmon.
So you can say this is an extended feature that adds complexity.
But OTOH, this is one elegant way of supporting an existing interface
without breaking all the tools.

Take another example: suppose some tool comes along and says, "I would
like to add in each recorded sample the kernel call stack at the point
of the counter overflow". How would you do this without having to hack
kernel code? With the buffer format, you simply insert a module that
does what you want. There are hundreds of things you can include in your
samples. I don't think that we can come up with a very generic sampling
buffer format.
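
To make the idea concrete, here is a rough sketch of a pluggable format as
a table of callbacks. Everything in it (demo_smpl_fmt, demo_ovfl_info, the
handler signature) is an assumption made for illustration and should not
be read as the actual perfmon2 format API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one counter overflow. */
struct demo_ovfl_info {
	uint64_t ip;		/* interrupted instruction pointer */
	uint32_t pid;		/* process that was running */
	uint16_t ovfl_pmd;	/* which counter overflowed */
};

/* Hypothetical format descriptor: the core only knows this table. */
struct demo_smpl_fmt {
	const char *name;
	/* Writes one sample, returns bytes used or 0 if it does not fit. */
	size_t (*handler)(void *buf, size_t space,
			  const struct demo_ovfl_info *info);
};

/* Format 1: fixed entry with IP, pid and overflowed counter. */
struct demo_default_entry {
	uint64_t ip;
	uint32_t pid;
	uint32_t ovfl_pmd;
};

static size_t demo_default_handler(void *buf, size_t space,
				   const struct demo_ovfl_info *info)
{
	struct demo_default_entry e = { info->ip, info->pid, info->ovfl_pmd };

	if (space < sizeof(e))
		return 0;
	memcpy(buf, &e, sizeof(e));
	return sizeof(e);
}

/* Format 2: records only the IP, so more samples fit in the buffer. */
static size_t demo_ip_only_handler(void *buf, size_t space,
				   const struct demo_ovfl_info *info)
{
	if (space < sizeof(info->ip))
		return 0;
	memcpy(buf, &info->ip, sizeof(info->ip));
	return sizeof(info->ip);
}

static const struct demo_smpl_fmt demo_default_fmt = {
	"demo-default", demo_default_handler
};
static const struct demo_smpl_fmt demo_ip_only_fmt = {
	"demo-ip-only", demo_ip_only_handler
};

/* The core path is the same whichever format is attached. */
static size_t demo_record_sample(const struct demo_smpl_fmt *fmt, void *buf,
				 size_t space,
				 const struct demo_ovfl_info *info)
{
	return fmt->handler(buf, space, info);
}
```

A format that records a call stack would be a third handler along the
same lines; the core function would not change.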

Sometimes, it is not so much what is recorded but how it is recorded.
Some tools may prefer to have samples aggregated in the kernel; others
would like to use a double-buffer approach to minimize blind spots.
All are valid requests. Our infrastructure allows this without modification
to the core interface or core kernel code. I believe this is a very strong
value-add.

Without this infrastructure, it would have been pretty difficult to add
support for the P4 Precise Event Based Sampling (PEBS) which, by the way,
nobody was able to offer so far. We were able to provide this support
with a few hundred lines of code without hacking the regular sampling
format. Instead we simply created a dedicated PEBS format as a kernel module.


> Filling that gap ought to be the priority; the interface can be extended
> when actual users of new features show up and ask for them.
> 
Again, that is fine as long as you can keep backward compatibility and a clean
interface.

> > It would restrict the R&D mindset, and new ideas.
> > The field hasn't grown yet to a stable mature form.
> 
I would agree with you that people have not yet realized the potential
of those performance counters. But this may be in part a chicken-and-egg
problem.  People cannot take full advantage because they don't have
a generic interface on any platform.

Designing a generic perfmon interface is hard because:
	- the hardware is extremely diverse
	- there are so many things you can measure
-- 
-Stephane


* Re: [Perfctr-devel] RE: [perfmon] Re: quick overview of the perfmon2 interface
  2006-01-25 22:28   ` [Perfctr-devel] " Stephane Eranian
@ 2006-01-25 22:46     ` Bryan O'Sullivan
  2006-01-26  7:48       ` Stephane Eranian
       [not found]     ` <1138649612.4077.50.camel@localhost.localdomain>
  1 sibling, 1 reply; 20+ messages in thread
From: Bryan O'Sullivan @ 2006-01-25 22:46 UTC (permalink / raw)
  To: eranian
  Cc: perfctr-devel, linux-kernel, linux-ia64, perfmon, Eranian,
	Stephane, Andrew Morton, Truong, Dan

On Wed, 2006-01-25 at 14:28 -0800, Stephane Eranian wrote:

> So it would help if you could
> name the extended features you are referring to.

I'm dubious about the hands-off buffer format in general.  Does this
mean that userspace needs to modprobe a specific set of modules in order
to do normal sampling?  If so, how do you work around the need for users
to be root in order to use these interfaces?

> And perfmon
> does allow it to continue working using almost all of its kernel code.
> This is leveraging the custom sampling buffer format support in perfmon.
> So you can say this is an extended feature that adds complexity.
> But OTOH, this is one elegant way of supporting an existing interface
> without breaking all the tools.

So are you saying that part of the existing oprofile code can be deleted
if perfmon is merged, and that userspace won't notice?

> We were able to provide this support
> with a few hundred lines of code without hacking the regular sampling
> format. Instead we simply created a dedicated PEBS format as a kernel module.

Does this mean I can't sample the PMCs on a P4 if I don't have the
special PEBS module loaded?  Do I need to be root to do that?

	<b



* Re: [Perfctr-devel] RE: [perfmon] Re: quick overview of the perfmon2 interface
  2006-01-25 22:46     ` Bryan O'Sullivan
@ 2006-01-26  7:48       ` Stephane Eranian
  2006-01-26 18:26         ` Bryan O'Sullivan
  0 siblings, 1 reply; 20+ messages in thread
From: Stephane Eranian @ 2006-01-26  7:48 UTC (permalink / raw)
  To: Bryan O'Sullivan
  Cc: perfctr-devel, linux-kernel, linux-ia64, perfmon, Eranian,
	Stephane, Andrew Morton, Truong, Dan

Bryan,

On Wed, Jan 25, 2006 at 02:46:43PM -0800, Bryan O'Sullivan wrote:
> On Wed, 2006-01-25 at 14:28 -0800, Stephane Eranian wrote:
> 
> > So it would help if you could
> > name the extended features you are referring to.
> 
> I'm dubious about the hands-off buffer format in general.  Does this
> mean that userspace needs to modprobe a specific set of modules in order
> to do normal sampling?  If so, how do you work around the need for users
> to be root in order to use these interfaces?

As I said, there is a builtin default format that is fairly generic. It does
work for HP Caliper, pfmon, q-tools. I suspect it is good enough for VTUNE.

You need to be root to insert the module. But I believe that for many user
environments, this is more practical than having to recompile a custom kernel.
You can imagine the format being shipped with the tool, when the sysadmin
installs the tool it also installs the module.

> 
> > And perfmon
> > does allow it to continue working using almost all of its kernel code.
> > This is leveraging the custom sampling buffer format support in perfmon.
> > So you can say this is an extended feature that adds complexity.
> > But OTOH, this is one elegant way of supporting an existing interface
> > without breaking all the tools.
> 
> So are you saying that part of the existing oprofile code can be deleted
> if perfmon is merged, and that userspace won't notice?
> 
The part of Oprofile that does the actual programming of the PMU can be removed.
The part that stays is the one that deals with recording samples, exporting
samples, and collecting OS events such as exit, mmap, exec. At the user
level, the tools need to be migrated from the Oprofile way of programming counters
to the perfmon way. This was done many years ago on Itanium and did
not cause any major problems.

> > We were able to provide this support
> > with a few hundred lines of code without hacking the regular sampling
> > format. Instead we simply created a dedicated PEBS format as a kernel module.
> 
> Does this mean I can't sample the PMCs on a P4 if I don't have the
> special PEBS module loaded?  Do I need to be root to do that?

PEBS is a P4 feature that has two advantages:
	- it records the exact IP of where a counter overflows (no skid)
	- the CPU directly records the samples into a memory area designated
	  by the kernel. As such, you only get a PMU interrupt when that area
	  fills up.

There are some limitations:
	- you cannot sample on arbitrary events
	- the format of a sample is fixed, it does not contain extra PMDs, just
	  IP and some general registers. The process id is not recorded
	  so it is not well suited for system-wide monitoring.
	- it appears to be broken for HyperThreading setups.

So, it all depends on what you are after. Some people do care about avoiding
the skid of regular sampling and they like PEBS just for that. Others
would like to record a set of extra PMDs (PERFCTR) and they are willing to
compromise a bit on the skid of IP, so they can live with the default format.

-- 
-Stephane


* Re: [Perfctr-devel] RE: [perfmon] Re: quick overview of the perfmon2 interface
  2006-01-26  7:48       ` Stephane Eranian
@ 2006-01-26 18:26         ` Bryan O'Sullivan
  0 siblings, 0 replies; 20+ messages in thread
From: Bryan O'Sullivan @ 2006-01-26 18:26 UTC (permalink / raw)
  To: eranian
  Cc: perfctr-devel, linux-kernel, linux-ia64, perfmon, Eranian,
	Stephane, Andrew Morton, Truong, Dan

On Wed, 2006-01-25 at 23:48 -0800, Stephane Eranian wrote:

> You need to be root to insert the module. But I believe that for many user
> environments, this is more practical than having to recompile a custom kernel.

Clearly.

> You can imagine the format being shipped with the tool, when the sysadmin
> installs the tool it also installs the module.

In that case, you need some kind of per-distro cruft to make sure the
module gets loaded at every boot, or a setuid program that can install
the module, right?  Neither of these approaches works well in a cluster
environment where you're running your tools from a shared directory.

I'd really like the default mode of operation for users to not require
root privileges to get at normal functionality.  This is something
perfctr makes possible, for example.

	<b



* perfmon2 code review: 32-bit ABI on 64-bit OS
       [not found]           ` <1139245253.27739.8.camel@camp4.serpentine.com>
@ 2006-02-10 15:36             ` Stephane Eranian
  2006-02-10 18:27               ` Bryan O'Sullivan
  0 siblings, 1 reply; 20+ messages in thread
From: Stephane Eranian @ 2006-02-10 15:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: perfmon, perfctr-devel, linux-ia64

Hello,

Here is another topic of the interface I'd like to see discussed on this list
because it was raised in the past and I am not necessarily satisfied with
the current solution.

Tools can program the PMU by passing structures to read/write the PMC/PMD
registers. Some of those structures contain bitmasks. For instance,
when sampling, one can load a PMD (counter) value and indicate which
other PMD registers must be included in the samples using the bitmask.

The interface supports the same fixed number of PMD registers on
all platforms. As such the number of bits in the bitmask is fixed
and we have arranged for it to be a multiple of 4 and 8. The structure
looks as follows:

typedef struct {
        u16 reg_num;            /* which register */
        u16 reg_set;            /* event set for this register */
        pfm_flags_t reg_flags;  /* input: flags, return: reg error */
        u64 reg_value;          /* initial pmc/pmd value */
        u64 reg_long_reset;     /* value to reload after notification */
        u64 reg_short_reset;    /* reset after counter overflow */
        u64 reg_last_reset_val; /* return: PMD last reset value */
        u64 reg_ovfl_switch_cnt;/* #overflows before switch */
        unsigned long reg_reset_pmds[PFM_PMD_BV]; /* reset on overflow */
        unsigned long reg_smpl_pmds[PFM_PMD_BV];  /* record in sample */
        u64 reg_smpl_eventid;   /* opaque event identifier */
        u64 reg_random_mask;    /* bitmask used to limit random value */
        u32 reg_random_seed;    /* seed for randomization */
        u32 reg_reserved2[7];   /* for future use */
} pfarg_pmd_t;

We use unsigned long for the bitmasks because we can leverage the
kernel bitmap.h interface, which is nice. All the other fields
in the struct have a fixed size. If it were not for the bitmasks,
the structure would be identical in 32-bit and 64-bit mode.

Why does it matter?

Many 64-bit Linux kernels do support running native 32-bit applications.
That is the case on PPC64, MIPS64K, X86-64, for instance. One could well
write a 32-bit monitoring tool on top of a 64-bit OS. If we want to avoid
the emulation trampoline in the kernel, we need to ensure that the 32-bit
applications use the same structure layout as the 64-bit OS.

In our particular case we rely on the fact that the number of bits is fixed.
We use 320 bits, which is either 10x32 bits or 5x64 bits. Either way,
the sizeof() is the same. As such the struct is identical. Internally,
the kernel will take the bitmask as u64. The only issue would be the
alignment of the bitmasks. The 32-bit version must be aligned on a 64-bit
boundary. Structures are aligned on the size of their largest member,
here 64-bit. The layout is such that the bitmasks are always aligned.
We have verified that this does indeed work on MIPS with Phil Mucci.
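
The size argument can be checked at compile time. The sketch below models
the two views of a cut-down structure with fixed-width types (uint32_t
standing in for a 32-bit unsigned long, uint64_t for a 64-bit one). The
struct names and the reduced field list are assumptions for illustration;
the check covers sizes and offsets only, not the ordering of bits within
the words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PMD_BITS 320	/* 10 x 32 bits or 5 x 64 bits */

/* What a 32-bit tool would see: unsigned long is 32 bits wide. */
struct demo_pmd_ilp32 {
	uint64_t reg_value;
	uint32_t reg_smpl_pmds[DEMO_PMD_BITS / 32];
	uint64_t reg_smpl_eventid;
};

/* What the 64-bit kernel would see: unsigned long is 64 bits wide. */
struct demo_pmd_lp64 {
	uint64_t reg_value;
	uint64_t reg_smpl_pmds[DEMO_PMD_BITS / 64];
	uint64_t reg_smpl_eventid;
};

/* Both bitmask arrays occupy 40 bytes, so the structures match. */
static_assert(sizeof(struct demo_pmd_ilp32) == sizeof(struct demo_pmd_lp64),
	      "32-bit and 64-bit views must have the same size");
static_assert(offsetof(struct demo_pmd_ilp32, reg_smpl_pmds) ==
	      offsetof(struct demo_pmd_lp64, reg_smpl_pmds),
	      "bitmask must start at the same offset");
static_assert(offsetof(struct demo_pmd_ilp32, reg_smpl_eventid) ==
	      offsetof(struct demo_pmd_lp64, reg_smpl_eventid),
	      "field after the bitmask must start at the same offset");
```

If the bit count were not a multiple of 64 (say 96 bits), the two arrays
would be 12 and 16 bytes and the assertions would fail.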

The alternative approach would be to hardcode the bitmasks to be u64.
But that would require extra casting in the kernel to get to the
bitmap interface and may cause extra overhead in 32-bit user programs.
Yet I don't anticipate that programming those bitmasks will be in the critical
path of monitoring tools.

This alternative looks cleaner and safer in a sense, and I am certainly
willing to make the change, but I would like to get everybody else's opinion
as well.
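
For a sense of what the u64 variant would mean for a user-level tool, here
is a small sketch of bit helpers over an array of 64-bit words. The helper
names (demo_bv_set, demo_bv_test, demo_bv_weight) are made up for this
example; they are not libpfm or kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PMD_BITS 320
#define DEMO_PMD_BV   (DEMO_PMD_BITS / 64)	/* 5 words of 64 bits */

/* Mark PMD number 'bit' in the bitmask. */
static void demo_bv_set(uint64_t *bv, unsigned int bit)
{
	bv[bit / 64] |= (uint64_t)1 << (bit % 64);
}

/* Returns 1 if PMD number 'bit' is marked, 0 otherwise. */
static int demo_bv_test(const uint64_t *bv, unsigned int bit)
{
	return (int)((bv[bit / 64] >> (bit % 64)) & 1);
}

/* Counts how many PMDs are marked. */
static unsigned int demo_bv_weight(const uint64_t *bv)
{
	unsigned int i, n = 0;

	for (i = 0; i < DEMO_PMD_BITS; i++)
		n += (unsigned int)demo_bv_test(bv, i);
	return n;
}
```

The same source gives the same layout whether it is compiled as a 32-bit
or a 64-bit program, since no type in it depends on the width of
unsigned long.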

Note that there are similar issues with the remapped sampling buffer.
There, you need to explicitly compile your tool with a special option
to force certain types to be 64-bit (size_t, void *). Unless someone
tells me how to tell the compiler we're compiling 32-bit code to execute
on a 64-bit OS ABI.

Thanks.



* Re: perfmon2 code review: 32-bit ABI on 64-bit OS
  2006-02-10 15:36             ` perfmon2 code review: 32-bit ABI on 64-bit OS Stephane Eranian
@ 2006-02-10 18:27               ` Bryan O'Sullivan
       [not found]                 ` <1139681785.4316.33.camel@localhost.localdomain>
  2006-02-13 20:34                 ` Stephane Eranian
  0 siblings, 2 replies; 20+ messages in thread
From: Bryan O'Sullivan @ 2006-02-10 18:27 UTC (permalink / raw)
  To: eranian; +Cc: linux-kernel, perfmon, perfctr-devel, linux-ia64

On Fri, 2006-02-10 at 07:36 -0800, Stephane Eranian wrote:

> Many 64-bit Linux kernels do support running native 32-bit applications.
> That is the case on PPC64, MIPS64K, X86-64, for instance.

And sparc64 and s390.

>  One could well
> write a 32-bit monitoring tool on top of a 64-bit OS.

On some 64-bit arches (e.g. x86_64), most userspace code is 64-bit,
while on others (e.g. powerpc), most is 32-bit.  Reducing the number of
things that a userspace tool or library writer can trip over seems like
a good thing here, even if it slightly complicates perfmon's internals.

> Note that there are similar issues with the remapped sampling buffer.
> There, you need to explicitly compile your tool with a special option
> to force certain types to be 64-bit (size_t, void *).

It's pretty normal to just use 64-bit quantities in these cases, and
cast appropriately.

	<b



* Re: [perfmon] perfmon2 code review: 32-bit ABI on 64-bit OS
       [not found]                 ` <1139681785.4316.33.camel@localhost.localdomain>
@ 2006-02-11 22:33                   ` Stephane Eranian
  2006-02-12 23:46                     ` [Perfctr-devel] " David Gibson
       [not found]                     ` <1139857076.4342.10.camel@localhost.localdomain>
  0 siblings, 2 replies; 20+ messages in thread
From: Stephane Eranian @ 2006-02-11 22:33 UTC (permalink / raw)
  To: Philip Mucci; +Cc: Bryan O'Sullivan, perfmon, perfctr-devel, linux-kernel

Hello,

On Sun, Feb 12, 2006 at 12:16:25AM +0600, Philip Mucci wrote:
> > 
> > On some 64-bit arches (e.g. x86_64), most userspace code is 64-bit,
> > while on others (e.g. powerpc), most is 32-bit.  Reducing the number of
> > things that a userspace tool or library writer can trip over seems like
> > a good thing here, even if it slightly complicates perfmon's internals.
> > 
> > > Note that there are similar issues with the remapped sampling buffer.
> > > There, you need to explicitly compile your tool with a special option
> > > to force certain types to be 64-bit (size_t, void *).
> > 
> > It's pretty normal to just use 64-bit quantities in these cases, and
> > cast appropriately.
> 
> I agree with Bryan. Stephane, do you have any quantitative data for how
> much more expensive going to 64 bit quantities would be? Which
> performance critical operations access this structure? AFAIK, any
> performance monitoring system call is already slow by nature...and thus
> an additional dozen cycles isn't going to make a difference. Of course,
> if this structure needs to be read/written by get_pmd, including the
> userspace version (+ mmap offset), then the extra overhead should be
> considered. 
> 
I think I can easily convert the bitmasks to be u64 on all platforms.
I don't think it will negatively impact performance on 32-bit applications.

The sampling buffer is another matter. It is directly remapped. The default
format exposes size_t and void *. The size_t is not on the critical
path; it is used to specify the buffer size. If we expose it as 64-bit,
we need to check on a 32-bit system that the value is below 4GB and cast
to size_t.
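
A minimal sketch of that check, assuming a hypothetical helper name
(demo_buf_size_checked) and an explicit limit parameter so the 32-bit case
can be exercised on any host:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Converts a buffer size passed as a 64-bit quantity into a size_t.
 * 'limit' is the largest size the system can accept, e.g. UINT32_MAX
 * on a 32-bit system. Returns 0 on success and stores the converted
 * value in *out; returns -1 if the request does not fit.
 */
static int demo_buf_size_checked(uint64_t requested, uint64_t limit,
				 size_t *out)
{
	if (requested > limit || requested > (uint64_t)SIZE_MAX)
		return -1;
	*out = (size_t)requested;
	return 0;
}
```

The cast happens only after the range check, so a 32-bit system never
silently truncates a request of 4GB or more.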

The most challenging piece is the IP (program pointer) that is in every
sample. Today it is defined as unsigned long because this is fairly
natural for a code address. The 64-bit OS captures addresses as 64-bit,
the 32-bit monitoring tool running on top has to consume them as 64-bit
addresses, so u64 would be fine.

But on a 32-bit kernel with a 32-bit tool, addresses exported as u64
would certainly work but would consume double the buffer space, and that is a
more serious issue in my mind.

What we need is:
	1/ 32-bit OS: IP is 32-bit in the sampling buffer
	2/ 64-bit OS: IP is 64-bit in the sampling buffer

Because of 32-bit ABI tools running on 2/, the IP would have
to be defined as u64. But then it would be overkill on 1/.

The problem is in the user level header file for the sampling buffer.
We would need a data type that is 64-bit for IP if the host OS is 64-bit
(regardless of the ABI used by the tool, i.e., the compiler). And a data
type that is 32-bit on 32-bit OS. The problem is that there is no compiler
flag or header flag somewhere that could guide the compiler. In the case
of MIPS, we have defined a libpfm compile flag that indicates we want
the 64-bit OS definition when compiling for a 32-bit application.
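
A sketch of what such a header switch could look like. DEMO_PFM_ABI64 is a
made-up macro name standing in for the flag mentioned above (the real
libpfm flag is not named here), and demo_ip_t and demo_sample are likewise
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * DEMO_PFM_ABI64 would be passed on the command line (-DDEMO_PFM_ABI64)
 * when building a 32-bit tool that runs on a 64-bit kernel. A native
 * 64-bit build gets the 64-bit definition without any flag.
 */
#if defined(DEMO_PFM_ABI64) || (UINTPTR_MAX > 0xffffffffu)
typedef uint64_t demo_ip_t;	/* 64-bit OS: IP is 64-bit in the buffer */
#else
typedef uint32_t demo_ip_t;	/* 32-bit OS: IP is 32-bit in the buffer */
#endif

/* Hypothetical sample entry as laid out in the remapped buffer. */
struct demo_sample {
	demo_ip_t ip;
	uint32_t  pid;
	uint32_t  ovfl_pmd;
};

/* Size of the IP field as seen by this build of the tool. */
static size_t demo_ip_size(void)
{
	return sizeof(demo_ip_t);
}
```

The weakness is visible in the sketch: nothing in the compiler's own
environment tells a 32-bit build which kernel it will run on, so the
choice has to come from the person building the tool.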

-- 
-Stephane


* Re: [Perfctr-devel] Re: [perfmon] perfmon2 code review: 32-bit ABI on 64-bit OS
  2006-02-11 22:33                   ` [perfmon] " Stephane Eranian
@ 2006-02-12 23:46                     ` David Gibson
  2006-02-13  0:03                       ` Eric Gouriou
       [not found]                     ` <1139857076.4342.10.camel@localhost.localdomain>
  1 sibling, 1 reply; 20+ messages in thread
From: David Gibson @ 2006-02-12 23:46 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Philip Mucci, Bryan O'Sullivan, perfmon, perfctr-devel, linux-kernel

On Sat, Feb 11, 2006 at 02:33:54PM -0800, Stephane Eranian wrote:
> Hello,
> 
> On Sun, Feb 12, 2006 at 12:16:25AM +0600, Philip Mucci wrote:
> > > 
> > > On some 64-bit arches (e.g. x86_64), most userspace code is 64-bit,
> > > while on others (e.g. powerpc), most is 32-bit.  Reducing the number of
> > > things that a userspace tool or library writer can trip over seems like
> > > a good thing here, even if it slightly complicates perfmon's internals.
> > > 
> > > > Note that there are similar issues with the remapped sampling buffer.
> > > > There, you need to explicitly compile your tool with a special option
> > > > to force certain types to be 64-bit (size_t, void *).
> > > 
> > > It's pretty normal to just use 64-bit quantities in these cases, and
> > > cast appropriately.
> > 
> > I agree with Bryan. Stephane, do you have any quantitative data for how
> > much more expensive going to 64 bit quantities would be? Which
> > performance critical operations access this structure? AFAIK, any
> > performance monitoring system call is already slow by nature...and thus
> > an additional dozen cycles isn't going to make a difference. Of course,
> > if this structure needs to be read/written by get_pmd, including the
> > userspace version (+ mmap offset), then the extra overhead should be
> > considered. 
> > 
> I think I can easily convert the bitmasks to be u64 on all platforms.
> I don't think it will negatively impact performance on 32-bit applications.
> 
> The sampling buffer is another matter. It is directly remapped. The default
> format, exposes size_t and void *. The size_t is not on the critical
> path, it is used to specify the buffer size. If we expose as 64-bit,
> we need to check on 32-bit system that the value is below 4GB and cast
> to size_t.
> 
> The most challenging piece is the IP (program pointer) that is in every
> sample. Today it is defined as unsigned long because this is fairly
> natural for a code address. The 64bit OS captures addresses as 64-bit,
> the 32-bit monitoring tool running on top has to consume them as 64-bit
> addresses, so u64 would be fine. 
> 
> But on a 32-bit kernel with a 32-bit tool, addresses exported as u64
> would certainly work but would consume double the buffer space, and that is a
> more serious issue in my mind.

Hmm.. does the sampling buffer collect only userspace PC values, or
kernel ones as well?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Perfctr-devel] Re: [perfmon] perfmon2 code review: 32-bit ABI on 64-bit OS
  2006-02-12 23:46                     ` [Perfctr-devel] " David Gibson
@ 2006-02-13  0:03                       ` Eric Gouriou
  2006-02-13 20:31                         ` Stephane Eranian
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Gouriou @ 2006-02-13  0:03 UTC (permalink / raw)
  To: David Gibson, Stephane Eranian, Philip Mucci,
	Bryan O'Sullivan, perfmon, perfctr-devel, linux-kernel

David Gibson wrote:
> On Sat, Feb 11, 2006 at 02:33:54PM -0800, Stephane Eranian wrote:
[...]
>> The most challenging piece is the IP (program pointer) that is in every
>> sample. Today it is defined as unsigned long because this is fairly
>> natural for a code address. The 64bit OS captures addresses as 64-bit,
>> the 32-bit monitoring tool running on top has to consume them as 64-bit
>> addresses, so u64 would be fine. 
>>
>> But on a 32-bit kernel with a 32-bit tool, addresses exported as u64
>> would certainly work but would consume double the buffer space, and that is a
>> more serious issue in my mind.
> 
> Hmm.. does the sampling buffer collect only userspace PC values, or
> kernel ones as well?

  Either, or both, depending on the measurement settings.

  I live in a 64-bit world, so my take on this issue would be to expose
the PC as a uint64_t, always. There is already so much overhead in the
default per-sample header that I wouldn't worry about it.

  Now 64 bit might not always be enough. E.g., on PA-RISC. But _I_ do
not care much about Linux on PA.

   Eric

-- 
Eric Gouriou                                         eric.gouriou@hp.com


* Re: [perfmon] perfmon2 code review: 32-bit ABI on 64-bit OS
  2006-02-13  0:03                       ` Eric Gouriou
@ 2006-02-13 20:31                         ` Stephane Eranian
  0 siblings, 0 replies; 20+ messages in thread
From: Stephane Eranian @ 2006-02-13 20:31 UTC (permalink / raw)
  To: Eric Gouriou
  Cc: David Gibson, Bryan O'Sullivan, perfmon, perfctr-devel, linux-kernel

David,

On Sun, Feb 12, 2006 at 04:03:44PM -0800, Eric Gouriou wrote:
> David Gibson wrote:
> >On Sat, Feb 11, 2006 at 02:33:54PM -0800, Stephane Eranian wrote:
> [...]
> >>The most challenging piece is the IP (program pointer) that is in every
> >>sample. Today it is defined as unsigned long because this is fairly
> >>natural for a code address. The 64bit OS captures addresses as 64-bit,
> >>the 32-bit monitoring tool running on top has to consume them as 64-bit
> >>addresses, so u64 would be fine. 
> >>
> >>But on a 32-bit kernel with a 32-bit tool, addresses exported as u64
> >>would certainly work but would consume double the buffer space, and that is a
> >>more serious issue in my mind.
> >
> >Hmm.. does the sampling buffer collect only userspace PC values, or
> >kernel ones as well?
> 
>  Either, or both, depending on the measurement settings.
> 
>  I live in a 64-bit world, so my take on this issue would be to expose
> the PC as a uint64_t, always. There is already so much overhead in the
> default per-sample header that I wouldn't worry about it.
> 
Eric is right, on many architectures, incl. PPC64 I am sure, you can easily
configure a counter to measure at any privilege level, including the kernel level.
As such a 32-bit monitoring tool could see 64-bit generated samples. Similarly,
I don't think it would be unreasonable to have a 32-bit tool monitor 64-bit
applications.

The question is whether hardcoding the IP to always be u64 is a valid choice.
Eric's comment about overhead is based on the current default sampling format
which systematically adds a fixed size header to each sample. That header
contains the IP. So adding 4 bytes to this header is not a big deal.

However, because we can define virtual PMDs that map to software resources, it
is likely that the default format will evolve to allow an application to specify
everything it needs for each sample. For instance, you can have a PMD
that maps to the current PID, another one that maps to the interrupt IP. Then
you can choose to include those in the sample and you would not need a fixed size
header anymore.
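
As a rough illustration of such a header-less sample, here is a minimal
sketch. It assumes a 64-bit mask in which bit i selects PMD i and one 64-bit
slot per selected PMD. The names sample_size() and build_sample() are
hypothetical and are not part of the perfmon2 interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: bit i set means "record PMD i in every sample". */
typedef uint64_t smpl_pmd_mask_t;

/* Size in bytes of one sample: one 64-bit slot per selected PMD. */
size_t sample_size(smpl_pmd_mask_t mask)
{
	size_t n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n * sizeof(uint64_t);
}

/*
 * Copy the selected PMD values into out, in ascending PMD index order.
 * Returns the number of 64-bit slots written.
 */
size_t build_sample(smpl_pmd_mask_t mask, const uint64_t pmds[64],
		    uint64_t *out)
{
	size_t n = 0;
	unsigned int i;

	assert(pmds != NULL && out != NULL);

	for (i = 0; i < 64; i++)
		if (mask & (1ULL << i))
			out[n++] = pmds[i];
	return n;
}
```

With PMD0 mapped to the interrupt IP and PMD5 to the PID, for example, a mask
of 0x21 would give a 16-byte sample holding just those two values.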

-- 
-Stephane


* Re: perfmon2 code review: 32-bit ABI on 64-bit OS
  2006-02-10 18:27               ` Bryan O'Sullivan
       [not found]                 ` <1139681785.4316.33.camel@localhost.localdomain>
@ 2006-02-13 20:34                 ` Stephane Eranian
  1 sibling, 0 replies; 20+ messages in thread
From: Stephane Eranian @ 2006-02-13 20:34 UTC (permalink / raw)
  To: Bryan O'Sullivan; +Cc: linux-kernel, perfmon, perfctr-devel, linux-ia64

Bryan,

On Fri, Feb 10, 2006 at 10:27:02AM -0800, Bryan O'Sullivan wrote:
> 
> On some 64-bit arches (e.g. x86_64), most userspace code is 64-bit,
> while on others (e.g. powerpc), most is 32-bit.  Reducing the number of
> things that a userspace tool or library writer can trip over seems like
> a good thing here, even if it slightly complicates perfmon's internals.
> 
> > Note that there are similar issues with the remapped sampling buffer.
> > There, you need to explicitly compile your tool with a special option
> > to force certain types to be 64-bit (size_t, void *).
> 
> It's pretty normal to just use 64-bit quantities in these cases, and
> cast appropriately.
> 

So if I understand you correctly, you are saying it is best to have bitmasks
hardcoded to u64 and have the kernel cast to match the bitmap_*() interface.
This would not cause any alignment problems on either 32-bit or 64-bit systems.
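
A minimal user-level sketch of a bitmask hardcoded to u64 could look like the
following. The size and the ex_bv_*() names are assumptions made for
illustration only; they are not the perfmon2 definitions.

```c
#include <assert.h>
#include <stdint.h>

#define EX_MAX_REGS	256			/* hypothetical register count */
#define EX_BV_WORDS	(EX_MAX_REGS / 64)	/* number of u64 words */

/* Set bit 'bit' in a bitmask stored as an array of 64-bit words. */
void ex_bv_set(uint64_t *bv, unsigned int bit)
{
	assert(bit < EX_MAX_REGS);
	bv[bit / 64] |= 1ULL << (bit % 64);
}

/* Return 1 if bit 'bit' is set, 0 otherwise. */
int ex_bv_test(const uint64_t *bv, unsigned int bit)
{
	assert(bit < EX_MAX_REGS);
	return (int)((bv[bit / 64] >> (bit % 64)) & 1);
}
```

Because every word is 64-bit, sizeof the mask is the same for an ILP32 and an
LP64 tool. One caveat on the kernel side: bitmap_*() works on arrays of
unsigned long, so on a 32-bit big-endian kernel a plain cast of a u64 array
would see the two 32-bit halves of each word in swapped order.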


-- 

-Stephane


* Re: [Perfctr-devel] Re: [perfmon] perfmon2 code review: 32-bit ABI on 64-bit OS
       [not found]                     ` <1139857076.4342.10.camel@localhost.localdomain>
@ 2006-02-14 23:41                       ` Stephane Eranian
  2006-02-20 17:54                       ` Stephane Eranian
  1 sibling, 0 replies; 20+ messages in thread
From: Stephane Eranian @ 2006-02-14 23:41 UTC (permalink / raw)
  To: Philip Mucci; +Cc: perfmon, Bryan O'Sullivan, perfctr-devel, linux-kernel

Phil,

On Tue, Feb 14, 2006 at 12:57:55AM +0600, Philip Mucci wrote:
> Stefane,
> 
> I know this is ugly, but what about in the user code checking the arch?
> Is it not true that if a kernel is running 64 bit, it has the 64 tagged
> on the the end of the arch? i.e. mips64, ppc64, x86_64?
> 
> The other solution would be to add am information call to the API, like
> perfctr has. This would export the processor type along with other
> feature bits, including the number of bits of the IP. 
> 

The problem is at compile time and not so much at runtime.
Take a kernel struct that is shared with user (i.e., passed through syscall)
that has the following layout:

	struct foo {
		unsigned long bar;
		int dummy;
	};

For a 64-bit app on a 64-bit OS OR a 32-bit app on a 32-bit OS, this works
perfectly.

For a 32-bit app on a 64-bit OS, there is a problem: the 32-bit app must be compiled
with the following definition instead:
	struct foo {
		unsigned long long bar;
		int dummy;
	};

to share the struct with the 64-bit OS. An application compiled with the above
struct would not work when run on a 32-bit OS.

So I think that we need to replace all unsigned long, size_t, void * by uint64_t
to make sure this works either way. It is overkill on pure 32-bit but ensures
that the application can be migrated over to a 64-bit OS without the need for
special recompilation.
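
As a sketch of what the fixed-size variant of struct foo could look like (the
name foo_fixed and the explicit pad field are assumptions added here): some
32-bit ABIs, i386 for instance, align a 64-bit integer inside a struct on only
4 bytes, so the trailing padding is spelled out to keep the size identical
everywhere.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof */
#include <stdint.h>

struct foo_fixed {
	uint64_t bar;	/* was: unsigned long */
	uint32_t dummy;	/* was: int */
	uint32_t pad;	/* explicit padding, assumption for illustration */
};

/* The layout is the same whether compiled as ILP32 or LP64. */
static_assert(sizeof(struct foo_fixed) == 16, "size must not depend on ABI");
static_assert(offsetof(struct foo_fixed, bar) == 0, "bar at offset 0");
static_assert(offsetof(struct foo_fixed, dummy) == 8, "dummy at offset 8");
static_assert(offsetof(struct foo_fixed, pad) == 12, "pad at offset 12");
```

The static assertions fail at compile time if a given compiler or ABI lays the
struct out differently, which is cheaper than discovering it at runtime.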

-- 
-Stephane


* Re: perfmon2 code review: 32-bit ABI on 64-bit OS
       [not found]                     ` <1139857076.4342.10.camel@localhost.localdomain>
  2006-02-14 23:41                       ` [Perfctr-devel] " Stephane Eranian
@ 2006-02-20 17:54                       ` Stephane Eranian
  1 sibling, 0 replies; 20+ messages in thread
From: Stephane Eranian @ 2006-02-20 17:54 UTC (permalink / raw)
  To: Philip Mucci; +Cc: Bryan O'Sullivan, perfmon, perfctr-devel, linux-kernel

Hello,

So I worked some more on these 32bit vs 64bit ABI issues. I came
to the conclusion that the only way to make this work cleanly
without any special compilation flags or runtime support is
to simply ensure that all data structures exchanged/shared between
the monitoring tool and the kernel use only fixed size types.
This way, the struct sizes and field offsets remain constant
in ILP32 or LP64 mode. As a consequence I now have:
	- all bitmasks use u64
	- all size_t are replaced with u64
	- the default format IP (instruction pointer) is u64

For the IP, the most significant bits contain all zeroes in
ILP32 mode. With the default sampling buffer format, you add
4 bytes to the sample header but then you have nothing special
to do with user level code. I think this is also in line with
the fact that 64-bit computing is becoming widely available
for desktop and laptop computers.
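
For a 32-bit tool that wants to store the IP in a native pointer-sized
variable, a defensive narrowing step could be sketched as follows. The helper
name ex_ip_to_u32() is hypothetical. The check matters when the same binary
runs on a 64-bit OS and the sample was taken in 64-bit code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Narrow a 64-bit sampled IP to 32 bits.
 * Returns 1 and stores the address in *out if the upper 32 bits are zero,
 * returns 0 (leaving *out untouched) otherwise.
 */
int ex_ip_to_u32(uint64_t ip, uint32_t *out)
{
	assert(out != NULL);

	if (ip >> 32)
		return 0;	/* 64-bit address, cannot be narrowed */
	*out = (uint32_t)ip;
	return 1;
}
```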

As an example, I can now compile a tool for P4 in 32-bit and
run it on my 32-bit Xeon. But I can also run the same *binary*
on my EM64T and it works as expected including for the sampling
buffer. I believe the same would be possible with Opteron, although
I have not tried.

I hope this closes another sticking point about the interface.


On Tue, Feb 14, 2006 at 12:57:55AM +0600, Philip Mucci wrote:
> 
> I know this is ugly, but what about in the user code checking the arch?
> Is it not true that if a kernel is running 64 bit, it has the 64 tagged
> on the the end of the arch? i.e. mips64, ppc64, x86_64?
> 
> The other solution would be to add am information call to the API, like
> perfctr has. This would export the processor type along with other
> feature bits, including the number of bits of the IP. 
> 
> Either of these combined with compile time ABI/bitness constants should
> cover the cases no?
> 
> Phil
> 
> 
> > The problem is in the user level header file for the sampling buffer.
> > We would need a data type that is 64-bit for IP if the host OS is 64-bit
> > (regardless of the ABI used by the tool, i.e., the compiler). And a data
> > type that is 32-bit on 32-bit OS. The problem is that there is no compiler
> > flag or header flag somewhere that could guide the compiler. In the case
> > of MIPS, we have defined a libpfm compile flags that indicates we want
> > the 64-bit OS definition when compiling for a 32-bit application.
> > 
> 
> 
> 

-- 

-Stephane


* Re: [perfmon] Re: quick overview of the perfmon2 interface
  2005-12-20 10:51 ` Andrew Morton
@ 2005-12-22 18:48   ` Stephane Eranian
  0 siblings, 0 replies; 20+ messages in thread
From: Stephane Eranian @ 2005-12-22 18:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: perfmon, linux-ia64, linux-kernel, perfctr-devel

Andrew,

> > 6/ PMU DESCRIPTION MODULES
> >    -----------------------
> > 
> > 	The logical PMU is driven by a PMU description table. The table
> > 	is implemented by a kernel pluggable module. As such, it can be
> > 	updated at will without recompiling the kernel, waiting for the next
> > 	release of a Linux kernel or distribution, and without rebooting the
> > 	machine as long as the PMU model belongs to the same PMU family. For
> > 	instance, for the Itanium Processor Family, the architecture specifies
> > 	the framework for the PMU. Thus the Itanium PMU specific code is common
> > 	across all processor implementations. This is not the case for IA-32.
> 
> I think the usefulness of this needs justification.  CPUs are updated all
> the time, and we release new kernels all the time to exploit the new CPU
> features.  What's so special about performance counters that they need such
> special treatment?
> 
Given the discussion we are having, I thought it would be useful to take
a concrete example to try and clarify what I am talking about here. I chose
to use the PMU description module/table of the Pentium M because this is
a very common platform supported by all interfaces. The actual module contains
the following (arch/i386/perfmon/perfmon_pm.c) information:

	- description of the PMU registers: where they are, their type
	- a callback for an optional PMC write checker
	- a probe routine (not shown)
	- a module_init/module_exit pair (not shown)

Let's look at the information in more detail:

The first piece of information is an architecture specific structure
used by the architecture specific code (arch/i386/perfmon/perfmon.c).
It contains the information about the MSR addresses for each register
that we want to access. Let's look at PMC0:

	{{MSR_P6_EVNTSEL0, 0}, 0, PFM_REGT_PERFSEL},

   - field 0=MSR_P6_EVNTSEL0: PMC0 is mapped onto MSR EVENTSEL0 (for thread 0)
   - field 1=0: unused, Pentium M does not support Hyperthreading (no thread 1)
   - field 2=0: PMC0 is controlling PMD 0
   - field 3=PFM_REGT_PERFSEL: this is a PMU control register

The business about HT is due to the fact that the i386 code is shared
with P4/Xeon.

struct pfm_arch_pmu_info pfm_pm_pmu_info={
	.pmc_addrs = {
		{{MSR_P6_EVNTSEL0, 0}, 0, PFM_REGT_PERFSEL},
                    
		{{MSR_P6_EVNTSEL1, 0}, 1, PFM_REGT_PERFSEL}
	},
	.pmd_addrs = {
		{{MSR_P6_PERFCTR0, 0}, 0, PFM_REGT_CTR},
		{{MSR_P6_PERFCTR1, 0}, 0, PFM_REGT_CTR}
	},
	.pmu_style = PFM_I386_PMU_P6,
	.lps_per_core = 1
};

Now let's look at the mapping table. It contains the following information:
	- attribute of the register
	- logical name
	- default value
	- reserved bitfield

The mapping table describes the very basic and generic properties of a register and
uses the same structure for all PMU models. In contrast, the first structure
is totally architecture specific.

static struct pfm_reg_desc pfm_pm_pmc_desc[PFM_MAX_PMCS+1]={
/* pmc0  */ { PFM_REG_W, "PERFSEL0", PFM_PM_PMC_VAL, PFM_PM_PMC_RSVD},
/* pmc1  */ { PFM_REG_W, "PERFSEL1", PFM_PM_PMC_VAL, PFM_PM_PMC_RSVD},
	    { PFM_REG_END} /* end marker */
};

static struct pfm_reg_desc pfm_pm_pmd_desc[PFM_MAX_PMDS+1]={
/* pmd0  */ { PFM_REG_C  , "PERFCTR0", 0x0, -1},
/* pmd1  */ { PFM_REG_C  , "PERFCTR1", 0x0, -1},
	    { PFM_REG_END} /* end marker */
};
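
To show how generic code might consume such a table, here is a self-contained
sketch that walks a description array up to its end marker. The ex_* types and
flag values are stand-ins invented for this example; they only mimic the shape
of the tables above and are not the perfmon2 definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define EX_REG_W	0x01	/* writable control register */
#define EX_REG_C	0x02	/* counter */
#define EX_REG_END	0x80	/* end-of-table marker */

struct ex_reg_desc {
	unsigned int type;		/* attribute of the register */
	const char *name;		/* logical name */
	uint64_t default_value;		/* default value */
	uint64_t reserved_mask;		/* reserved bitfield */
};

const struct ex_reg_desc ex_pmd_desc[] = {
	{ EX_REG_C, "PERFCTR0", 0x0, ~0ULL },
	{ EX_REG_C, "PERFCTR1", 0x0, ~0ULL },
	{ EX_REG_END, NULL, 0, 0 }	/* end marker */
};

/* Count the registers described by a table, excluding the end marker. */
size_t ex_count_regs(const struct ex_reg_desc *desc)
{
	size_t n = 0;

	assert(desc != NULL);

	while (!(desc[n].type & EX_REG_END))
		n++;
	return n;
}
```

The point of the end marker is that the core never needs to know how many
registers a given PMU model has; it just iterates until it sees the marker.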

Now the write checker. It is used to intervene on the value passed by
the user when it programs a PMC register. The role of the function is
to ensure that the reserved bitfields retain their default value.
It can also be used to verify that a PMC value is actually authorized and
sane. A PMU may disallow certain combinations of values. The checker is
optional. On Pentium M we simply enforce reserved bitfields.

static int pfm_pm_pmc_check(struct pfm_context *ctx, struct pfm_event_set *set,
			    u16 cnum, u32 flags, u64 *val)
{
	u64 tmpval, tmp1, tmp2;
	u64 rsvd_mask, dfl_value;

	tmpval = *val;
	rsvd_mask = pfm_pm_pmc_desc[cnum].reserved_mask;
	dfl_value = pfm_pm_pmc_desc[cnum].default_value;

	if (flags & PFM_REGFL_NO_EMUL64)
		dfl_value &= ~(1ULL << 20);

	/* remove reserved areas from user value */
	tmp1 = tmpval & rsvd_mask;

	/* get reserved fields values */
	tmp2 = dfl_value & ~rsvd_mask;
	*val = tmp1 | tmp2;

	return 0;
}
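
The core of the checker is a bit merge: bits set in the mask are taken from
the user value, all other bits come from the default value. The stand-alone
sketch below isolates that arithmetic. The function names and the numeric
values are made up for illustration and are not real Pentium M settings.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Keep the user-supplied bits where 'mask' is 1 and force the default
 * value everywhere else, as the checker above does with rsvd_mask.
 */
uint64_t ex_merge_reserved(uint64_t user, uint64_t dfl, uint64_t mask)
{
	return (user & mask) | (dfl & ~mask);
}

/* Worked example with made-up values. */
void ex_merge_example(void)
{
	uint64_t user = 0xFFFF;		/* user tries to set bits 0-15 */
	uint64_t dfl  = 0x100000;	/* default has bit 20 set */
	uint64_t mask = 0x00FF;		/* only bits 0-7 are user-writable */

	/* bits 8-15 from the user are dropped, bit 20 is forced on */
	assert(ex_merge_reserved(user, dfl, mask) == 0x1000FF);
}
```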

And finally the structure that we register with the core of perfmon.
It includes among other things the actual width of the counters as this
is useful for sampling and 64-bit virtualization of counters.

static struct pfm_pmu_config pfm_pm_pmu_conf={
	.pmu_name = "Intel Pentium M Processor",
	.counter_width = 31,
	.pmd_desc = pfm_pm_pmd_desc,
	.pmc_desc = pfm_pm_pmc_desc,
	.pmc_write_check = pfm_pm_pmc_check,
	.probe_pmu = pfm_pm_probe_pmu,
	.version = "1.0",
	.flags = PMU_FLAGS,
	.owner = THIS_MODULE,
	.arch_info = &pfm_pm_pmu_info
};

This is not much information.

If this were not implemented as a kernel module, it would have to be integrated into
the kernel no matter what. This is very basic information that perfmon needs to operate
on the PMU registers. I prefer the table driven approach to hardcoding and checking
everywhere. I hope you agree with me here.

The PMU description module is simply a way to separate this information from the
core. Note that the modules can, of course, be compiled in.
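
On the counter_width field above: one common way to virtualize a narrow
hardware counter to 64 bits is to keep the upper bits in software and advance
them on every overflow interrupt. The sketch below assumes that scheme with a
31-bit counter. The ex_* names are hypothetical and this is not necessarily
how the perfmon core implements it.

```c
#include <assert.h>
#include <stdint.h>

#define EX_COUNTER_WIDTH	31
#define EX_HW_MASK		((1ULL << EX_COUNTER_WIDTH) - 1)

/* Software part of a virtualized 64-bit counter. */
struct ex_vctr {
	uint64_t soft;	/* always a multiple of 2^EX_COUNTER_WIDTH */
};

/* Called once per hardware overflow interrupt. */
void ex_on_overflow(struct ex_vctr *c)
{
	c->soft += 1ULL << EX_COUNTER_WIDTH;
}

/* Combine the software part with the raw hardware register value. */
uint64_t ex_read(const struct ex_vctr *c, uint64_t hw)
{
	assert((c->soft & EX_HW_MASK) == 0);
	return c->soft + (hw & EX_HW_MASK);
}
```

Knowing the width in the description table is what lets generic code compute
the mask and the overflow increment without any per-model special cases.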



* Re: [perfmon] Re: quick overview of the perfmon2 interface
  2005-12-22 13:31 [perfmon] Re: quick overview of the perfmon2 interface Truong, Dan
@ 2005-12-22 13:46 ` Andrew Morton
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2005-12-22 13:46 UTC (permalink / raw)
  To: Truong, Dan
  Cc: stephane.eranian, perfmon, linux-ia64, linux-kernel, perfctr-devel

"Truong, Dan" <dan.truong@hp.com> wrote:
>
> The PMU is becoming a standard commodity. Once Perfmon is
> "the" Linux interface, all the tools can align on it and
> coexist, push their R&D forward, and more importantly become
> fully productized for businesses usage.
>

The apparently-extreme flexibility of the perfmon interfaces would tend to
militate against that, actually.  It'd become better productised if it had
one interface and stuck to it.

(I haven't processed Stephane's reply yet - will get there)



* RE: [perfmon] Re: quick overview of the perfmon2 interface
@ 2005-12-22 13:31 Truong, Dan
  2005-12-22 13:46 ` Andrew Morton
  0 siblings, 1 reply; 20+ messages in thread
From: Truong, Dan @ 2005-12-22 13:31 UTC (permalink / raw)
  To: Eranian, Stephane, Andrew Morton
  Cc: perfmon, linux-ia64, linux-kernel, perfctr-devel

> Thanks to David, Dan and Phil for their comments.

Another note on the urgency of standardizing Perfmon:

Anarchy is not a good breeding ground for tools that need a
stable infrastructure to mature. Being "there" is what made
PAPI and perfctr popular and somewhat standard infrastructure.

Compilers, tools, JVMs... -you name it- are all moving
fast towards using hardware counters to get feedback,
tune, monitor or measure application behavior.

The PMU is becoming a standard commodity. Once Perfmon is
"the" Linux interface, all the tools can align on it and
coexist, push their R&D forward, and more importantly become
fully productized for businesses usage. Hopefully Perfmon's
interface is powerful enough to support future needs.

Good luck Stephane :)

Cheers,

Dan-


* RE: [perfmon] Re: quick overview of the perfmon2 interface
@ 2005-12-21 22:39 Truong, Dan
  0 siblings, 0 replies; 20+ messages in thread
From: Truong, Dan @ 2005-12-21 22:39 UTC (permalink / raw)
  To: Andrew Morton, Eranian, Stephane
  Cc: perfmon, linux-ia64, linux-kernel, perfctr-devel

Just a couple of cents here to support Stephane's design :)


> Why would one want to change the format of the sampling buffer?

The idea is to allow custom user formats for one, and to allow exotic
architectures for another.

Say you want to reduce the volume of data passed to the application and
stored in the buffers, you can then do some pre-processing inside the
kernel via a custom module.

You can also return more complex data than just PMU registers, think
call stacks or other OS related information that can complement the
PMU data.

Some data can be returned via pseudo PMU registers (i.e. extensions),
like the interval timer, PID/TID, etc. but for more complex and less
synchronous data you may end up needing a more powerful buffer format
with headers etc.

Another issue can be if hardware buffer support is provided. The
hardware buffer support will not allow collection of pseudo-counters
which are supported by software, so again the packaging may not be as
linear as a repeated sequence of counters...

With Stephane we had discussed this buffer format, and came to the
conclusion that flexibility there will avoid hitting the wall.

You don't know what tomorrow is made of (yet)...



> > 	The PMU register description is implemented by a kernel pluggable
> 
> Is that option important, or likely to be useful?  Are you sure there 
> isn't some overdesign here?

It will allow bringup of new PMUs on new architectures more easily, and
simpler distribution of support. It will also allow CPU designers to
create custom drivers that support non-public features to debug the
CPUs.



> hm.  I'm surprised at such a CPU-centric approach.  I'd have expected 
> to see a more task-centric model.

Both per-thread and system-wide measurements are useful.

Systemwide is used to evaluate the behavior of the whole system when
tuning a large load (think TPC-C, SpecWeb, SpecJapp...). Per-thread is
used for specific application/thread tuning and self monitoring. Also,
per-thread monitoring is not always advisable: for example, when there
are a large number of threads loading the system, adding that many
monitors will impact the system performance, so you will want to measure
per CPU.

> So the kernel buffers these messages for the read()er.  How does it 
> handle the case of a process which requests the messages but never 
> gets around to read()ing them?

Stephane, I would assume that the monitoring session attached to
the buffer returning the message just stalls. If there is multiplexing,
will coming back to that stalled buffer stall all the multiplexed
sessions? I would assume so.



> Why would one want to randomise the PMD after an overflow?

Everybody does it :) Helps generate an un-biased picture.
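
A minimal sketch of the idea, assuming the sampling period is re-armed after
each overflow with a base value plus a masked pseudo-random offset. The
function names are hypothetical and the generator is a plain xorshift64, used
here only as a stand-in for whatever source of randomness is available.

```c
#include <assert.h>
#include <stdint.h>

/* xorshift64 pseudo-random generator; *state must never be zero. */
uint64_t ex_xorshift64(uint64_t *state)
{
	uint64_t x = *state;

	assert(x != 0);
	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	*state = x;
	return x;
}

/*
 * Next sampling period in the range [base, base + mask].
 * 'mask' is expected to be of the form 2^k - 1.
 */
uint64_t ex_next_period(uint64_t base, uint64_t mask, uint64_t *state)
{
	return base + (ex_xorshift64(state) & mask);
}
```

Varying the period this way keeps the sampling from locking onto a loop whose
length happens to divide the fixed period, which would otherwise attribute all
samples to the same few instructions.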



> I think the usefulness of this needs justification.  CPUs are updated 
> all the time, and we release new kernels all the time to exploit the 
> new CPU features.  What's so special about performance counters that 
> they need such special treatment?

The PMU is usually not fully architected. Nothing prevents a
model from being shipped with PMU upgrades.
Also the PMU can be used by architects for validation of the designs.
Easier early access to the functionalities helps.

The PMU is a direct evolution of the debug counters that were used
to debug CPUs but not available for general use. They are still used
in that fashion too, which is a main reason they exist.


Cheers,

Dan-


end of thread, other threads:[~2006-02-20 17:57 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-20 18:37 [perfmon] Re: quick overview of the perfmon2 interface Truong, Dan
2006-01-20 22:22 ` Andrew Morton
2006-01-25 20:33 ` Bryan O'Sullivan
2006-01-25 22:28   ` [Perfctr-devel] " Stephane Eranian
2006-01-25 22:46     ` Bryan O'Sullivan
2006-01-26  7:48       ` Stephane Eranian
2006-01-26 18:26         ` Bryan O'Sullivan
     [not found]     ` <1138649612.4077.50.camel@localhost.localdomain>
     [not found]       ` <1138651545.4487.13.camel@camp4.serpentine.com>
     [not found]         ` <1139155731.4279.0.camel@localhost.localdomain>
     [not found]           ` <1139245253.27739.8.camel@camp4.serpentine.com>
2006-02-10 15:36             ` perfmon2 code review: 32-bit ABI on 64-bit OS Stephane Eranian
2006-02-10 18:27               ` Bryan O'Sullivan
     [not found]                 ` <1139681785.4316.33.camel@localhost.localdomain>
2006-02-11 22:33                   ` [perfmon] " Stephane Eranian
2006-02-12 23:46                     ` [Perfctr-devel] " David Gibson
2006-02-13  0:03                       ` Eric Gouriou
2006-02-13 20:31                         ` Stephane Eranian
     [not found]                     ` <1139857076.4342.10.camel@localhost.localdomain>
2006-02-14 23:41                       ` [Perfctr-devel] " Stephane Eranian
2006-02-20 17:54                       ` Stephane Eranian
2006-02-13 20:34                 ` Stephane Eranian
  -- strict thread matches above, loose matches on Subject: below --
2005-12-22 13:31 [perfmon] Re: quick overview of the perfmon2 interface Truong, Dan
2005-12-22 13:46 ` Andrew Morton
2005-12-21 22:39 Truong, Dan
2005-12-19 11:31 Stephane Eranian
2005-12-20 10:51 ` Andrew Morton
2005-12-22 18:48   ` [perfmon] " Stephane Eranian
