All of lore.kernel.org
 help / color / mirror / Atom feed
* 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-26 14:34 ` Stephane Eranian
  0 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-26 14:34 UTC (permalink / raw)
  To: perfmon; +Cc: perfctr-devel, linux-ia64, linux-kernel, oprofile-list

Hello,

I have released another version of the perfmon new code base package.
This version of the kernel patch is relative to 2.6.18. This longer
than usual delay between releases comes from the large amount of changes
than went into this new release following the feedback I got on LKML. As
a result, the code has improved.

The kernel patch is split between infrastructure work and perfmon2 proper.
The infrastructure contains the following changes which will be integrated
into mainline before perfmon2 is. They are provided in the base.diff file.
The infrastructure changes include:
	- x86_64: fix idle_notifier to cover all interrupts entry/exit for idle thread
	- i386  : add idle notifier (copy of x86_64 notifier)
	- ia64  : idle notifier (copy of x86_64 notifier)
	- i386  : add smp_call_function_single()
	- i386  : add CPU_HAS_PEBS to cpufeature.h
	- x86_64: add CPU_HAS_PEBS to cpufeature.h
	- x86_64: use TIF flags for debug registers + IO bitmap lazy context switching
	- ia64  : add is_multithreading_enabled() to detect whether threads could be activated (HOTPLUG)
	- i386  : add is_multithreading_enabled() to detect whether threads could be activated (HOTPLUG)
	- x86-64: add is_multithreading_enabled() to detect whether threads could be activated (HOTPLUG)


The perfmon2 kernel patch includes:
	- updated to 2.6.18. System call numbers have once again changed (non IA-64)
	- leverage TIF mechanism for context switch on i386/x86-64/MIPS
	- leverage TIF mechanism for kernel exit work (TIF_PERFMON_WORK)
	- system-wide monitoring is turned off when entering the low level idle loop
	- removed PFM_EXCL_IDLE (use idle notifier)
	- removed perfmon_kapi.c, need to make a stronger case or find alternatives
	- improved checking on vector arguments to avoid overflows
	- corrected RLIMIT_MEMLOCK accounting
	- using sched_clock() for all duration calculation and default format timestamp
	- merged 32-bit and 64-bit PEBS support for P4 (two distinct UUID still)
	- removed insecure mode (cr4.pce) for i386, x86_64
	- fixed P4 ESCR/CCCR register mappings when HyperThreading is enabled (Kevin Corry)
	- added license on all files
	- reinforced error checking in sysfs related code
	- rewritten sysfs register mapping, now have  one file = one value
	- PMU description table exposes hardware address/idx through sysfs
	- enforced use of ptrace to block a thread in per-thread mode
	- added CPU HOTPLUG support (system-wide mode)
	- MIPS updates by Phil Mucci
	- fixed locking issues
	- renamed perfmon_amd64.c to perfmon_k8.c
	- Support sharing of PMU with NMI watchdog (AMD K8 processors ONLY)
	- split perfmon.c some more: read/write support in perfmon_rw.c
	- changed API for pfm_load_context(): load_pid indicated CPU to monitor in system-wide
	- Kconfig perfmon2 options turned off by default
	- Intel architectural PMU supported in 32 and 64bit mode
	- many other changes and cleanups

I have also released a new libpfm, libpfm-3.2-060926, with lots of
changes. Here are some of the most important ones:

	- added support for Intel Core Duo/Core Solo
	- added support for MIPS 20KC, 24K, 25KF, 34K, 5KC, 74K, R10000, R12000, RM7000, RM9000, SB1/SB1A, VR5432, VR5500 (Phil Mucci)
	- merged 32-bit and 64-bit PEBS support (mtach 2.6.18 kernel)
	- rewritten most CPU detection routines to /proc/cpuinfo instead of cpuid instruction
	- updated i386_p6 event table to expose unit masks
	- added a few i386_p6 events to exploit unit masks mechansim
	- modified pfm_load_context() load_pid to match 2.6.18 kernel expectation
	- removed EXCL_IDLE (flag is gone from 2.6.18 interface)
	- move all destination directories variable setup to config.mk (Phil Mucci)
	- split destination directories between lib, inc, and man (Phil Mucci)
	- added pfm_get_full_event_name() to build event+unit mask string. pfm_get_event_name() is deprecated
	- added pfm_find_full_event() to return event+unit masks descriptors. pfm_find_event() is deprecated
	- pfm_dispatch_events() returns list of PMC AND list of PMD registers used
	- pfm_dispatch_events() returns perfmon2 logical PMC/PMD indexes AND hardware address/index
	- fixed ESCR/CCCR mapping issues in Pentium 4 support
	- changed interface for pfm_get_cycle_event() and pfm_get_inst_retired_event() to use pfmlib_event_t
	- added support for MESI as unit masks for Montecito cache events
	- added PFMLIB_I386_PM_PMU to separate P6 from Pentium M as they use different event tables
	- Intel architectural perfmon (GEN_IA32) compiles in x86-64 mode
	- modified noploop() to avoid being optimized away by compiler (e.g., gcc-4.1)
	- updated man pages
	- rewritten showreginfo.c to match new /sysfs layout for register mappings
	- updated all examples to use pfm_get_full_event_name() and pfm_find_full_event()

With those changes, all common exampes now work for Pentium 4 (HT restrictions may caused
certain examples to fail, though). This version of the library works only when 2.6.18.

Also a new version of pfmon, pfmon-3.2-060926, to take advantage of the
update in libpfm:
	- added Intel Core Duo/Core Solo support
	- imporved Pentium 4 support due to libpfm changes
	- replaced sysconf(_SC_CLK_TCK) with clock_getres() to report correct clock tick
	- removed exclude idle support (gone from 2.6.18 kernel)
	- updated to use the new interface for pfm_get_cycle_event()
	- uses new pfm_get_full_event_name(), pfm_find_full_event() interfaces
	- changed pfmon_system.c to gracefully deal with hotplug CPU support
	- print maximum number of counters per set with -I option
	- fixed bugs with --inv and --edge (pfmlib_gen_ia32.c) option whereby this option would not be
	- added PREFIX, BINDIR, MANDIR, DATADIR to Makefil (Phil Mucci)

This version requires libpfm-3.2-060926.

You can get a more detailed log of changes the the CVS tree.

You can grab the new packages at our web site:

	 http://perfmon2.sf.net

Enjoy,

PS: I will post a kernel patch to LKML and a diffstat on the perfmon mailing list.
-- 
-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

* 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-26 14:34 ` Stephane Eranian
  0 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-26 14:34 UTC (permalink / raw)
  To: perfmon; +Cc: perfctr-devel, linux-ia64, linux-kernel, oprofile-list

Hello,

I have released another version of the perfmon new code base package.
This version of the kernel patch is relative to 2.6.18. This longer
than usual delay between releases comes from the large amount of changes
than went into this new release following the feedback I got on LKML. As
a result, the code has improved.

The kernel patch is split between infrastructure work and perfmon2 proper.
The infrastructure contains the following changes which will be integrated
into mainline before perfmon2 is. They are provided in the base.diff file.
The infrastructure changes include:
	- x86_64: fix idle_notifier to cover all interrupts entry/exit for idle thread
	- i386  : add idle notifier (copy of x86_64 notifier)
	- ia64  : idle notifier (copy of x86_64 notifier)
	- i386  : add smp_call_function_single()
	- i386  : add CPU_HAS_PEBS to cpufeature.h
	- x86_64: add CPU_HAS_PEBS to cpufeature.h
	- x86_64: use TIF flags for debug registers + IO bitmap lazy context switching
	- ia64  : add is_multithreading_enabled() to detect whether threads could be activated (HOTPLUG)
	- i386  : add is_multithreading_enabled() to detect whether threads could be activated (HOTPLUG)
	- x86-64: add is_multithreading_enabled() to detect whether threads could be activated (HOTPLUG)


The perfmon2 kernel patch includes:
	- updated to 2.6.18. System call numbers have once again changed (non IA-64)
	- leverage TIF mechanism for context switch on i386/x86-64/MIPS
	- leverage TIF mechanism for kernel exit work (TIF_PERFMON_WORK)
	- system-wide monitoring is turned off when entering the low level idle loop
	- removed PFM_EXCL_IDLE (use idle notifier)
	- removed perfmon_kapi.c, need to make a stronger case or find alternatives
	- improved checking on vector arguments to avoid overflows
	- corrected RLIMIT_MEMLOCK accounting
	- using sched_clock() for all duration calculation and default format timestamp
	- merged 32-bit and 64-bit PEBS support for P4 (two distinct UUID still)
	- removed insecure mode (cr4.pce) for i386, x86_64
	- fixed P4 ESCR/CCCR register mappings when HyperThreading is enabled (Kevin Corry)
	- added license on all files
	- reinforced error checking in sysfs related code
	- rewritten sysfs register mapping, now have  one file = one value
	- PMU description table exposes hardware address/idx through sysfs
	- enforced use of ptrace to block a thread in per-thread mode
	- added CPU HOTPLUG support (system-wide mode)
	- MIPS updates by Phil Mucci
	- fixed locking issues
	- renamed perfmon_amd64.c to perfmon_k8.c
	- Support sharing of PMU with NMI watchdog (AMD K8 processors ONLY)
	- split perfmon.c some more: read/write support in perfmon_rw.c
	- changed API for pfm_load_context(): load_pid indicated CPU to monitor in system-wide
	- Kconfig perfmon2 options turned off by default
	- Intel architectural PMU supported in 32 and 64bit mode
	- many other changes and cleanups

I have also released a new libpfm, libpfm-3.2-060926, with lots of
changes. Here are some of the most important ones:

	- added support for Intel Core Duo/Core Solo
	- added support for MIPS 20KC, 24K, 25KF, 34K, 5KC, 74K, R10000, R12000, RM7000, RM9000, SB1/SB1A, VR5432, VR5500 (Phil Mucci)
	- merged 32-bit and 64-bit PEBS support (mtach 2.6.18 kernel)
	- rewritten most CPU detection routines to /proc/cpuinfo instead of cpuid instruction
	- updated i386_p6 event table to expose unit masks
	- added a few i386_p6 events to exploit unit masks mechansim
	- modified pfm_load_context() load_pid to match 2.6.18 kernel expectation
	- removed EXCL_IDLE (flag is gone from 2.6.18 interface)
	- move all destination directories variable setup to config.mk (Phil Mucci)
	- split destination directories between lib, inc, and man (Phil Mucci)
	- added pfm_get_full_event_name() to build event+unit mask string. pfm_get_event_name() is deprecated
	- added pfm_find_full_event() to return event+unit masks descriptors. pfm_find_event() is deprecated
	- pfm_dispatch_events() returns list of PMC AND list of PMD registers used
	- pfm_dispatch_events() returns perfmon2 logical PMC/PMD indexes AND hardware address/index
	- fixed ESCR/CCCR mapping issues in Pentium 4 support
	- changed interface for pfm_get_cycle_event() and pfm_get_inst_retired_event() to use pfmlib_event_t
	- added support for MESI as unit masks for Montecito cache events
	- added PFMLIB_I386_PM_PMU to separate P6 from Pentium M as they use different event tables
	- Intel architectural perfmon (GEN_IA32) compiles in x86-64 mode
	- modified noploop() to avoid being optimized away by compiler (e.g., gcc-4.1)
	- updated man pages
	- rewritten showreginfo.c to match new /sysfs layout for register mappings
	- updated all examples to use pfm_get_full_event_name() and pfm_find_full_event()

With those changes, all common exampes now work for Pentium 4 (HT restrictions may caused
certain examples to fail, though). This version of the library works only when 2.6.18.

Also a new version of pfmon, pfmon-3.2-060926, to take advantage of the
update in libpfm:
	- added Intel Core Duo/Core Solo support
	- imporved Pentium 4 support due to libpfm changes
	- replaced sysconf(_SC_CLK_TCK) with clock_getres() to report correct clock tick
	- removed exclude idle support (gone from 2.6.18 kernel)
	- updated to use the new interface for pfm_get_cycle_event()
	- uses new pfm_get_full_event_name(), pfm_find_full_event() interfaces
	- changed pfmon_system.c to gracefully deal with hotplug CPU support
	- print maximum number of counters per set with -I option
	- fixed bugs with --inv and --edge (pfmlib_gen_ia32.c) option whereby this option would not be
	- added PREFIX, BINDIR, MANDIR, DATADIR to Makefil (Phil Mucci)

This version requires libpfm-3.2-060926.

You can get a more detailed log of changes the the CVS tree.

You can grab the new packages at our web site:

	 http://perfmon2.sf.net

Enjoy,

PS: I will post a kernel patch to LKML and a diffstat on the perfmon mailing list.
-- 
-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
  2006-09-26 14:34 ` Stephane Eranian
@ 2006-09-27  5:09   ` Andrew Morton
  -1 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2006-09-27  5:09 UTC (permalink / raw)
  To: eranian; +Cc: perfmon, perfctr-devel, linux-ia64, linux-kernel, oprofile-list

On Tue, 26 Sep 2006 07:34:20 -0700
Stephane Eranian <eranian@hpl.hp.com> wrote:

> I have released another version of the perfmon new code base package.
> This version of the kernel patch is relative to 2.6.18. This longer
> than usual delay between releases comes from the large amount of changes
> than went into this new release following the feedback I got on LKML. As
> a result, the code has improved.

Thanks for doing that.

Would it be possible to get an accounting of the disposition of the various
review comments?  Of the "yes I did that" or "OK, but I'll do it later" or
"no, you're an idiot" form?



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-27  5:09   ` Andrew Morton
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2006-09-27  5:09 UTC (permalink / raw)
  To: eranian; +Cc: perfmon, perfctr-devel, linux-ia64, linux-kernel, oprofile-list

On Tue, 26 Sep 2006 07:34:20 -0700
Stephane Eranian <eranian@hpl.hp.com> wrote:

> I have released another version of the perfmon new code base package.
> This version of the kernel patch is relative to 2.6.18. This longer
> than usual delay between releases comes from the large amount of changes
> than went into this new release following the feedback I got on LKML. As
> a result, the code has improved.

Thanks for doing that.

Would it be possible to get an accounting of the disposition of the various
review comments?  Of the "yes I did that" or "OK, but I'll do it later" or
"no, you're an idiot" form?



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
  2006-09-27  5:09   ` Andrew Morton
@ 2006-09-27 22:48     ` Stephane Eranian
  -1 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-27 22:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: perfmon, linux-ia64, linux-kernel

Andrew,

Here is the summary of the various point raised by your review and the current
status. I am hoping to close all points by next release.

[ak] : use add_timer()/del_timer() instead of hook into smp_local_timer_interrupt() set timeout based set 
	- started looking into this, not confident it could be done and be sa simple as what we have today

[ak] : put set_intr_gate() all in one place for each arch()
	- done

[ak] : suggest use upcoming exit_thread/copy_thread notifiers instead of hooks
	- abandonned

[ak] : separate patch for _TIF_WORK_CTXSW
	- I think I submitted a TIF patch for x86-64, but unlike i386 it is not yet in mainline

[ak] : clarify pfm_handle_work(), use TIF flags to trigger call
	- trigger by TIF_PERFMON_WORK now. Optimization as it is called only for tasks 
	  that have it set, not all tasks with TIF_PERFMON_CTXSW

[ak] : x86-64 FIRST_SYSTEM_VECTOR does not need to be adjusted
	- done

[ak] : CONFIG_PERFMON* not on by default
	- done

[ak] : rename *_AMD64 to *_K8
	- done

[ak] : rename EM64T to P4
	- done

[ak] : explain what PEBS does
	- done

[ak] : merge 64-bit and 32-bit PEBS code
	- done

[ak] : what about UUID for sampling format?
	- could try to replace UUID with plain old string. work-in-progress

[ak] : check alignment of shared struct for 32-it and 64-bit ABIs
	- done. structures shared with user have no padding

[ak] : had comment about lazy save/restore strategy
	- not yet done

[ak] : may have to add __kprobes to some functions
	- started doing this on some functions. Need better understanding on when to use this

[ak] : use sched_clock() to get timing
	- done

[ak] : explain context locking
	- not yet done

[ak] : explain statistics gathering
	- done. This is mostly for debugging purposes

[ak] : justify why multiple sampling formats are needed, instead of just hardcoding
	- done. Just think of Intel PEBS support without it

[ak] : i386 should have wrmsrl()
	- modified the code to avoid having a #define

[ak] : cleaner integration with NMI watchdog
	- integration done on AMD K8. Issues on P4, P6, due to PMU design

[ak] : drop cr4.pce management, will be one by default with PERFCTR0 as a replacement or TSC
	- done

[ak] : add exit_idle/enter_idle to smp_pmu_interrupt()
	- done

[ak] : what is the difference between masking/unmasking monitoring and start/stop
	- need to provide explanation. In short: on Itanium start/stop may be controlled
	  from userland, kernel need another way to guarantee monitoring is stopped

[ak] : check that CPUID instruction is supported before using it
	- done

[ak] : fix helper function to module autoloader to better check for architectural PMU on Intel
	- done

[ak] : add synthetic flag to cpufeature to simplify detection of architectural PMU and features
	- done. Added PEB. May be able to do more. Andi added ARCH_PERFMON in 2.6.18

[ak] : simplify model/vendor/family detection in PMU description modules 
	- done

[ak] : ignore registers if find more than what is supported by generic architectural PMU description module
	- done

[ak] : removed duplicate APIC enable detction
	- done

[ak] : add a force module option on P4 (just too many variations)
	- done

[ak] : may need some apic workaround for older i386 CPU (pfm_arch_resend_irq())
	- not yet investigated

[ak] : pfm_serialize() : should this be sync_core() on x86?
	- not yet investigated

[ak] : over-abstraction in pfm_arch_read_pmd()
	- not yet done

[ak] : security fix in syscall with vector argument (overflow)
	- done


[akpm]: lots of poking into task lifetime internals
	- simplified by using ptrace_check_attach(). Need to check if more can be done

[akpm]: checks running atomic in some of the task checks
	- yes

[akpm]: fix return value for pfm_get_args() 
	- done

[akpm]: use fget_light() in some place instead of fget()
	- not sure understand when to use one versus the other

[akpm]: check integer multiply overflow in syscall with vector arguments
	- done

[akpm] : add ptr_to_free to syscall with vector arguments to make logic more explicit
	- done

[akpm] : make sure we do not leak kernel information through copy_to_user() when returning with errors
	- done

[akpm]: documentation for syscall? Is there an API specification?
	- answered. In short, there exists a specification but it needs to be updated

[akpm]: check structure shared between kernel and user do not have padding inserted by compiler (leak)
	- done

[akpm]: move pfm_sysfs_add_pmu() declaration to header file if necessary
	- not yet done

[akpm: change return value for sysfs debug_store()  function to return sz
	- done

[akpm]: use goto instead of multiple 'return ret;'
	- done

[akpm]: does this code support CPU HOTPLUG?
	- yes, as of 2.6.18 perfmon2 supoprts CPU hotplug. This affects system-wide measurements
	  and /sysfs stuff.

[akpm]: what use UUID for sampling formats?
	- switch to clear text string, not yet done

[akpm]: pfm_sysfs_remove_fmt(), pfm_sysfs_add_fmt() add comments to explain goal
	- not yet done

[akpm]: pfm_sysfs_remove_pmu(), pfm_sysfs_add_pmu() add comments to explain goal
	- not yet done

[akpm]: sysfs uses one file = one value, fix for register mapping view
	- done, register mapping files use one file per value, 2.6.18 introduces a new subtree in /sys/kernel/perfmon/pmu_desc

[akpm]: is pfm_smpl_fmt_lock really needed?
	- I think so

[akpm]: check and handle sysfs errors
	- done

[akpm]: get_cpu() not needed in pfm_interrupt_handler()
	- done
[akpm]: SLAB_ATOMIC is unreliable, use SLAB_KERNEL if possible?
	- started looking into this, not completely solved

[akpm]: use kmem_cache_zalloc()
	- done

[akpm]: careful vmalloc() sleeps
	- yes, I think I switched to kmalloc()

[akpm]: pfm_view_map_pagefault() test should use >=
	- done

[akpm]: justify perfmon_kapi.c
	- removed for now

[akpm]: replace PFM_LAST_CPU() with actual code, this is less confusing
	- done

[akpm]: __pfmk_read() check return value from wait_completion_interruptible()
	- due to removal of kapi, this problem is gone

[akpm]: use EXPORT_SYMBOL_GPL() 
	- not yet done.

[akpm]: extraenous white space  in __pfm_read()
	- done

[akpm]: simplify __pfm_read() using a while loop
	- not yet done

[akpm]: pfm_poll() test on filp is useless as it can never be true
	- not yet done

[akpm]: pfm_poll() is context locking needed?
	- not yet investigated

[akpm]: explain why cannot use relayfs instead of buffer remapping scheme
	- need to exlain. In short, for per-thread monitoring, the buffer is managed
	   per thread and follows the thrad as it migrates from one CPU to another.

[akpm]: explicit licensing
	- done

[akpm]: add comment to show which structure are shared with users
	- not yet done

[akpm]: alignment of structures shared with users
	- done

[akpm]: don't use type for pfm_flags_t
	- not yet done
[akpm]: maybe use packed on structures shared with user
	- may cause unaligned problems

[akpm]: reserved for future is useless unless there is version info somewhere
	- yes there is a version in /sys/kernel/perfmon/version

[akpm]: reserved for future fields must be guaranteed zero when kernel provides them
	- need to check this
[akpm]: does kernel check reserved fields are zero?
	- no, need to be done
[akpm]: why microseconds for set timeout, nanoseconds is typical
	- could be nanoseconf to be uniform. No real compatiblity issue. The point
	  was that the timeout granularity is limited by timer tick granularity, which is
	  more in the order of micro-seconds than nanoseconds. We can still make the change.

[akpm]: what about the volatile in pfm_set_view
	- pfm_set_view is a structure shared with user via remapping.

[akpm]: convert pfm_arch_context() to inline 
	- not yet done

[akpm]: pfm_modview*() need locking and comments
	- set_view may be shared with user via remapping. This can ONLY happen when
	  self-monitoring. There is no concurrency possible with another thread and
	  interrupts are masked when kernel modifies fields.

[akpm]: protoypes need argument identifiers
	- not yet done

[akpm]: carta_random32() should be in another header file
	- yes, I know. Should I create a specific header file? I don't think random.h
	  is meant for this.

[akpm]: replace pfm_put_ctx() wrapper with explicit call
	- not yet done

[akpm]: add bitmap operation for bv_set, bv_isset() to bitmap.h
	- not yet done
	
[akpm]: fix bitmap.h shortcomings with data types
	- not sure how to solve this? Force to u64, use void *

[akpm]: change the way we manage the ringbuffer for messages
	- not yet done

[akpm]: pfm_setup_smpl_fmt() check for duplicate tests
	- need to investigate this

[akpm]: get CONFIG_IA64_PERFMON_COMPAT out of perfmon.c using helper functions
	- good idea. Will do in next release

[akpm]: avoid unreliable SLAB_ATOMIC
	- need to verify locking issue if switch to other SLAB mode

[akpm]: pfm_smpl_buffer_alloc() add newline to if()
	- not yet done

[akpm]: avoid multiple consecutive empty lines
	- not yet done

[akpm]: remove unneeded braces in pfm_reset_pmds()
	- not yet done

[akpm]: use assert_spin_locked() in pfm_resume_after_ovfl()
	- not yet done

[akpm]: remove reference to IA-64 code in comments to pfm_handle_work()
	- done

[akpm]: pfm_handle_work() unpleasing handling of local_irq
	- not sure how to deal with this in a better way

[akpm]: switch() statement indent
	- not yet done

[akpm]: __pfm_init() may leak cachep for context
	- kmem_cache_create is done only once. Need to check if we can add destroy call

[akpm]: add comments in pfm_bad_permissions()
	- not yet done

[akpm]: pfm_task_incompatible() task state in task_struct.exit_state, not task->state
	- not yet done

[akpm]: pfm_get_task() what is it doing exactly?
	- need to add comment

[akpm]: extra braces in pfm_get_task()
	- not yet done

[akpm]: pfm_check_task_exist()  braces, and expensive
	- not yet done

[akpm]: set_view locking
	- explained above

[akpm]: check locking for functions using smp_processor_id() (preemption)
	- need more investigation

On Tue, Sep 26, 2006 at 10:09:51PM -0700, Andrew Morton wrote:
> On Tue, 26 Sep 2006 07:34:20 -0700
> Stephane Eranian <eranian@hpl.hp.com> wrote:
> 
> > I have released another version of the perfmon new code base package.
> > This version of the kernel patch is relative to 2.6.18. This longer
> > than usual delay between releases comes from the large amount of changes
> > than went into this new release following the feedback I got on LKML. As
> > a result, the code has improved.
> 
> Thanks for doing that.
> 
> Would it be possible to get an accounting of the disposition of the various
> review comments?  Of the "yes I did that" or "OK, but I'll do it later" or
> "no, you're an idiot" form?
> 
> 
> 
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys -- and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> oprofile-list mailing list
> oprofile-list@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/oprofile-list

-- 

-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-27 22:48     ` Stephane Eranian
  0 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-27 22:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: perfmon, linux-ia64, linux-kernel

Andrew,

Here is the summary of the various point raised by your review and the current
status. I am hoping to close all points by next release.

[ak] : use add_timer()/del_timer() instead of hook into smp_local_timer_interrupt() set timeout based set 
	- started looking into this, not confident it could be done and be sa simple as what we have today

[ak] : put set_intr_gate() all in one place for each arch()
	- done

[ak] : suggest use upcoming exit_thread/copy_thread notifiers instead of hooks
	- abandonned

[ak] : separate patch for _TIF_WORK_CTXSW
	- I think I submitted a TIF patch for x86-64, but unlike i386 it is not yet in mainline

[ak] : clarify pfm_handle_work(), use TIF flags to trigger call
	- trigger by TIF_PERFMON_WORK now. Optimization as it is called only for tasks 
	  that have it set, not all tasks with TIF_PERFMON_CTXSW

[ak] : x86-64 FIRST_SYSTEM_VECTOR does not need to be adjusted
	- done

[ak] : CONFIG_PERFMON* not on by default
	- done

[ak] : rename *_AMD64 to *_K8
	- done

[ak] : rename EM64T to P4
	- done

[ak] : explain what PEBS does
	- done

[ak] : merge 64-bit and 32-bit PEBS code
	- done

[ak] : what about UUID for sampling format?
	- could try to replace UUID with plain old string. work-in-progress

[ak] : check alignment of shared struct for 32-it and 64-bit ABIs
	- done. structures shared with user have no padding

[ak] : had comment about lazy save/restore strategy
	- not yet done

[ak] : may have to add __kprobes to some functions
	- started doing this on some functions. Need better understanding on when to use this

[ak] : use sched_clock() to get timing
	- done

[ak] : explain context locking
	- not yet done

[ak] : explain statistics gathering
	- done. This is mostly for debugging purposes

[ak] : justify why multiple sampling formats are needed, instead of just hardcoding
	- done. Just think of Intel PEBS support without it

[ak] : i386 should have wrmsrl()
	- modified the code to avoid having a #define

[ak] : cleaner integration with NMI watchdog
	- integration done on AMD K8. Issues on P4, P6, due to PMU design

[ak] : drop cr4.pce management, will be one by default with PERFCTR0 as a replacement or TSC
	- done

[ak] : add exit_idle/enter_idle to smp_pmu_interrupt()
	- done

[ak] : what is the difference between masking/unmasking monitoring and start/stop
	- need to provide explanation. In short: on Itanium start/stop may be controlled
	  from userland, kernel need another way to guarantee monitoring is stopped

[ak] : check that CPUID instruction is supported before using it
	- done

[ak] : fix helper function to module autoloader to better check for architectural PMU on Intel
	- done

[ak] : add synthetic flag to cpufeature to simplify detection of architectural PMU and features
	- done. Added PEB. May be able to do more. Andi added ARCH_PERFMON in 2.6.18

[ak] : simplify model/vendor/family detection in PMU description modules 
	- done

[ak] : ignore registers if find more than what is supported by generic architectural PMU description module
	- done

[ak] : removed duplicate APIC enable detction
	- done

[ak] : add a force module option on P4 (just too many variations)
	- done

[ak] : may need some apic workaround for older i386 CPU (pfm_arch_resend_irq())
	- not yet investigated

[ak] : pfm_serialize() : should this be sync_core() on x86?
	- not yet investigated

[ak] : over-abstraction in pfm_arch_read_pmd()
	- not yet done

[ak] : security fix in syscall with vector argument (overflow)
	- done


[akpm]: lots of poking into task lifetime internals
	- simplified by using ptrace_check_attach(). Need to check if more can be done

[akpm]: checks running atomic in some of the task checks
	- yes

[akpm]: fix return value for pfm_get_args() 
	- done

[akpm]: use fget_light() in some place instead of fget()
	- not sure understand when to use one versus the other

[akpm]: check integer multiply overflow in syscall with vector arguments
	- done

[akpm] : add ptr_to_free to syscall with vector arguments to make logic more explicit
	- done

[akpm] : make sure we do not leak kernel information through copy_to_user() when returning with errors
	- done

[akpm]: documentation for syscall? Is there an API specification?
	- answered. In short, there exists a specification but it needs to be updated

[akpm]: check structure shared between kernel and user do not have padding inserted by compiler (leak)
	- done

[akpm]: move pfm_sysfs_add_pmu() declaration to header file if necessary
	- not yet done

[akpm: change return value for sysfs debug_store()  function to return sz
	- done

[akpm]: use goto instead of multiple 'return ret;'
	- done

[akpm]: does this code support CPU HOTPLUG?
	- yes, as of 2.6.18 perfmon2 supoprts CPU hotplug. This affects system-wide measurements
	  and /sysfs stuff.

[akpm]: what use UUID for sampling formats?
	- switch to clear text string, not yet done

[akpm]: pfm_sysfs_remove_fmt(), pfm_sysfs_add_fmt() add comments to explain goal
	- not yet done

[akpm]: pfm_sysfs_remove_pmu(), pfm_sysfs_add_pmu() add comments to explain goal
	- not yet done

[akpm]: sysfs uses one file = one value, fix for register mapping view
	- done, register mapping files use one file per value, 2.6.18 introduces a new subtree in /sys/kernel/perfmon/pmu_desc

[akpm]: is pfm_smpl_fmt_lock really needed?
	- I think so

[akpm]: check and handle sysfs errors
	- done

[akpm]: get_cpu() not needed in pfm_interrupt_handler()
	- done
[akpm]: SLAB_ATOMIC is unreliable, use SLAB_KERNEL if possible?
	- started looking into this, not completely solved

[akpm]: use kmem_cache_zalloc()
	- done

[akpm]: careful vmalloc() sleeps
	- yes, I think I switched to kmalloc()

[akpm]: pfm_view_map_pagefault() test should use >	- done

[akpm]: justify perfmon_kapi.c
	- removed for now

[akpm]: replace PFM_LAST_CPU() with actual code, this is less confusing
	- done

[akpm]: __pfmk_read() check return value from wait_completion_interruptible()
	- due to removal of kapi, this problem is gone

[akpm]: use EXPORT_SYMBOL_GPL() 
	- not yet done.

[akpm]: extraenous white space  in __pfm_read()
	- done

[akpm]: simplify __pfm_read() using a while loop
	- not yet done

[akpm]: pfm_poll() test on filp is useless as it can never be true
	- not yet done

[akpm]: pfm_poll() is context locking needed?
	- not yet investigated

[akpm]: explain why cannot use relayfs instead of buffer remapping scheme
	- need to exlain. In short, for per-thread monitoring, the buffer is managed
	   per thread and follows the thrad as it migrates from one CPU to another.

[akpm]: explicit licensing
	- done

[akpm]: add comment to show which structure are shared with users
	- not yet done

[akpm]: alignment of structures shared with users
	- done

[akpm]: don't use type for pfm_flags_t
	- not yet done
[akpm]: maybe use packed on structures shared with user
	- may cause unaligned problems

[akpm]: reserved for future is useless unless there is version info somewhere
	- yes there is a version in /sys/kernel/perfmon/version

[akpm]: reserved for future fields must be guaranteed zero when kernel provides them
	- need to check this
[akpm]: does kernel check reserved fields are zero?
	- no, need to be done
[akpm]: why microseconds for set timeout, nanoseconds is typical
	- could be nanoseconf to be uniform. No real compatiblity issue. The point
	  was that the timeout granularity is limited by timer tick granularity, which is
	  more in the order of micro-seconds than nanoseconds. We can still make the change.

[akpm]: what about the volatile in pfm_set_view
	- pfm_set_view is a structure shared with user via remapping.

[akpm]: convert pfm_arch_context() to inline 
	- not yet done

[akpm]: pfm_modview*() need locking and comments
	- set_view may be shared with user via remapping. This can ONLY happen when
	  self-monitoring. There is no concurrency possible with another thread and
	  interrupts are masked when kernel modifies fields.

[akpm]: protoypes need argument identifiers
	- not yet done

[akpm]: carta_random32() should be in another header file
	- yes, I know. Should I create a specific header file? I don't think random.h
	  is meant for this.

[akpm]: replace pfm_put_ctx() wrapper with explicit call
	- not yet done

[akpm]: add bitmap operation for bv_set, bv_isset() to bitmap.h
	- not yet done
	
[akpm]: fix bitmap.h shortcomings with data types
	- not sure how to solve this? Force to u64, use void *

[akpm]: change the way we manage the ringbuffer for messages
	- not yet done

[akpm]: pfm_setup_smpl_fmt() check for duplicate tests
	- need to investigate this

[akpm]: get CONFIG_IA64_PERFMON_COMPAT out of perfmon.c using helper functions
	- good idea. Will do in next release

[akpm]: avoid unreliable SLAB_ATOMIC
	- need to verify locking issue if switch to other SLAB mode

[akpm]: pfm_smpl_buffer_alloc() add newline to if()
	- not yet done

[akpm]: avoid multiple consecutive empty lines
	- not yet done

[akpm]: remove unneeded braces in pfm_reset_pmds()
	- not yet done

[akpm]: use assert_spin_locked() in pfm_resume_after_ovfl()
	- not yet done

[akpm]: remove reference to IA-64 code in comments to pfm_handle_work()
	- done

[akpm]: pfm_handle_work() unpleasing handling of local_irq
	- not sure how to deal with this in a better way

[akpm]: switch() statement indent
	- not yet done

[akpm]: __pfm_init() may leak cachep for context
	- kmem_cache_create is done only once. Need to check if we can add destroy call

[akpm]: add comments in pfm_bad_permissions()
	- not yet done

[akpm]: pfm_task_incompatible() task state in task_struct.exit_state, not task->state
	- not yet done

[akpm]: pfm_get_task() what is it doing exactly?
	- need to add comment

[akpm]: extra braces in pfm_get_task()
	- not yet done

[akpm]: pfm_check_task_exist()  braces, and expensive
	- not yet done

[akpm]: set_view locking
	- explained above

[akpm]: check locking for functions using smp_processor_id() (preemption)
	- need more investigation

On Tue, Sep 26, 2006 at 10:09:51PM -0700, Andrew Morton wrote:
> On Tue, 26 Sep 2006 07:34:20 -0700
> Stephane Eranian <eranian@hpl.hp.com> wrote:
> 
> > I have released another version of the perfmon new code base package.
> > This version of the kernel patch is relative to 2.6.18. This longer
> > than usual delay between releases comes from the large amount of changes
> > than went into this new release following the feedback I got on LKML. As
> > a result, the code has improved.
> 
> Thanks for doing that.
> 
> Would it be possible to get an accounting of the disposition of the various
> review comments?  Of the "yes I did that" or "OK, but I'll do it later" or
> "no, you're an idiot" form?
> 
> 
> 
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys -- and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CIDÞVDEV
> _______________________________________________
> oprofile-list mailing list
> oprofile-list@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/oprofile-list

-- 

-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
  2006-09-27 22:48     ` Stephane Eranian
@ 2006-09-27 23:31       ` Andrew Morton
  -1 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2006-09-27 23:31 UTC (permalink / raw)
  To: eranian; +Cc: perfmon, linux-ia64, linux-kernel

On Wed, 27 Sep 2006 15:48:32 -0700
Stephane Eranian <eranian@hpl.hp.com> wrote:

> Andrew,
> 
> Here is the summary of the various point raised by your review and the current
> status. I am hoping to close all points by next release.
> 
> ...
> 
> [akpm]: use fget_light() in some place instead of fget()
> 	- not sure understand when to use one versus the other
>

They are always interchangeable.  fget_light() is simply an optimised,
messier-to-use version.

>
> ..
>
> [akpm]: carta_random32() should be in another header file
> 	- yes, I know. Should I create a specific header file? I don't think random.h
> 	  is meant for this.

I suppose so.  Or just stick the declaration into kernel.h.

I had a patch go past the other day which had a hand-rolled
fast-but-not-very-good pseudo random number generator in it.  I couldn't
remember where I'd seen one, and now I can't remember what patch it was
that needed it.  Sigh.

Anyway, a standalone patch which adds that function into lib/whatever.c
would be nice.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-27 23:31       ` Andrew Morton
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2006-09-27 23:31 UTC (permalink / raw)
  To: eranian; +Cc: perfmon, linux-ia64, linux-kernel

On Wed, 27 Sep 2006 15:48:32 -0700
Stephane Eranian <eranian@hpl.hp.com> wrote:

> Andrew,
> 
> Here is the summary of the various point raised by your review and the current
> status. I am hoping to close all points by next release.
> 
> ...
> 
> [akpm]: use fget_light() in some place instead of fget()
> 	- not sure understand when to use one versus the other
>

They are always interchangeable.  fget_light() is simply an optimised,
messier-to-use version.

>
> ..
>
> [akpm]: carta_random32() should be in another header file
> 	- yes, I know. Should I create a specific header file? I don't think random.h
> 	  is meant for this.

I suppose so.  Or just stick the declaration into kernel.h.

I had a patch go past the other day which had a hand-rolled
fast-but-not-very-good pseudo random number generator in it.  I couldn't
remember where I'd seen one, and now I can't remember what patch it was
that needed it.  Sigh.

Anyway, a standalone patch which adds that function into lib/whatever.c
would be nice.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
  2006-09-27 23:31       ` Andrew Morton
@ 2006-09-28  6:49         ` Stephane Eranian
  -1 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-28  6:49 UTC (permalink / raw)
  To: Andrew Morton; +Cc: perfmon, linux-ia64, linux-kernel

Andrew,

On Wed, Sep 27, 2006 at 04:31:00PM -0700, Andrew Morton wrote:
> > 
> > Here is the summary of the various point raised by your review and the current
> > status. I am hoping to close all points by next release.
> > 
> > ...
> > 
> > [akpm]: use fget_light() in some place instead of fget()
> > 	- not sure understand when to use one versus the other
> >
> 
> They are always interchangeable.  fget_light() is simply an optimised,
> messier-to-use version.

What are exactly the assumptions of fget_light()?

> 
> >
> > ..
> >
> > [akpm]: carta_random32() should be in another header file
> > 	- yes, I know. Should I create a specific header file? I don't think random.h
> > 	  is meant for this.
> 
> I suppose so.  Or just stick the declaration into kernel.h.
> 
> I had a patch go past the other day which had a hand-rolled
> fast-but-not-very-good pseudo random number generator in it.  I couldn't
> remember where I'd seen one, and now I can't remember what patch it was
> that needed it.  Sigh.
> 
> Anyway, a standalone patch which adds that function into lib/whatever.c
> would be nice.

I will post a standalone patch for carta random. I can provide a standalone header
file in include/linux/carta_random.h. 

-- 
-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-28  6:49         ` Stephane Eranian
  0 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-28  6:49 UTC (permalink / raw)
  To: Andrew Morton; +Cc: perfmon, linux-ia64, linux-kernel

Andrew,

On Wed, Sep 27, 2006 at 04:31:00PM -0700, Andrew Morton wrote:
> > 
> > Here is the summary of the various point raised by your review and the current
> > status. I am hoping to close all points by next release.
> > 
> > ...
> > 
> > [akpm]: use fget_light() in some place instead of fget()
> > 	- not sure understand when to use one versus the other
> >
> 
> They are always interchangeable.  fget_light() is simply an optimised,
> messier-to-use version.

What are exactly the assumptions of fget_light()?

> 
> >
> > ..
> >
> > [akpm]: carta_random32() should be in another header file
> > 	- yes, I know. Should I create a specific header file? I don't think random.h
> > 	  is meant for this.
> 
> I suppose so.  Or just stick the declaration into kernel.h.
> 
> I had a patch go past the other day which had a hand-rolled
> fast-but-not-very-good pseudo random number generator in it.  I couldn't
> remember where I'd seen one, and now I can't remember what patch it was
> that needed it.  Sigh.
> 
> Anyway, a standalone patch which adds that function into lib/whatever.c
> would be nice.

I will post a standalone patch for carta random. I can provide a standalone header
file in include/linux/carta_random.h. 

-- 
-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
  2006-09-28  6:49         ` Stephane Eranian
@ 2006-09-28  7:05           ` Andrew Morton
  -1 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2006-09-28  7:05 UTC (permalink / raw)
  To: eranian; +Cc: perfmon, linux-ia64, linux-kernel

On Wed, 27 Sep 2006 23:49:49 -0700
Stephane Eranian <eranian@hpl.hp.com> wrote:

> Andrew,
> 
> On Wed, Sep 27, 2006 at 04:31:00PM -0700, Andrew Morton wrote:
> > > 
> > > Here is the summary of the various point raised by your review and the current
> > > status. I am hoping to close all points by next release.
> > > 
> > > ...
> > > 
> > > [akpm]: use fget_light() in some place instead of fget()
> > > 	- not sure understand when to use one versus the other
> > >
> > 
> > They are always interchangeable.  fget_light() is simply an optimised,
> > messier-to-use version.
> 
> What are exactly the assumptions of fget_light()?

Just the ones in the comment, really - it assumes that the caller has a
single ref on the file.  ie: we're within a system call which passed in an
fd, or within a pagefault and this file* is current->mm->some_vma->vm_file.

You _might_ get into trouble if using it against some file* which belongs
to a different process, and if you don't already have a ref on that file*. 
But that's usually a bug anyway, unless the code is aware of and is
exploiting the RCU management of these things.

There are lots of example usages around the place.

> > 
> > >
> > > ..
> > >
> > > [akpm]: carta_random32() should be in another header file
> > > 	- yes, I know. Should I create a specific header file? I don't think random.h
> > > 	  is meant for this.
> > 
> > I suppose so.  Or just stick the declaration into kernel.h.
> > 
> > I had a patch go past the other day which had a hand-rolled
> > fast-but-not-very-good pseudo random number generator in it.  I couldn't
> > remember where I'd seen one, and now I can't remember what patch it was
> > that needed it.  Sigh.
> > 
> > Anyway, a standalone patch which adds that function into lib/whatever.c
> > would be nice.
> 
> I will post a standalone patch for carta random. I can provide a standalone header
> file in include/linux/carta_random.h. 

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-28  7:05           ` Andrew Morton
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2006-09-28  7:05 UTC (permalink / raw)
  To: eranian; +Cc: perfmon, linux-ia64, linux-kernel

On Wed, 27 Sep 2006 23:49:49 -0700
Stephane Eranian <eranian@hpl.hp.com> wrote:

> Andrew,
> 
> On Wed, Sep 27, 2006 at 04:31:00PM -0700, Andrew Morton wrote:
> > > 
> > > Here is the summary of the various point raised by your review and the current
> > > status. I am hoping to close all points by next release.
> > > 
> > > ...
> > > 
> > > [akpm]: use fget_light() in some place instead of fget()
> > > 	- not sure understand when to use one versus the other
> > >
> > 
> > They are always interchangeable.  fget_light() is simply an optimised,
> > messier-to-use version.
> 
> What are exactly the assumptions of fget_light()?

Just the ones in the comment, really - it assumes that the caller has a
single ref on the file.  ie: we're within a system call which passed in an
fd, or within a pagefault and this file* is current->mm->some_vma->vm_file.

You _might_ get into trouble if using it against some file* which belongs
to a different process, and if you don't already have a ref on that file*. 
But that's usually a bug anyway, unless the code is aware of and is
exploiting the RCU management of these things.

There are lots of example usages around the place.

> > 
> > >
> > > ..
> > >
> > > [akpm]: carta_random32() should be in another header file
> > > 	- yes, I know. Should I create a specific header file? I don't think random.h
> > > 	  is meant for this.
> > 
> > I suppose so.  Or just stick the declaration into kernel.h.
> > 
> > I had a patch go past the other day which had a hand-rolled
> > fast-but-not-very-good pseudo random number generator in it.  I couldn't
> > remember where I'd seen one, and now I can't remember what patch it was
> > that needed it.  Sigh.
> > 
> > Anyway, a standalone patch which adds that function into lib/whatever.c
> > would be nice.
> 
> I will post a standalone patch for carta random. I can provide a standalone header
> file in include/linux/carta_random.h. 

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
  2006-09-27 22:48     ` Stephane Eranian
@ 2006-09-28  7:32       ` Andi Kleen
  -1 siblings, 0 replies; 24+ messages in thread
From: Andi Kleen @ 2006-09-28  7:32 UTC (permalink / raw)
  To: eranian; +Cc: perfmon, linux-ia64, linux-kernel, akpm

Stephane Eranian <eranian@hpl.hp.com> writes:
> 
> [ak] : separate patch for _TIF_WORK_CTXSW
> 	- I think I submitted a TIF patch for x86-64, but unlike i386 it is not yet in mainline

If it's not in mainline yet I lost it somehow and you should resubmit.

> [ak] : may have to add __kprobes to some functions
> 	- started doing this on some functions. Need better understanding on when to use this

Basically when you could recurse in kprobes. 

> [ak] : cleaner integration with NMI watchdog
> 	- integration done on AMD K8. Issues on P4, P6, due to PMU design

What are the issues?

> [akpm]: documentation for syscall? Is there an API specification?
> 	- answered. In short, there exists a specification but it needs to be updated

Probably you should have man pages ready for submission to the manpage maintainer.
That might also the second review pass on l-k easier if you supply
them in the description.

-Andi

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-28  7:32       ` Andi Kleen
  0 siblings, 0 replies; 24+ messages in thread
From: Andi Kleen @ 2006-09-28  7:32 UTC (permalink / raw)
  To: eranian; +Cc: perfmon, linux-ia64, linux-kernel, akpm

Stephane Eranian <eranian@hpl.hp.com> writes:
> 
> [ak] : separate patch for _TIF_WORK_CTXSW
> 	- I think I submitted a TIF patch for x86-64, but unlike i386 it is not yet in mainline

If it's not in mainline yet I lost it somehow and you should resubmit.

> [ak] : may have to add __kprobes to some functions
> 	- started doing this on some functions. Need better understanding on when to use this

Basically when you could recurse in kprobes. 

> [ak] : cleaner integration with NMI watchdog
> 	- integration done on AMD K8. Issues on P4, P6, due to PMU design

What are the issues?

> [akpm]: documentation for syscall? Is there an API specification?
> 	- answered. In short, there exists a specification but it needs to be updated

Probably you should have man pages ready for submission to the manpage maintainer.
That might also the second review pass on l-k easier if you supply
them in the description.

-Andi

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
  2006-09-28  7:32       ` Andi Kleen
@ 2006-09-28  7:56         ` Stephane Eranian
  -1 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-28  7:56 UTC (permalink / raw)
  To: Andi Kleen; +Cc: perfmon, linux-ia64, linux-kernel, akpm

Andi,

On Thu, Sep 28, 2006 at 09:32:39AM +0200, Andi Kleen wrote:
> Stephane Eranian <eranian@hpl.hp.com> writes:
> > 
> > [ak] : separate patch for _TIF_WORK_CTXSW
> > 	- I think I submitted a TIF patch for x86-64, but unlike i386 it is not yet in mainline
> 
> If it's not in mainline yet I lost it somehow and you should resubmit.
> 
Will do.

> > [ak] : may have to add __kprobes to some functions
> > 	- started doing this on some functions. Need better understanding on when to use this
> 
> Basically when you could recurse in kprobes. 
> 
My understanding is that kprobes are triggered by breakpoints, so I am think that any 
perfmon function that can be called along the same path, i.e., traps, needs to have the
__kprobes prefix.

> > [ak] : cleaner integration with NMI watchdog
> > 	- integration done on AMD K8. Issues on P4, P6, due to PMU design
> 
> What are the issues?

This is ugly!

The P6 PMU actually has only one enable bit for all counters and it is in PERFEVTSEL0 which
you are using for NMI. Thus counters are NOT independent. Architectural perfmon looks like
it is fixing this issue.  I am not sure this is actually true based on the findings of the
PAPI people for instance.

The P4 PMU has independent counters, i.e., enable bits. The issue is that to stop a counter
requires clearing the CCCR which also contains the overflow information (has the counter
overflowed?). So you need to read the CCCR, save the value somewhere, clear the CCCR.
You need some save area that you can safely access without grabbing any lock (because you
are in the NMI handler). I cannot use the perfmon context because it could be accessed from
other processors, and I would need to grab the context lock. I need to investigate how to
do this in a different way. Maybe change the logic used to detect which counters overflowed
by not using CCCR.

> 
> > [akpm]: documentation for syscall? Is there an API specification?
> > 	- answered. In short, there exists a specification but it needs to be updated
> 
> Probably you should have man pages ready for submission to the manpage maintainer.
> That might also the second review pass on l-k easier if you supply
> them in the description.

I don't have the man pages ready yet.

-- 
-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-28  7:56         ` Stephane Eranian
  0 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-28  7:56 UTC (permalink / raw)
  To: Andi Kleen; +Cc: perfmon, linux-ia64, linux-kernel, akpm

Andi,

On Thu, Sep 28, 2006 at 09:32:39AM +0200, Andi Kleen wrote:
> Stephane Eranian <eranian@hpl.hp.com> writes:
> > 
> > [ak] : separate patch for _TIF_WORK_CTXSW
> > 	- I think I submitted a TIF patch for x86-64, but unlike i386 it is not yet in mainline
> 
> If it's not in mainline yet I lost it somehow and you should resubmit.
> 
Will do.

> > [ak] : may have to add __kprobes to some functions
> > 	- started doing this on some functions. Need better understanding on when to use this
> 
> Basically when you could recurse in kprobes. 
> 
My understanding is that kprobes are triggered by breakpoints, so I am think that any 
perfmon function that can be called along the same path, i.e., traps, needs to have the
__kprobes prefix.

> > [ak] : cleaner integration with NMI watchdog
> > 	- integration done on AMD K8. Issues on P4, P6, due to PMU design
> 
> What are the issues?

This is ugly!

The P6 PMU actually has only one enable bit for all counters and it is in PERFEVTSEL0 which
you are using for NMI. Thus counters are NOT independent. Architectural perfmon looks like
it is fixing this issue.  I am not sure this is actually true based on the findings of the
PAPI people for instance.

The P4 PMU has independent counters, i.e., enable bits. The issue is that to stop a counter
requires clearing the CCCR which also contains the overflow information (has the counter
overflowed?). So you need to read the CCCR, save the value somewhere, clear the CCCR.
You need some save area that you can safely access without grabbing any lock (because you
are in the NMI handler). I cannot use the perfmon context because it could be accessed from
other processors, and I would need to grab the context lock. I need to investigate how to
do this in a different way. Maybe change the logic used to detect which counters overflowed
by not using CCCR.

> 
> > [akpm]: documentation for syscall? Is there an API specification?
> > 	- answered. In short, there exists a specification but it needs to be updated
> 
> Probably you should have man pages ready for submission to the manpage maintainer.
> That might also the second review pass on l-k easier if you supply
> them in the description.

I don't have the man pages ready yet.

-- 
-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
  2006-09-28  7:56         ` Stephane Eranian
@ 2006-09-28  8:05           ` Andi Kleen
  -1 siblings, 0 replies; 24+ messages in thread
From: Andi Kleen @ 2006-09-28  8:05 UTC (permalink / raw)
  To: eranian; +Cc: perfmon, linux-ia64, linux-kernel, akpm

On Thursday 28 September 2006 09:56, Stephane Eranian wrote:
> Andi,
> 
> 
> > > [ak] : may have to add __kprobes to some functions
> > > 	- started doing this on some functions. Need better understanding on when to use this
> > 
> > Basically when you could recurse in kprobes. 
> > 
> My understanding is that kprobes are triggered by breakpoints, so I am think that any 
> perfmon function that can be called along the same path, i.e., traps, needs to have the
> __kprobes prefix.

Only when you call perfmon from the int3 path or any code that is shared.

But it is actually more complicated because page faults can be used by
kprobes too.

> > > [ak] : cleaner integration with NMI watchdog
> > > 	- integration done on AMD K8. Issues on P4, P6, due to PMU design
> > 
> > What are the issues?
> 
> This is ugly!
> 
> The P6 PMU actually has only one enable bit for all counters and it is in PERFEVTSEL0 which
> you are using for NMI. Thus counters are NOT independent. Architectural perfmon looks like
> it is fixing this issue.  I am not sure this is actually true based on the findings of the
> PAPI people for instance.
> 
> The P4 PMU has independent counters, i.e., enable bits. The issue is that to stop a counter
> requires clearing the CCCR which also contains the overflow information (has the counter
> overflowed?). So you need to read the CCCR, save the value somewhere, clear the CCCR.
> You need some save area that you can safely access without grabbing any lock (because you
> are in the NMI handler).

Not sure what the lock would be needed for. It is only a per CPU variable that doesn't
need synchronization no? 

-Andi

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-28  8:05           ` Andi Kleen
  0 siblings, 0 replies; 24+ messages in thread
From: Andi Kleen @ 2006-09-28  8:05 UTC (permalink / raw)
  To: eranian; +Cc: perfmon, linux-ia64, linux-kernel, akpm

On Thursday 28 September 2006 09:56, Stephane Eranian wrote:
> Andi,
> 
> 
> > > [ak] : may have to add __kprobes to some functions
> > > 	- started doing this on some functions. Need better understanding on when to use this
> > 
> > Basically when you could recurse in kprobes. 
> > 
> My understanding is that kprobes are triggered by breakpoints, so I am think that any 
> perfmon function that can be called along the same path, i.e., traps, needs to have the
> __kprobes prefix.

Only when you call perfmon from the int3 path or any code that is shared.

But it is actually more complicated because page faults can be used by
kprobes too.

> > > [ak] : cleaner integration with NMI watchdog
> > > 	- integration done on AMD K8. Issues on P4, P6, due to PMU design
> > 
> > What are the issues?
> 
> This is ugly!
> 
> The P6 PMU actually has only one enable bit for all counters and it is in PERFEVTSEL0 which
> you are using for NMI. Thus counters are NOT independent. Architectural perfmon looks like
> it is fixing this issue.  I am not sure this is actually true based on the findings of the
> PAPI people for instance.
> 
> The P4 PMU has independent counters, i.e., enable bits. The issue is that to stop a counter
> requires clearing the CCCR which also contains the overflow information (has the counter
> overflowed?). So you need to read the CCCR, save the value somewhere, clear the CCCR.
> You need some save area that you can safely access without grabbing any lock (because you
> are in the NMI handler).

Not sure what the lock would be needed for. It is only a per CPU variable that doesn't
need synchronization no? 

-Andi

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
  2006-09-27 22:48     ` Stephane Eranian
@ 2006-09-28 13:41       ` Christoph Hellwig
  -1 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2006-09-28 13:41 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Andrew Morton, perfmon, linux-ia64, linux-kernel

this completly leaves out my comments, which you also seemed to ignore before
:)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-28 13:41       ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2006-09-28 13:41 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Andrew Morton, perfmon, linux-ia64, linux-kernel

this completly leaves out my comments, which you also seemed to ignore before
:)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
  2006-09-28 13:41       ` Christoph Hellwig
@ 2006-09-28 14:04         ` Stephane Eranian
  -1 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-28 14:04 UTC (permalink / raw)
  To: Christoph Hellwig, Andrew Morton, perfmon, linux-ia64, linux-kernel

Chritoph,

On Thu, Sep 28, 2006 at 02:41:26PM +0100, Christoph Hellwig wrote:
> this completly leaves out my comments, which you also seemed to ignore before
> :)

It is true I forgot to list your comments in the summary I sent out yesterday.
However rest assured that I took them into account. If you noticed in my annoucement
for 2.6.18, I have removed the perfmon KAPI interface, for instance. As for the UUID
 to string conversion, this was also pointed out by Andrew and it is work-in-progress.

So here is the additional feedback I will put in the summary:

[hch]: remove perfmon kernel-level API (perfmon_kapi.c), it is not justified
	- done

[hch]: use strings insead of UUID
	- work in progress

In general, I do listen to comments. The summary of changes is a good proof
of that, I think.

Thanks.

-- 
-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-28 14:04         ` Stephane Eranian
  0 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-28 14:04 UTC (permalink / raw)
  To: Christoph Hellwig, Andrew Morton, perfmon, linux-ia64, linux-kernel

Chritoph,

On Thu, Sep 28, 2006 at 02:41:26PM +0100, Christoph Hellwig wrote:
> this completly leaves out my comments, which you also seemed to ignore before
> :)

It is true I forgot to list your comments in the summary I sent out yesterday.
However rest assured that I took them into account. If you noticed in my annoucement
for 2.6.18, I have removed the perfmon KAPI interface, for instance. As for the UUID
 to string conversion, this was also pointed out by Andrew and it is work-in-progress.

So here is the additional feedback I will put in the summary:

[hch]: remove perfmon kernel-level API (perfmon_kapi.c), it is not justified
	- done

[hch]: use strings insead of UUID
	- work in progress

In general, I do listen to comments. The summary of changes is a good proof
of that, I think.

Thanks.

-- 
-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
  2006-09-28  8:05           ` Andi Kleen
@ 2006-09-29  9:30             ` Stephane Eranian
  -1 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-29  9:30 UTC (permalink / raw)
  To: Andi Kleen; +Cc: perfmon, linux-ia64, linux-kernel, akpm

Andi,

On Thu, Sep 28, 2006 at 10:05:08AM +0200, Andi Kleen wrote:
> > PAPI people for instance.
> > 
> > The P4 PMU has independent counters, i.e., enable bits. The issue is that to stop a counter
> > requires clearing the CCCR which also contains the overflow information (has the counter
> > overflowed?). So you need to read the CCCR, save the value somewhere, clear the CCCR.
> > You need some save area that you can safely access without grabbing any lock (because you
> > are in the NMI handler).
> 
> Not sure what the lock would be needed for. It is only a per CPU variable that doesn't
> need synchronization no?

The CCCR register is by definition a per-CPU entity. However, the perfmon context where the
CCCR is saved is not. Any thread with access to the file descriptor can gain access to the
context. This can occur from any CPU. Of course, we do have checks in place but they
run after the context lock is held. The main restriction is that a thread cannot operate on
a context attached to another thread unless that thread is stopped (checked via ptrace_check_attach).
Yet there are a few perfmon syscalls, for which this restriction does not apply because they
do not need to touch the PMU hardware, yet they modify the context state.

I think this could work on P4, if we could clear the CCCR and yet retain enough state to detect
which counter(s) overflowed. On P6, where there is not overflow bit in the control register, we
simply check the upper bits on the counters. We could probably do the same for P4.

-- 
-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.18 perfmon new code base + libpfm + pfmon
@ 2006-09-29  9:30             ` Stephane Eranian
  0 siblings, 0 replies; 24+ messages in thread
From: Stephane Eranian @ 2006-09-29  9:30 UTC (permalink / raw)
  To: Andi Kleen; +Cc: perfmon, linux-ia64, linux-kernel, akpm

Andi,

On Thu, Sep 28, 2006 at 10:05:08AM +0200, Andi Kleen wrote:
> > PAPI people for instance.
> > 
> > The P4 PMU has independent counters, i.e., enable bits. The issue is that to stop a counter
> > requires clearing the CCCR which also contains the overflow information (has the counter
> > overflowed?). So you need to read the CCCR, save the value somewhere, clear the CCCR.
> > You need some save area that you can safely access without grabbing any lock (because you
> > are in the NMI handler).
> 
> Not sure what the lock would be needed for. It is only a per CPU variable that doesn't
> need synchronization no?

The CCCR register is by definition a per-CPU entity. However, the perfmon context where the
CCCR is saved is not. Any thread with access to the file descriptor can gain access to the
context. This can occur from any CPU. Of course, we do have checks in place but they
run after the context lock is held. The main restriction is that a thread cannot operate on
a context attached to another thread unless that thread is stopped (checked via ptrace_check_attach).
Yet there are a few perfmon syscalls, for which this restriction does not apply because they
do not need to touch the PMU hardware, yet they modify the context state.

I think this could work on P4, if we could clear the CCCR and yet retain enough state to detect
which counter(s) overflowed. On P6, where there is not overflow bit in the control register, we
simply check the upper bits on the counters. We could probably do the same for P4.

-- 
-Stephane

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2006-09-29  9:30 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-09-26 14:34 2.6.18 perfmon new code base + libpfm + pfmon Stephane Eranian
2006-09-26 14:34 ` Stephane Eranian
2006-09-27  5:09 ` Andrew Morton
2006-09-27  5:09   ` Andrew Morton
2006-09-27 22:48   ` Stephane Eranian
2006-09-27 22:48     ` Stephane Eranian
2006-09-27 23:31     ` Andrew Morton
2006-09-27 23:31       ` Andrew Morton
2006-09-28  6:49       ` Stephane Eranian
2006-09-28  6:49         ` Stephane Eranian
2006-09-28  7:05         ` Andrew Morton
2006-09-28  7:05           ` Andrew Morton
2006-09-28  7:32     ` Andi Kleen
2006-09-28  7:32       ` Andi Kleen
2006-09-28  7:56       ` Stephane Eranian
2006-09-28  7:56         ` Stephane Eranian
2006-09-28  8:05         ` Andi Kleen
2006-09-28  8:05           ` Andi Kleen
2006-09-29  9:30           ` Stephane Eranian
2006-09-29  9:30             ` Stephane Eranian
2006-09-28 13:41     ` Christoph Hellwig
2006-09-28 13:41       ` Christoph Hellwig
2006-09-28 14:04       ` Stephane Eranian
2006-09-28 14:04         ` Stephane Eranian

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.