From: Ingo Molnar <mingo@elte.hu>
To: Avi Kivity <avi@redhat.com>
Cc: Robert Richter <robert.richter@amd.com>,
	Benjamin Block <bebl@mageta.org>,
	Hans Rosenfeld <hans.rosenfeld@amd.com>,
	hpa@zytor.com, tglx@linutronix.de, suresh.b.siddha@intel.com,
	eranian@google.com, brgerst@gmail.com, Andreas.Herrmann3@amd.com,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Benjamin Block <benjamin.block@amd.com>
Subject: Re: [RFC 4/5] x86, perf: implements lwp-perf-integration (rc1)
Date: Mon, 19 Dec 2011 12:40:23 +0100
Message-ID: <20111219114023.GB29855@elte.hu>
In-Reply-To: <4EEF1C3B.3010307@redhat.com>


* Avi Kivity <avi@redhat.com> wrote:

> On 12/19/2011 12:54 PM, Ingo Molnar wrote:
> > * Robert Richter <robert.richter@amd.com> wrote:
> >
> > > On 19.12.11 00:43:10, Ingo Molnar wrote:
> > >
> > > > So the question becomes, how well is it integrated: can perf 
> > > > 'record -a + perf report', or 'perf top' use LWP, to do 
> > > > system-wide precise [user-space] profiling and such?
> > > 
> > > There is only self-monitoring of a process possible, no 
> > > kernel and system-wide profiling. This is because we can 
> > > not allocate memory regions in the kernel for a thread 
> > > other than the current. This would require a complete 
> > > rework of mm code.
> >
> > Hm, i don't think a rework is needed: check the 
> > vmalloc_to_page() code in kernel/events/ring_buffer.c. Right 
> > now CONFIG_PERF_USE_VMALLOC is an ARM, MIPS, SH and Sparc 
> > specific feature, on x86 it turns on if 
> > CONFIG_DEBUG_PERF_USE_VMALLOC=y.
> >
> > That should be good enough for prototyping the kernel/user 
> > shared buffering approach.
> 
> LWP wants user memory, vmalloc is insufficient.  You need 
> do_mmap() with a different mm.

Take a look at PERF_USE_VMALLOC, it allows in-kernel allocated 
memory to be mmap()ed to user-space. It is basically a 
shared/dual user/kernel mode vmalloc implementation.

So all the conceptual pieces are there.
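
For reference, the user-space side of that shared buffer can be sketched 
as below. This is only a minimal illustration, assuming a 
self-monitoring software event (PERF_COUNT_SW_CPU_CLOCK) rather than 
anything LWP specific. The helper names perf_mmap_len() and 
map_self_perf_buffer() are made up for this example. Whether the pages 
behind the mapping come from vmalloc() or from the page allocator is 
invisible at this level.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Length of a perf mmap() area: one metadata page followed by
 * data_pages pages of ring-buffer data (data_pages must be a power
 * of two).
 */
size_t perf_mmap_len(size_t page_size, size_t data_pages)
{
	return (1 + data_pages) * page_size;
}

/*
 * Hypothetical helper: open a sampling event on the calling thread
 * and map its kernel-allocated buffer. Returns NULL if the kernel
 * refuses, e.g. because of perf_event_paranoid or a sandbox.
 */
void *map_self_perf_buffer(size_t data_pages, int *fd_out)
{
	struct perf_event_attr attr;
	size_t len = perf_mmap_len((size_t)sysconf(_SC_PAGESIZE), data_pages);
	void *buf;
	int fd;

	assert(fd_out != NULL);

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;
	attr.config = PERF_COUNT_SW_CPU_CLOCK;
	attr.sample_period = 1000000;
	attr.sample_type = PERF_SAMPLE_IP;
	attr.exclude_kernel = 1;
	attr.exclude_hv = 1;
	attr.disabled = 1;

	/* pid = 0, cpu = -1: this thread, any CPU; no group fd, no flags */
	fd = (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL);
	if (fd < 0)
		return NULL;

	/* The kernel allocated the pages; mmap() only exposes them. */
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (buf == MAP_FAILED) {
		close(fd);
		return NULL;
	}

	*fd_out = fd;
	return buf;
}
```

A caller would pass e.g. 8 data pages and read samples out of the area 
behind the first (metadata) page.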

> You could let a workqueue call use_mm() and then do_mmap().  
> Even then it is subject to disruption by the monitored thread 
> (and may disrupt the monitored thread by playing with its 
> address space). [...]

Injecting this into another thread's context is indeed advanced 
stuff:

> [...] This is for thread monitoring only, I don't think 
> system-wide monitoring is possible with LWP.

That should be possible too, via two methods:

1) the easy hack: a (per cpu) vmalloc()ed buffer is made ring 3 
   accessible (by setting the user bit in the ptes) - and 
   thus accessible to all user-space.

   This is obviously globally writable/readable memory, so it is 
   only a debugging/prototyping hack - but it would be a great 
   first step to prove the concept and see some nice perf top and 
   perf record results ...

2) the proper solution: creating a 'user-space vmalloc()' that 
   is per mm and that gets inherited transparently, across 
   fork() and exec(), and which lies outside the regular vma
   spaces. On 64-bit this should be straightforward.

   These vmas are not actually 'known' to user-space normally -
   the kernel PMU code knows about them and does what we do with
   PEBS: flushes them when necessary and puts the data into the
   regular perf event channels.

   This solves the inherited perf record workflow immediately:
   the parent task just creates the buffer, which gets inherited 
   across exec() and fork(), into every portion of the workload.

   System-wide profiling is a small additional variant of this: 
   creating such a user-vmalloc() area for all tasks in the
   system so that the PMU code has them ready in the 
   context-switch code.
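
For the easy hack in 1), the x86 pte bit that decides ring 3 access is 
U/S (bit 2): when it is set, user-mode accesses are allowed. A minimal 
sketch of the bit manipulation follows, as a plain user-space model 
rather than kernel code. The PTE_* macros and function names are made up 
for illustration (the kernel has its own _PAGE_USER helpers), and on 
real hardware the upper-level page-table entries covering the range 
need the bit set as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* x86 page-table entry flag bits (architectural bit positions). */
#define PTE_PRESENT	(UINT64_C(1) << 0)
#define PTE_RW		(UINT64_C(1) << 1)
#define PTE_USER	(UINT64_C(1) << 2)	/* U/S: 1 = ring 3 may access */

/* Return the entry with user-mode access enabled. */
uint64_t pte_mkuser(uint64_t pte)
{
	return pte | PTE_USER;
}

/* A page is reachable from ring 3 only if present and U/S is set. */
bool pte_user_accessible(uint64_t pte)
{
	const uint64_t need = PTE_PRESENT | PTE_USER;

	return (pte & need) == need;
}

/* Open up every present entry backing a (hypothetical) per-cpu buffer. */
void buffer_mkuser(uint64_t *ptes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (ptes[i] & PTE_PRESENT)
			ptes[i] = pte_mkuser(ptes[i]);
}
```

Every task sees the same mapping after this, which is exactly why it is 
only suitable for prototyping.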

Solution #2 has the additional advantage that we could migrate 
PEBS to it and could allow interested user-space access to the 
'raw' PEBS buffer as well. (currently the PEBS buffer is only 
visible to kernel-space.)
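
The flush step in solution #2 (hardware fills a per-mm buffer, the PMU 
code drains it into the regular event channel) can be modelled roughly 
as below. This is a toy sketch with made-up names (struct hw_buffer, 
hw_buffer_flush(), struct event_channel), not the LWP or PEBS record 
layout and not actual kernel code. It only shows the producer/consumer 
split: records are dropped when the buffer is full, so the drain has to 
run often enough, e.g. on a threshold interrupt or at context switch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HW_BUF_RECORDS	8	/* hypothetical capacity, power of two */

/* Stands in for the per-mm buffer that the hardware writes into. */
struct hw_buffer {
	uint64_t rec[HW_BUF_RECORDS];
	size_t head;		/* next slot the hardware writes */
	size_t tail;		/* next slot the kernel reads */
};

/* Stands in for the regular perf event channel. */
struct event_channel {
	uint64_t sum;
	size_t count;
};

typedef void (*emit_fn)(uint64_t record, void *ctx);

/* Example sink: account for each record in the channel. */
void channel_emit(uint64_t record, void *ctx)
{
	struct event_channel *ch = ctx;

	ch->sum += record;
	ch->count++;
}

/* Hardware side: append one record; returns 0 if it was dropped. */
int hw_buffer_write(struct hw_buffer *b, uint64_t record)
{
	if (b->head - b->tail == HW_BUF_RECORDS)
		return 0;
	b->rec[b->head % HW_BUF_RECORDS] = record;
	b->head++;
	return 1;
}

/* Kernel side: move all pending records into the event channel. */
size_t hw_buffer_flush(struct hw_buffer *b, emit_fn emit, void *ctx)
{
	size_t n = 0;

	while (b->tail != b->head) {
		emit(b->rec[b->tail % HW_BUF_RECORDS], ctx);
		b->tail++;
		n++;
	}
	return n;
}
```

Giving user-space access to the 'raw' buffer would correspond to letting 
it read rec[] directly instead of going through the emit callback.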

I'd suggest the easy hack first, to get things going - we can 
then help out with the proper solution.

Thanks,

	Ingo


