All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mel Gorman <mel@csn.ul.ie>,
	Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
	Jim Keniston <jkenisto@linux.vnet.ibm.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	"Frank Ch. Eigler" <fche@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Roland McGrath <roland@redhat.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH v1 7/10] Uprobes Implementation
Date: Tue, 23 Mar 2010 16:15:11 +0100	[thread overview]
Message-ID: <1269357311.5109.81.camel@twins> (raw)
In-Reply-To: <4BA8CE39.8050203@redhat.com>

On Tue, 2010-03-23 at 10:20 -0400, Masami Hiramatsu wrote:

> > And yes, all processes that share that DSO, consumers can install
> > filters.
> 
> Hmm, for low-level interface, it will be good. If we provide
> a user interface(trace_uprobe.c), we'd better add pid filter
> for it.

ftrace already has pid filters.

> >>> Also, like we discussed in person, I think we can do away with the
> >>> handler_in_interrupt thing by letting the handler have an error return
> >>> value and doing something like:
> >>>
> >>> do_int3:
> >>>
> >>>   uprobe = find_probe_point(addr);
> >>>
> >>>   pagefault_disable();
> >>>   err = uprobe->handler(uprobe, regs);
> >>>   pagefault_enable();
> >>>
> >>>   if (err == -EFAULT) {
> >>>     /* set TIF flag and call the handler again from
> >>>        task context */
> >>>   }
> >>>
> >>> This should allow the handler to optimistically access memory from the
> >>> trap handler, but in case it does need to fault pages in we'll call it
> >>> from task context.
> >>
> >> Okay but what if the handler is coded to sleep.
> > 
> > Don't do that ;-)
> > 
> > What reason would you have to sleep from a int3 anyway? You want to log
> > bits and get on with life, right? The only interesting case is faulting
> > when some memory references you want are not currently available, and
> > that can be done as suggested.
> 
> Out of curiously, what does the task-context mean? ('current' is probed
> task in int3, isn't it?). I think, uprobe handler can cause page fault
> (and should sleep) if the page is swapped out.

Task context means the regular kernel task stack where we can schedule,
int3 has its own exception stack and we cannot schedule from that.

And yes, the fault thing is the one case where sleeping makes sense and
is dealt with in my proposal, you don't need two handlers for that, just
call it from trap context with pagefault_disable() and when it fails
with -EFAULT set a TIF flag to deal with it later when we're back in
task context.

There is a very good probability that the memory you want to reference
is mapped (because typically the program itself will want to access it
as well) so doing the optimistic access with pagefault_disabled() will
work most of the times and you only end up taking the slow path when it
does indeed fault.

> >>> Everybody else simply places callbacks in kernel/fork.c and
> >>> kernel/exit.c, but as it is I don't think you want per-task state like
> >>> this.
> >>>
> >>> One thing I would like to see is a slot per task, that has a number of
> >>> advantages over the current patch-set in that it doesn't have one page
> >>> limit in number of probe sites, nor do you need to insert vmas into each
> >>> and every address space that happens to have your DSO mapped.
> >>>
> >>
> >> where are the per task slots stored?
> >> or Are you looking at a XOL vma area per DSO?
> > 
> > The per task slot (note the singular, each task needs only ever have a
> > single slot since a task can only ever hit one trap at a time) would
> > live in the task TLS or task stack.
> 
> Hmm, I just worried about whether TLS/task stack can be executable
> (no one set NX bit).

You can remove the NX bit from that one page I guess.

> >>> Also, I would simply kill the user_bkpt stuff and merge it into uprobes,
> >>> we don't have a kernel_bkpt thing either, only kprobes.
> >>>
> >>
> >> We had uprobes as one single layer. However it was suggested that
> >> breaking it up into two layers was useful because it would help code
> >> reuse. Esp it was felt that a generic user_bkpt layer would be far more
> >> useful than being used for just uprobes.
> >> Here are links where these discussion happened.
> > 
> > I'm so not going to read ancient emails on a funky list. What re-use?
> > uprobe should be the only interface to this, there's no second interface
> > to kprobes either is there?
> 
> It will be good when we start working on 'ptrace2' :)
> Anyway, the patch order looks a bit odd, because user_bkpt uses XOL
> but XOL patch is introduced after user_bkpt patch...

But why would ptrace2 use a different interface? Also, why introduce
some abstraction layer now without having a user for it, you could
always restructure things and or add interfaces later when you have a
clear idea what it is you need.

  reply	other threads:[~2010-03-23 15:16 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-20 14:24 [PATCH v1 0/10] Uprobes patches Srikar Dronamraju
2010-03-20 14:25 ` [PATCH v1 1/10] Move Macro W to insn.h Srikar Dronamraju
2010-03-20 15:50   ` Masami Hiramatsu
2010-03-22  6:24     ` Srikar Dronamraju
2010-03-22 14:11       ` Masami Hiramatsu
2010-03-20 14:25 ` [PATCH v1 2/10] Move replace_page() to mm/memory.c Srikar Dronamraju
2010-03-20 14:25 ` [PATCH v1 3/10] Enhance replace_page() to support pagecache Srikar Dronamraju
2010-03-20 14:25 ` [PATCH v1 4/10] User Space Breakpoint Assistance Layer Srikar Dronamraju
2010-03-23  1:40   ` Andrew Morton
2010-03-23  4:48     ` Randy Dunlap
2010-03-23 11:26     ` Srikar Dronamraju
2010-03-20 14:25 ` [PATCH v1 5/10] X86 details for user space breakpoint assistance Srikar Dronamraju
2010-03-20 14:26 ` [PATCH v1 6/10] Slot allocation for Execution out of line Srikar Dronamraju
2010-03-20 14:26 ` [PATCH v1 7/10] Uprobes Implementation Srikar Dronamraju
2010-03-23 11:01   ` Peter Zijlstra
2010-03-23 11:04     ` Peter Zijlstra
2010-03-23 12:23     ` Srikar Dronamraju
2010-03-23 13:46       ` Peter Zijlstra
2010-03-23 14:20         ` Masami Hiramatsu
2010-03-23 15:15           ` Peter Zijlstra [this message]
2010-03-23 17:36             ` Masami Hiramatsu
2010-03-24 10:22           ` Srikar Dronamraju
2010-03-23 15:05         ` Ananth N Mavinakayanahalli
2010-03-23 15:15           ` Peter Zijlstra
2010-03-23 15:26             ` Frank Ch. Eigler
2010-03-24  5:59             ` Ananth N Mavinakayanahalli
2010-03-24  7:58         ` Srikar Dronamraju
2010-03-24 13:00           ` Peter Zijlstra
2010-03-25  7:56             ` Srikar Dronamraju
2010-03-25  8:41             ` Srikar Dronamraju
2010-03-20 14:26 ` [PATCH v1 8/10] X86 details for uprobes Srikar Dronamraju
2010-03-20 14:26 ` [PATCH v1 9/10] Uprobes Documentation patch Srikar Dronamraju
2010-03-22  3:00   ` Randy Dunlap
2010-03-22  5:34     ` Srikar Dronamraju
2010-03-22 14:51       ` Randy Dunlap
2010-03-20 14:26 ` [PATCH v1 10/10] Uprobes samples Srikar Dronamraju
2010-03-23  1:38 ` [PATCH v1 0/10] Uprobes patches Andrew Morton
2010-03-23 10:55   ` Srikar Dronamraju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1269357311.5109.81.camel@twins \
    --to=peterz@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=ananth@in.ibm.com \
    --cc=fche@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=hch@infradead.org \
    --cc=jkenisto@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mel@csn.ul.ie \
    --cc=mhiramat@redhat.com \
    --cc=mingo@elte.hu \
    --cc=oleg@redhat.com \
    --cc=roland@redhat.com \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.