All of lore.kernel.org
 help / color / mirror / Atom feed
From: Masami Hiramatsu <mhiramat@kernel.org>
To: Aleksa Sarai <asarai@suse.de>
Cc: Aleksa Sarai <cyphar@cyphar.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>,
	Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jonathan Corbet <corbet@lwn.net>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Shuah Khan <shuah@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Brendan Gregg <bgregg@netflix.com>,
	Christian Brauner <christian@brauner.io>,
	netdev@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH v3 1/2] kretprobe: produce sane stack traces
Date: Sun, 11 Nov 2018 00:31:37 +0900	[thread overview]
Message-ID: <20181111003137.c9df7a077d983cde57c06ee8@kernel.org> (raw)
In-Reply-To: <20181109150629.wpedwxsgbftkl3ab@mikami>

On Sat, 10 Nov 2018 02:06:29 +1100
Aleksa Sarai <asarai@suse.de> wrote:

> On 2018-11-09, Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
> > > index ee696efec99f..c4dfafd43e11 100644
> > > --- a/arch/x86/include/asm/ptrace.h
> > > +++ b/arch/x86/include/asm/ptrace.h
> > > @@ -172,6 +172,7 @@ static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
> > >  	return regs->sp;
> > >  }
> > >  #endif
> > > +#define stack_addr(regs) ((unsigned long *) kernel_stack_pointer(regs))
> > 
> > No, you should use kernel_stack_pointer(regs) itself instead of stack_addr().
> > 
> > > 
> > >  #define GET_IP(regs) ((regs)->ip)
> > >  #define GET_FP(regs) ((regs)->bp)
> > > diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> > > index b0d1e81c96bb..eb4da885020c 100644
> > > --- a/arch/x86/kernel/kprobes/core.c
> > > +++ b/arch/x86/kernel/kprobes/core.c
> > > @@ -69,8 +69,6 @@
> > >  DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
> > >  DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
> > >  
> > > -#define stack_addr(regs) ((unsigned long *)kernel_stack_pointer(regs))
> > 
> > I don't like keeping this meaningless macro... this should be replaced with generic
> > kernel_stack_pointer() macro.
> 
> Sure. This patch was just an example -- I can remove stack_addr() all
> over.
> 
> > > -	if (regs)
> > > -		save_stack_address(trace, regs->ip, nosched);
> > > +	if (regs) {
> > > +		/* XXX: Currently broken -- stack_addr(regs) doesn't match entry. */
> > > +		addr = regs->ip;
> > 
> > Since this part is for storing regs->ip as a top of call-stack, this
> > seems correct code. Stack unwind will be done next block.
> 
> This comment was referring to the usage of stack_addr(). stack_addr()
> doesn't give you the right result (it isn't the address of the return
> address -- it's slightly wrong). This is the main issue I was having --
> am I doing something wrong here?

Of course stack_addr() actually just returns where the stack is. It should
not return address, but maybe a return address from this event happens.
Note that the "regs != NULL" means you will be in the interrupt handler
and it will be returned to the regs->ip.


> > > +		//addr = ftrace_graph_ret_addr(current, &state.graph_idx, addr, stack_addr(regs));
> > 
> > so func graph return trampoline address will be shown only when unwinding stack entries.
> > I mean func-graph tracer is not used as an event, so it never kicks stackdump.
> 
> Just to make sure I understand what you're saying -- func-graph trace
> will never actually call __ftrace_stack_trace? Because if it does, then
> this code will be necessary (and then I'm a bit confused why the
> unwinder has func-graph trace code -- if stack traces are never taken
> under func-graph then the code in the unwinder is not necessary)

You seems misunderstanding. Even if this is not called from func-graph
tracer, the stack entries are already replaced with func-graph trampoline.
However, regs->ip (IRQ return address) is never replaced by the func-graph
trampoline.

> My reason for commenting this out is because at this point "state" isn't
> initialised and thus .graph_idx would not be correctly handled during
> unwind (and it's the same reason I commented it out later).

OK, but anyway, I think we don't need it.

> > > +		addr = kretprobe_ret_addr(current, addr, stack_addr(regs));
> > 
> > But since kretprobe will be an event, which can kick the stackdump.
> > BTW, from kretprobe, regs->ip should always be the trampoline handler, 
> > see arch/x86/kernel/kprobes/core.c:772 :-)
> > So it must be fixed always.
> 
> Right, but kretprobe_ret_addr() is returning the *original* return
> address (and we need to do an (addr == kretprobe_trampoline)). The
> real problem is that stack_addr(regs) isn't the same as it is during
> kretprobe setup (but kretprobe_ret_addr() works everywhere else).

I think stack_addr(regs) should be same when this is called from kretprobe
handler context. Otherwise, yes, it is not same, but in that case, regs->ip
is not kretprobe_trampoline too.

If you find kretprobe_trampoline on the "stack", of course it's address should be
same as it is during kretprobe setup, but if you find kretprobe_trampoline on the
regs->ip, that should always happen on kretprobe handler context. Otherwise,
some critical violation happens on kretprobe_trampoline. In that case, we should
dump the kretprobe_trampoline address itself, should not recover it.

> > > @@ -1856,6 +1870,41 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
> > >  }
> > >  NOKPROBE_SYMBOL(pre_handler_kretprobe);
> > >  
> > > +unsigned long kretprobe_ret_addr(struct task_struct *tsk, unsigned long ret,
> > > +				 unsigned long *retp)
> > > +{
> > > +	struct kretprobe_instance *ri;
> > > +	unsigned long flags = 0;
> > > +	struct hlist_head *head;
> > > +	bool need_lock;
> > > +
> > > +	if (likely(ret != (unsigned long) &kretprobe_trampoline))
> > > +		return ret;
> > > +
> > > +	need_lock = !kretprobe_hash_is_locked(tsk);
> > > +	if (WARN_ON(need_lock))
> > > +		kretprobe_hash_lock(tsk, &head, &flags);
> > > +	else
> > > +		head = kretprobe_inst_table_head(tsk);
> > 
> > This may not work unless this is called from the kretprobe handler context,
> > since if we are out of kretprobe handler context, another CPU can lock the
> > hash table and it can be detected by kretprobe_hash_is_locked();.
> 
> Yeah, I noticed this as well when writing it (but needed a quick impl
> that I could test). I will fix this, thanks!
> 
> By is_kretprobe_handler_context() I imagine you are referring to
> checking is_kretprobe(current_kprobe())?

yes, that's correct :)

Thank you,

> 
> > So, we should check we are in the kretprobe handler context if tsk == current,
> > if not, we definately can lock the hash lock without any warning. This can
> > be something like;
> > 
> > if (is_kretprobe_handler_context()) {
> >   // kretprobe_hash_lock(current == tsk) has been locked by caller  
> >   if (tsk != current && kretprobe_hash(tsk) != kretprobe_hash(current))
> >     // the hash of tsk and current can be same.
> >     need_lock = true;
> > } else
> >   // we should take a lock for tsk.
> >   need_lock = true;
> 
> -- 
> Aleksa Sarai
> Senior Software Engineer (Containers)
> SUSE Linux GmbH
> <https://www.cyphar.com/>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

WARNING: multiple messages have this Message-ID (diff)
From: Masami Hiramatsu <mhiramat@kernel.org>
To: Aleksa Sarai <asarai@suse.de>
Cc: Aleksa Sarai <cyphar@cyphar.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>,
	Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jonathan Corbet <corbet@lwn.net>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Shuah Khan <shuah@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Brendan Gregg <bgregg@netflix.com>,
	Christian Brauner <christian@brauner.io>,
	netdev@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel
Subject: Re: [PATCH v3 1/2] kretprobe: produce sane stack traces
Date: Sun, 11 Nov 2018 00:31:37 +0900	[thread overview]
Message-ID: <20181111003137.c9df7a077d983cde57c06ee8@kernel.org> (raw)
In-Reply-To: <20181109150629.wpedwxsgbftkl3ab@mikami>

On Sat, 10 Nov 2018 02:06:29 +1100
Aleksa Sarai <asarai@suse.de> wrote:

> On 2018-11-09, Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
> > > index ee696efec99f..c4dfafd43e11 100644
> > > --- a/arch/x86/include/asm/ptrace.h
> > > +++ b/arch/x86/include/asm/ptrace.h
> > > @@ -172,6 +172,7 @@ static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
> > >  	return regs->sp;
> > >  }
> > >  #endif
> > > +#define stack_addr(regs) ((unsigned long *) kernel_stack_pointer(regs))
> > 
> > No, you should use kernel_stack_pointer(regs) itself instead of stack_addr().
> > 
> > > 
> > >  #define GET_IP(regs) ((regs)->ip)
> > >  #define GET_FP(regs) ((regs)->bp)
> > > diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> > > index b0d1e81c96bb..eb4da885020c 100644
> > > --- a/arch/x86/kernel/kprobes/core.c
> > > +++ b/arch/x86/kernel/kprobes/core.c
> > > @@ -69,8 +69,6 @@
> > >  DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
> > >  DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
> > >  
> > > -#define stack_addr(regs) ((unsigned long *)kernel_stack_pointer(regs))
> > 
> > I don't like keeping this meaningless macro... this should be replaced with generic
> > kernel_stack_pointer() macro.
> 
> Sure. This patch was just an example -- I can remove stack_addr() all
> over.
> 
> > > -	if (regs)
> > > -		save_stack_address(trace, regs->ip, nosched);
> > > +	if (regs) {
> > > +		/* XXX: Currently broken -- stack_addr(regs) doesn't match entry. */
> > > +		addr = regs->ip;
> > 
> > Since this part is for storing regs->ip as a top of call-stack, this
> > seems correct code. Stack unwind will be done next block.
> 
> This comment was referring to the usage of stack_addr(). stack_addr()
> doesn't give you the right result (it isn't the address of the return
> address -- it's slightly wrong). This is the main issue I was having --
> am I doing something wrong here?

Of course stack_addr() actually just returns where the stack is. It should
not return address, but maybe a return address from this event happens.
Note that the "regs != NULL" means you will be in the interrupt handler
and it will be returned to the regs->ip.


> > > +		//addr = ftrace_graph_ret_addr(current, &state.graph_idx, addr, stack_addr(regs));
> > 
> > so func graph return trampoline address will be shown only when unwinding stack entries.
> > I mean func-graph tracer is not used as an event, so it never kicks stackdump.
> 
> Just to make sure I understand what you're saying -- func-graph trace
> will never actually call __ftrace_stack_trace? Because if it does, then
> this code will be necessary (and then I'm a bit confused why the
> unwinder has func-graph trace code -- if stack traces are never taken
> under func-graph then the code in the unwinder is not necessary)

You seems misunderstanding. Even if this is not called from func-graph
tracer, the stack entries are already replaced with func-graph trampoline.
However, regs->ip (IRQ return address) is never replaced by the func-graph
trampoline.

> My reason for commenting this out is because at this point "state" isn't
> initialised and thus .graph_idx would not be correctly handled during
> unwind (and it's the same reason I commented it out later).

OK, but anyway, I think we don't need it.

> > > +		addr = kretprobe_ret_addr(current, addr, stack_addr(regs));
> > 
> > But since kretprobe will be an event, which can kick the stackdump.
> > BTW, from kretprobe, regs->ip should always be the trampoline handler, 
> > see arch/x86/kernel/kprobes/core.c:772 :-)
> > So it must be fixed always.
> 
> Right, but kretprobe_ret_addr() is returning the *original* return
> address (and we need to do an (addr == kretprobe_trampoline)). The
> real problem is that stack_addr(regs) isn't the same as it is during
> kretprobe setup (but kretprobe_ret_addr() works everywhere else).

I think stack_addr(regs) should be same when this is called from kretprobe
handler context. Otherwise, yes, it is not same, but in that case, regs->ip
is not kretprobe_trampoline too.

If you find kretprobe_trampoline on the "stack", of course it's address should be
same as it is during kretprobe setup, but if you find kretprobe_trampoline on the
regs->ip, that should always happen on kretprobe handler context. Otherwise,
some critical violation happens on kretprobe_trampoline. In that case, we should
dump the kretprobe_trampoline address itself, should not recover it.

> > > @@ -1856,6 +1870,41 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
> > >  }
> > >  NOKPROBE_SYMBOL(pre_handler_kretprobe);
> > >  
> > > +unsigned long kretprobe_ret_addr(struct task_struct *tsk, unsigned long ret,
> > > +				 unsigned long *retp)
> > > +{
> > > +	struct kretprobe_instance *ri;
> > > +	unsigned long flags = 0;
> > > +	struct hlist_head *head;
> > > +	bool need_lock;
> > > +
> > > +	if (likely(ret != (unsigned long) &kretprobe_trampoline))
> > > +		return ret;
> > > +
> > > +	need_lock = !kretprobe_hash_is_locked(tsk);
> > > +	if (WARN_ON(need_lock))
> > > +		kretprobe_hash_lock(tsk, &head, &flags);
> > > +	else
> > > +		head = kretprobe_inst_table_head(tsk);
> > 
> > This may not work unless this is called from the kretprobe handler context,
> > since if we are out of kretprobe handler context, another CPU can lock the
> > hash table and it can be detected by kretprobe_hash_is_locked();.
> 
> Yeah, I noticed this as well when writing it (but needed a quick impl
> that I could test). I will fix this, thanks!
> 
> By is_kretprobe_handler_context() I imagine you are referring to
> checking is_kretprobe(current_kprobe())?

yes, that's correct :)

Thank you,

> 
> > So, we should check we are in the kretprobe handler context if tsk == current,
> > if not, we definately can lock the hash lock without any warning. This can
> > be something like;
> > 
> > if (is_kretprobe_handler_context()) {
> >   // kretprobe_hash_lock(current == tsk) has been locked by caller  
> >   if (tsk != current && kretprobe_hash(tsk) != kretprobe_hash(current))
> >     // the hash of tsk and current can be same.
> >     need_lock = true;
> > } else
> >   // we should take a lock for tsk.
> >   need_lock = true;
> 
> -- 
> Aleksa Sarai
> Senior Software Engineer (Containers)
> SUSE Linux GmbH
> <https://www.cyphar.com/>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

WARNING: multiple messages have this Message-ID (diff)
From: mhiramat at kernel.org (Masami Hiramatsu)
Subject: [PATCH v3 1/2] kretprobe: produce sane stack traces
Date: Sun, 11 Nov 2018 00:31:37 +0900	[thread overview]
Message-ID: <20181111003137.c9df7a077d983cde57c06ee8@kernel.org> (raw)
In-Reply-To: <20181109150629.wpedwxsgbftkl3ab@mikami>

On Sat, 10 Nov 2018 02:06:29 +1100
Aleksa Sarai <asarai at suse.de> wrote:

> On 2018-11-09, Masami Hiramatsu <mhiramat at kernel.org> wrote:
> > > diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
> > > index ee696efec99f..c4dfafd43e11 100644
> > > --- a/arch/x86/include/asm/ptrace.h
> > > +++ b/arch/x86/include/asm/ptrace.h
> > > @@ -172,6 +172,7 @@ static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
> > >  	return regs->sp;
> > >  }
> > >  #endif
> > > +#define stack_addr(regs) ((unsigned long *) kernel_stack_pointer(regs))
> > 
> > No, you should use kernel_stack_pointer(regs) itself instead of stack_addr().
> > 
> > > 
> > >  #define GET_IP(regs) ((regs)->ip)
> > >  #define GET_FP(regs) ((regs)->bp)
> > > diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> > > index b0d1e81c96bb..eb4da885020c 100644
> > > --- a/arch/x86/kernel/kprobes/core.c
> > > +++ b/arch/x86/kernel/kprobes/core.c
> > > @@ -69,8 +69,6 @@
> > >  DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
> > >  DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
> > >  
> > > -#define stack_addr(regs) ((unsigned long *)kernel_stack_pointer(regs))
> > 
> > I don't like keeping this meaningless macro... this should be replaced with generic
> > kernel_stack_pointer() macro.
> 
> Sure. This patch was just an example -- I can remove stack_addr() all
> over.
> 
> > > -	if (regs)
> > > -		save_stack_address(trace, regs->ip, nosched);
> > > +	if (regs) {
> > > +		/* XXX: Currently broken -- stack_addr(regs) doesn't match entry. */
> > > +		addr = regs->ip;
> > 
> > Since this part is for storing regs->ip as a top of call-stack, this
> > seems correct code. Stack unwind will be done next block.
> 
> This comment was referring to the usage of stack_addr(). stack_addr()
> doesn't give you the right result (it isn't the address of the return
> address -- it's slightly wrong). This is the main issue I was having --
> am I doing something wrong here?

Of course stack_addr() actually just returns where the stack is. It should
not return address, but maybe a return address from this event happens.
Note that the "regs != NULL" means you will be in the interrupt handler
and it will be returned to the regs->ip.


> > > +		//addr = ftrace_graph_ret_addr(current, &state.graph_idx, addr, stack_addr(regs));
> > 
> > so func graph return trampoline address will be shown only when unwinding stack entries.
> > I mean func-graph tracer is not used as an event, so it never kicks stackdump.
> 
> Just to make sure I understand what you're saying -- func-graph trace
> will never actually call __ftrace_stack_trace? Because if it does, then
> this code will be necessary (and then I'm a bit confused why the
> unwinder has func-graph trace code -- if stack traces are never taken
> under func-graph then the code in the unwinder is not necessary)

You seems misunderstanding. Even if this is not called from func-graph
tracer, the stack entries are already replaced with func-graph trampoline.
However, regs->ip (IRQ return address) is never replaced by the func-graph
trampoline.

> My reason for commenting this out is because at this point "state" isn't
> initialised and thus .graph_idx would not be correctly handled during
> unwind (and it's the same reason I commented it out later).

OK, but anyway, I think we don't need it.

> > > +		addr = kretprobe_ret_addr(current, addr, stack_addr(regs));
> > 
> > But since kretprobe will be an event, which can kick the stackdump.
> > BTW, from kretprobe, regs->ip should always be the trampoline handler, 
> > see arch/x86/kernel/kprobes/core.c:772 :-)
> > So it must be fixed always.
> 
> Right, but kretprobe_ret_addr() is returning the *original* return
> address (and we need to do an (addr == kretprobe_trampoline)). The
> real problem is that stack_addr(regs) isn't the same as it is during
> kretprobe setup (but kretprobe_ret_addr() works everywhere else).

I think stack_addr(regs) should be same when this is called from kretprobe
handler context. Otherwise, yes, it is not same, but in that case, regs->ip
is not kretprobe_trampoline too.

If you find kretprobe_trampoline on the "stack", of course it's address should be
same as it is during kretprobe setup, but if you find kretprobe_trampoline on the
regs->ip, that should always happen on kretprobe handler context. Otherwise,
some critical violation happens on kretprobe_trampoline. In that case, we should
dump the kretprobe_trampoline address itself, should not recover it.

> > > @@ -1856,6 +1870,41 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
> > >  }
> > >  NOKPROBE_SYMBOL(pre_handler_kretprobe);
> > >  
> > > +unsigned long kretprobe_ret_addr(struct task_struct *tsk, unsigned long ret,
> > > +				 unsigned long *retp)
> > > +{
> > > +	struct kretprobe_instance *ri;
> > > +	unsigned long flags = 0;
> > > +	struct hlist_head *head;
> > > +	bool need_lock;
> > > +
> > > +	if (likely(ret != (unsigned long) &kretprobe_trampoline))
> > > +		return ret;
> > > +
> > > +	need_lock = !kretprobe_hash_is_locked(tsk);
> > > +	if (WARN_ON(need_lock))
> > > +		kretprobe_hash_lock(tsk, &head, &flags);
> > > +	else
> > > +		head = kretprobe_inst_table_head(tsk);
> > 
> > This may not work unless this is called from the kretprobe handler context,
> > since if we are out of kretprobe handler context, another CPU can lock the
> > hash table and it can be detected by kretprobe_hash_is_locked();.
> 
> Yeah, I noticed this as well when writing it (but needed a quick impl
> that I could test). I will fix this, thanks!
> 
> By is_kretprobe_handler_context() I imagine you are referring to
> checking is_kretprobe(current_kprobe())?

yes, that's correct :)

Thank you,

> 
> > So, we should check we are in the kretprobe handler context if tsk == current,
> > if not, we definately can lock the hash lock without any warning. This can
> > be something like;
> > 
> > if (is_kretprobe_handler_context()) {
> >   // kretprobe_hash_lock(current == tsk) has been locked by caller  
> >   if (tsk != current && kretprobe_hash(tsk) != kretprobe_hash(current))
> >     // the hash of tsk and current can be same.
> >     need_lock = true;
> > } else
> >   // we should take a lock for tsk.
> >   need_lock = true;
> 
> -- 
> Aleksa Sarai
> Senior Software Engineer (Containers)
> SUSE Linux GmbH
> <https://www.cyphar.com/>


-- 
Masami Hiramatsu <mhiramat at kernel.org>

WARNING: multiple messages have this Message-ID (diff)
From: mhiramat@kernel.org (Masami Hiramatsu)
Subject: [PATCH v3 1/2] kretprobe: produce sane stack traces
Date: Sun, 11 Nov 2018 00:31:37 +0900	[thread overview]
Message-ID: <20181111003137.c9df7a077d983cde57c06ee8@kernel.org> (raw)
Message-ID: <20181110153137.UAqKWcEAfIDdBm3ktencBhj7VQjpRdoGZFJcyW1cEZI@z> (raw)
In-Reply-To: <20181109150629.wpedwxsgbftkl3ab@mikami>

On Sat, 10 Nov 2018 02:06:29 +1100
Aleksa Sarai <asarai@suse.de> wrote:

> On 2018-11-09, Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
> > > index ee696efec99f..c4dfafd43e11 100644
> > > --- a/arch/x86/include/asm/ptrace.h
> > > +++ b/arch/x86/include/asm/ptrace.h
> > > @@ -172,6 +172,7 @@ static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
> > >  	return regs->sp;
> > >  }
> > >  #endif
> > > +#define stack_addr(regs) ((unsigned long *) kernel_stack_pointer(regs))
> > 
> > No, you should use kernel_stack_pointer(regs) itself instead of stack_addr().
> > 
> > > 
> > >  #define GET_IP(regs) ((regs)->ip)
> > >  #define GET_FP(regs) ((regs)->bp)
> > > diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> > > index b0d1e81c96bb..eb4da885020c 100644
> > > --- a/arch/x86/kernel/kprobes/core.c
> > > +++ b/arch/x86/kernel/kprobes/core.c
> > > @@ -69,8 +69,6 @@
> > >  DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
> > >  DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
> > >  
> > > -#define stack_addr(regs) ((unsigned long *)kernel_stack_pointer(regs))
> > 
> > I don't like keeping this meaningless macro... this should be replaced with generic
> > kernel_stack_pointer() macro.
> 
> Sure. This patch was just an example -- I can remove stack_addr() all
> over.
> 
> > > -	if (regs)
> > > -		save_stack_address(trace, regs->ip, nosched);
> > > +	if (regs) {
> > > +		/* XXX: Currently broken -- stack_addr(regs) doesn't match entry. */
> > > +		addr = regs->ip;
> > 
> > Since this part is for storing regs->ip as a top of call-stack, this
> > seems correct code. Stack unwind will be done next block.
> 
> This comment was referring to the usage of stack_addr(). stack_addr()
> doesn't give you the right result (it isn't the address of the return
> address -- it's slightly wrong). This is the main issue I was having --
> am I doing something wrong here?

Of course stack_addr() actually just returns where the stack is. It should
not return address, but maybe a return address from this event happens.
Note that the "regs != NULL" means you will be in the interrupt handler
and it will be returned to the regs->ip.


> > > +		//addr = ftrace_graph_ret_addr(current, &state.graph_idx, addr, stack_addr(regs));
> > 
> > so func graph return trampoline address will be shown only when unwinding stack entries.
> > I mean func-graph tracer is not used as an event, so it never kicks stackdump.
> 
> Just to make sure I understand what you're saying -- func-graph trace
> will never actually call __ftrace_stack_trace? Because if it does, then
> this code will be necessary (and then I'm a bit confused why the
> unwinder has func-graph trace code -- if stack traces are never taken
> under func-graph then the code in the unwinder is not necessary)

You seems misunderstanding. Even if this is not called from func-graph
tracer, the stack entries are already replaced with func-graph trampoline.
However, regs->ip (IRQ return address) is never replaced by the func-graph
trampoline.

> My reason for commenting this out is because at this point "state" isn't
> initialised and thus .graph_idx would not be correctly handled during
> unwind (and it's the same reason I commented it out later).

OK, but anyway, I think we don't need it.

> > > +		addr = kretprobe_ret_addr(current, addr, stack_addr(regs));
> > 
> > But since kretprobe will be an event, which can kick the stackdump.
> > BTW, from kretprobe, regs->ip should always be the trampoline handler, 
> > see arch/x86/kernel/kprobes/core.c:772 :-)
> > So it must be fixed always.
> 
> Right, but kretprobe_ret_addr() is returning the *original* return
> address (and we need to do an (addr == kretprobe_trampoline)). The
> real problem is that stack_addr(regs) isn't the same as it is during
> kretprobe setup (but kretprobe_ret_addr() works everywhere else).

I think stack_addr(regs) should be same when this is called from kretprobe
handler context. Otherwise, yes, it is not same, but in that case, regs->ip
is not kretprobe_trampoline too.

If you find kretprobe_trampoline on the "stack", of course it's address should be
same as it is during kretprobe setup, but if you find kretprobe_trampoline on the
regs->ip, that should always happen on kretprobe handler context. Otherwise,
some critical violation happens on kretprobe_trampoline. In that case, we should
dump the kretprobe_trampoline address itself, should not recover it.

> > > @@ -1856,6 +1870,41 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
> > >  }
> > >  NOKPROBE_SYMBOL(pre_handler_kretprobe);
> > >  
> > > +unsigned long kretprobe_ret_addr(struct task_struct *tsk, unsigned long ret,
> > > +				 unsigned long *retp)
> > > +{
> > > +	struct kretprobe_instance *ri;
> > > +	unsigned long flags = 0;
> > > +	struct hlist_head *head;
> > > +	bool need_lock;
> > > +
> > > +	if (likely(ret != (unsigned long) &kretprobe_trampoline))
> > > +		return ret;
> > > +
> > > +	need_lock = !kretprobe_hash_is_locked(tsk);
> > > +	if (WARN_ON(need_lock))
> > > +		kretprobe_hash_lock(tsk, &head, &flags);
> > > +	else
> > > +		head = kretprobe_inst_table_head(tsk);
> > 
> > This may not work unless this is called from the kretprobe handler context,
> > since if we are out of kretprobe handler context, another CPU can lock the
> > hash table and it can be detected by kretprobe_hash_is_locked();.
> 
> Yeah, I noticed this as well when writing it (but needed a quick impl
> that I could test). I will fix this, thanks!
> 
> By is_kretprobe_handler_context() I imagine you are referring to
> checking is_kretprobe(current_kprobe())?

yes, that's correct :)

Thank you,

> 
> > So, we should check we are in the kretprobe handler context if tsk == current,
> > if not, we definately can lock the hash lock without any warning. This can
> > be something like;
> > 
> > if (is_kretprobe_handler_context()) {
> >   // kretprobe_hash_lock(current == tsk) has been locked by caller  
> >   if (tsk != current && kretprobe_hash(tsk) != kretprobe_hash(current))
> >     // the hash of tsk and current can be same.
> >     need_lock = true;
> > } else
> >   // we should take a lock for tsk.
> >   need_lock = true;
> 
> -- 
> Aleksa Sarai
> Senior Software Engineer (Containers)
> SUSE Linux GmbH
> <https://www.cyphar.com/>


-- 
Masami Hiramatsu <mhiramat at kernel.org>

  reply	other threads:[~2018-11-10 15:31 UTC|newest]

Thread overview: 137+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-01  8:35 [PATCH v3 0/2] kretprobe: produce sane stack traces Aleksa Sarai
2018-11-01  8:35 ` Aleksa Sarai
2018-11-01  8:35 ` cyphar
2018-11-01  8:35 ` [PATCH v3 1/2] " Aleksa Sarai
2018-11-01  8:35   ` Aleksa Sarai
2018-11-01  8:35   ` cyphar
2018-11-01 15:20   ` Masami Hiramatsu
2018-11-01 15:20     ` Masami Hiramatsu
2018-11-01 15:20     ` mhiramat
2018-11-01 15:20     ` Masami Hiramatsu
2018-11-01 21:13     ` Aleksa Sarai
2018-11-01 21:13       ` Aleksa Sarai
2018-11-01 21:13       ` cyphar
2018-11-01 21:13       ` Aleksa Sarai
2018-11-02  3:04       ` Masami Hiramatsu
2018-11-02  3:04         ` Masami Hiramatsu
2018-11-02  3:04         ` mhiramat
2018-11-02  3:04         ` Masami Hiramatsu
2018-11-02  4:37         ` Aleksa Sarai
2018-11-02  4:37           ` Aleksa Sarai
2018-11-02  4:37           ` cyphar
2018-11-02  4:37           ` Aleksa Sarai
2018-11-03 12:47           ` Masami Hiramatsu
2018-11-03 12:47             ` Masami Hiramatsu
2018-11-03 12:47             ` mhiramat
2018-11-03 12:47             ` Masami Hiramatsu
2018-11-02  0:47   ` Steven Rostedt
2018-11-02  0:47     ` Steven Rostedt
2018-11-02  0:47     ` rostedt
2018-11-02  0:47     ` Steven Rostedt
2018-11-02  5:05     ` Aleksa Sarai
2018-11-02  5:05       ` Aleksa Sarai
2018-11-02  5:05       ` cyphar
2018-11-02  5:05       ` Aleksa Sarai
2018-11-02  6:59       ` Aleksa Sarai
2018-11-02  6:59         ` Aleksa Sarai
2018-11-02  6:59         ` cyphar
2018-11-02  6:59         ` Aleksa Sarai
2018-11-02 13:16         ` Steven Rostedt
2018-11-02 13:16           ` Steven Rostedt
2018-11-02 13:16           ` rostedt
2018-11-02 13:16           ` Steven Rostedt
2018-11-02 15:43           ` Josh Poimboeuf
2018-11-02 15:43             ` Josh Poimboeuf
2018-11-02 15:43             ` jpoimboe
2018-11-02 15:43             ` Josh Poimboeuf
2018-11-02 16:13             ` Steven Rostedt
2018-11-02 16:13               ` Steven Rostedt
2018-11-02 16:13               ` rostedt
2018-11-02 16:13               ` Steven Rostedt
2018-11-03 13:00               ` Masami Hiramatsu
2018-11-03 13:00                 ` Masami Hiramatsu
2018-11-03 13:00                 ` mhiramat
2018-11-03 13:00                 ` Masami Hiramatsu
2018-11-03 13:13                 ` Steven Rostedt
2018-11-03 13:13                   ` Steven Rostedt
2018-11-03 13:13                   ` rostedt
2018-11-03 13:13                   ` Steven Rostedt
2018-11-03 16:34                   ` Masami Hiramatsu
2018-11-03 16:34                     ` Masami Hiramatsu
2018-11-03 16:34                     ` mhiramat
2018-11-03 16:34                     ` Masami Hiramatsu
2018-11-03 17:30                     ` Steven Rostedt
2018-11-03 17:30                       ` Steven Rostedt
2018-11-03 17:30                       ` rostedt
2018-11-03 17:30                       ` Steven Rostedt
2018-11-03 17:33                       ` Steven Rostedt
2018-11-03 17:33                         ` Steven Rostedt
2018-11-03 17:33                         ` rostedt
2018-11-03 17:33                         ` Steven Rostedt
2018-11-04  2:25                       ` Masami Hiramatsu
2018-11-04  2:25                         ` Masami Hiramatsu
2018-11-04  2:25                         ` mhiramat
2018-11-04  2:25                         ` Masami Hiramatsu
2018-11-03  7:02           ` Aleksa Sarai
2018-11-03  7:02             ` Aleksa Sarai
2018-11-03  7:02             ` cyphar
2018-11-03  7:02             ` Aleksa Sarai
2018-11-04 11:59             ` Aleksa Sarai
2018-11-04 11:59               ` Aleksa Sarai
2018-11-04 11:59               ` cyphar
2018-11-04 11:59               ` Aleksa Sarai
2018-11-06 22:15               ` Steven Rostedt
2018-11-06 22:15                 ` Steven Rostedt
2018-11-06 22:15                 ` rostedt
2018-11-06 22:15                 ` Steven Rostedt
2018-11-08  7:46                 ` Aleksa Sarai
2018-11-08  7:46                   ` Aleksa Sarai
2018-11-08  7:46                   ` cyphar
2018-11-08  7:46                   ` Aleksa Sarai
2018-11-08  8:04                   ` Aleksa Sarai
2018-11-08  8:04                     ` Aleksa Sarai
2018-11-08  8:04                     ` cyphar
2018-11-08  8:04                     ` Aleksa Sarai
2018-11-08 14:44                     ` Josh Poimboeuf
2018-11-08 14:44                       ` Josh Poimboeuf
2018-11-08 14:44                       ` jpoimboe
2018-11-08 14:44                       ` Josh Poimboeuf
2018-11-09  7:26                       ` Masami Hiramatsu
2018-11-09  7:26                         ` Masami Hiramatsu
2018-11-09  7:26                         ` mhiramat
2018-11-09  7:26                         ` Masami Hiramatsu
2018-11-09 15:10                         ` Aleksa Sarai
2018-11-09 15:10                           ` Aleksa Sarai
2018-11-09 15:10                           ` asarai
2018-11-09 15:10                           ` Aleksa Sarai
2018-11-09  7:15                     ` Masami Hiramatsu
2018-11-09  7:15                       ` Masami Hiramatsu
2018-11-09  7:15                       ` mhiramat
2018-11-09  7:15                       ` Masami Hiramatsu
2018-11-09 15:06                       ` Aleksa Sarai
2018-11-09 15:06                         ` Aleksa Sarai
2018-11-09 15:06                         ` asarai
2018-11-09 15:06                         ` Aleksa Sarai
2018-11-10 15:31                         ` Masami Hiramatsu [this message]
2018-11-10 15:31                           ` Masami Hiramatsu
2018-11-10 15:31                           ` mhiramat
2018-11-10 15:31                           ` Masami Hiramatsu
2018-11-12 10:38                           ` Aleksa Sarai
2018-11-12 10:38                             ` Aleksa Sarai
2018-11-12 10:38                             ` asarai
2018-11-12 10:38                             ` Aleksa Sarai
2018-11-03 13:23           ` Masami Hiramatsu
2018-11-03 13:23             ` Masami Hiramatsu
2018-11-03 13:23             ` mhiramat
2018-11-03 13:23             ` Masami Hiramatsu
2018-11-02  7:58       ` Aleksa Sarai
2018-11-02  7:58         ` Aleksa Sarai
2018-11-02  7:58         ` cyphar
2018-11-02  7:58         ` Aleksa Sarai
2018-11-02  4:01   ` kbuild test robot
2018-11-02  4:01     ` kbuild test robot
2018-11-02  4:01     ` lkp
2018-11-02  4:01     ` kbuild test robot
2018-11-01  8:35 ` [PATCH v3 2/2] trace: remove kretprobed checks Aleksa Sarai
2018-11-01  8:35   ` Aleksa Sarai
2018-11-01  8:35   ` cyphar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181111003137.c9df7a077d983cde57c06ee8@kernel.org \
    --to=mhiramat@kernel.org \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=anil.s.keshavamurthy@intel.com \
    --cc=asarai@suse.de \
    --cc=ast@kernel.org \
    --cc=bgregg@netflix.com \
    --cc=christian@brauner.io \
    --cc=corbet@lwn.net \
    --cc=cyphar@cyphar.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=jolsa@redhat.com \
    --cc=jpoimboe@redhat.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=naveen.n.rao@linux.vnet.ibm.com \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=shuah@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.