linux-arm-kernel.lists.infradead.org archive mirror
* [RFC 0/3] arm64: ftrace: fix incorrect output from stack tracer
@ 2015-07-13  5:29 AKASHI Takahiro
  2015-07-13  5:29 ` [RFC 1/3] ftrace: adjust a function's pc to search for in check_stack() for arm64 AKASHI Takahiro
                   ` (3 more replies)
  0 siblings, 4 replies; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-13  5:29 UTC (permalink / raw)
  To: linux-arm-kernel

As reported in the thread below[1], the output from the stack tracer using
ftrace on arm64 seems to be incorrect for several different reasons. Each
problem is described and fixed respectively in the following patches.
Please see the commit messages for the details.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/354126.html

If patch[1/3], which adds "#ifdef CONFIG_ARM64" to generic ftrace code,
is not acceptable, we will have to introduce an arch-dependent function,
i.e. arch_check_stack().

Even with those patches, we see another issue: the values in the 'Size'
field are *inaccurate*. This is simply because we have no way to estimate
the value of the stack pointer at each function from the contents of the
stack. Thus a function's reported stack size does not include its own local
variables, but includes *its child's*.
See more details below.

In my opinion, we cannot fix this issue unless we analyze the function's
first instruction, i.e. "stp x29, x30, [sp, #xx]!".
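
(For illustration only: below is a minimal userspace sketch of how the SP
adjustment could be recovered by decoding that first instruction. The
function name is made up for this sketch, and the encoding mask is my
reading of the A64 "STP (pre-indexed), 64-bit" encoding; this is not code
proposed for the kernel.)

#include <stdio.h>
#include <stdint.h>

/* Return the SP adjustment (in bytes) encoded in
 * "stp x29, x30, [sp, #imm]!", or 0 if insn is not that instruction.
 */
static long stp_preindex_sp_offset(uint32_t insn)
{
	long imm7;

	/* STP, 64-bit registers, pre-indexed: bits [31:22] == 1010100110 */
	if ((insn & 0xffc00000U) != 0xa9800000U)
		return 0;
	/* Rt == x29, Rt2 == x30, Rn == sp */
	if ((insn & 0x1f) != 29 || ((insn >> 10) & 0x1f) != 30 ||
	    ((insn >> 5) & 0x1f) != 31)
		return 0;

	imm7 = (insn >> 15) & 0x7f;	/* signed 7-bit immediate */
	if (imm7 & 0x40)
		imm7 -= 0x80;
	return imm7 * 8;		/* scaled by 8 for 64-bit registers */
}

int main(void)
{
	/* 0xa9bf7bfd encodes "stp x29, x30, [sp, #-16]!" */
	printf("sp offset: %ld\n", stp_preindex_sp_offset(0xa9bf7bfd));
	return 0;
}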

* How does stack tracer identify each function's stack size?

Take an example: func-0 calls func-1, and func-1 calls func-2.
The stack layout looks like the one below:
("p" is a temp variable in check_stack().)

sp2     +-------+ <--------- func-2's stackframe
        |       |
        |       |
fp2     +-------+
        |  fp1  |
        +-------+ <-- p1 (*p1 == stack_dump_trace[i] == lr1)
        |  lr1  |
        +-------+
        |       |
        |  func-2's local variables
        |       |
sp1     +-------+ <--------- func-1(lr1)'s stackframe
        |       |             (stack_dump_index[i] = top - p1)
        |  func-1's dynamic local variables
        |       |
fp1     +-------+
        |  fp0  |
        +-------+ <-- p0 (*p0 == stack_dump_trace[i+1] == lr0)
        |  lr0  |
        +-------+
        |       |
        |  func-1's local variables
        |       |
sp0     +-------+ <--------- func-0(lr0)'s stackframe
        |       |             (stack_dump_index[i+1] = top - p0)
        |       |
        *-------+ top

Stack tracer records the stack height of func-1 (== stack_dump_trace[i]):
	stack_dump_index[i] = <stack top> - <estimated stack pointer>
in check_stack() by searching for func-1's return address (lr1)
and eventually calculates func-1's stack size by:
	stack_dump_index[i] - stack_dump_index[i+1]
		=> (top - p1) - (top - p0)
		=> p0 - p1

On x86, this calculation is correct because x86's call instruction pushes
the return address onto the stack and jumps into the child (func-2) function;
thus func-1's stack pointer is "p1", where *p1 is equal to
stack_dump_trace[i]. But this is not true on arm64.
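
(Purely illustrative: a tiny userspace model of the arithmetic above, using
made-up byte addresses. In the kernel the indices are computed from word
pointers as (top - p) * sizeof(unsigned long); plain byte addresses are
used here only to keep the example short.)

#include <stdio.h>

int main(void)
{
	/* made-up byte addresses matching the picture: p1 < p0 < top */
	unsigned long top = 0x1000;	/* high end of the stack area     */
	unsigned long p1  = 0x0e40;	/* where lr1 was found (func-1)   */
	unsigned long p0  = 0x0f20;	/* where lr0 was found (func-0)   */

	unsigned long index_i  = top - p1;	/* stack_dump_index[i]   */
	unsigned long index_i1 = top - p0;	/* stack_dump_index[i+1] */

	/* reported 'Size' of func-1: (top - p1) - (top - p0) == p0 - p1 */
	printf("func-1 size: %lu bytes\n", index_i - index_i1);
	return 0;
}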

AKASHI Takahiro (3):
  ftrace: adjust a function's pc to search for in check_stack() for
    arm64
  arm64: refactor save_stack_trace()
  arm64: ftrace: mcount() should not create a stack frame

 arch/arm64/kernel/entry-ftrace.S |   15 ++++++++-------
 arch/arm64/kernel/stacktrace.c   |   31 +++++++++++++++++++++++--------
 kernel/trace/trace_stack.c       |    4 ++++
 3 files changed, 35 insertions(+), 15 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 1/3] ftrace: adjust a function's pc to search for in check_stack() for arm64
  2015-07-13  5:29 [RFC 0/3] arm64: ftrace: fix incorrect output from stack tracer AKASHI Takahiro
@ 2015-07-13  5:29 ` AKASHI Takahiro
  2015-07-13 15:24   ` Steven Rostedt
  2015-07-13  5:29 ` [RFC 2/3] arm64: refactor save_stack_trace() AKASHI Takahiro
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-13  5:29 UTC (permalink / raw)
  To: linux-arm-kernel

Ftrace's stack tracer on arm64 returns wrong information about call stacks:

        Depth    Size   Location    (50 entries)
        -----    ----   --------
  0)     5256       0   notifier_call_chain+0x30/0x94
  1)     5256       0   ftrace_call+0x0/0x4
  2)     5256       0   notifier_call_chain+0x2c/0x94
  3)     5256       0   raw_notifier_call_chain+0x34/0x44
  4)     5256       0   timekeeping_update.constprop.9+0xb8/0x114
  5)     5256       0   update_wall_time+0x408/0x6dc

Most of the 'Size' fields are unexpectedly zero.

This is because the stack tracer fails to recognize each function's stack
frame in check_stack(). The stack tracer searches the stack for a function's
pc based on the list returned by save_stack_trace(), but save_stack_trace()
on arm64 does not return the exact return address saved in a stack frame;
it returns a value decremented by 4 (i.e. the address of the branch
instruction itself). This behavior was introduced by
    commit e306dfd06fcb ("ARM64: unwind: Fix PC calculation")

So the matching doesn't succeed in most cases.
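
(For reference, the sketch below is a paraphrased, self-contained model of
what the arm64 frame walker does after that commit; the names are
illustrative and this is not verbatim kernel code. The frame record layout,
a {fp, lr} pair pointed to by x29, follows the convention the kernel uses.)

#include <stdio.h>
#include <stdint.h>

struct frame {
	uint64_t fp;
	uint64_t sp;
	uint64_t pc;
};

/* Step from one frame record ({fp, lr} pair at f->fp) to its caller.
 * A real unwinder must validate fp against the stack bounds first.
 */
static void unwind_next(struct frame *f)
{
	const uint64_t *record = (const uint64_t *)(uintptr_t)f->fp;

	f->sp = f->fp + 0x10;
	f->fp = record[0];	/* caller's frame pointer */
	f->pc = record[1] - 4;	/* saved lr, biased back by 4 so that it
				 * points at the bl instruction itself */
}

int main(void)
{
	uint64_t fake_record[2] = { 0, 0xffffffc000123458ULL };  /* {fp, lr} */
	struct frame f = { .fp = (uint64_t)(uintptr_t)fake_record };

	unwind_next(&f);
	/* the reported pc differs from the value actually stored on the
	 * stack, which is what check_stack() searches for */
	printf("lr saved on the stack: %#llx\n",
	       (unsigned long long)fake_record[1]);
	printf("pc reported by unwind: %#llx\n",
	       (unsigned long long)f.pc);
	return 0;
}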

This problem can be fixed either by
a) reverting the commit above
b) adding an arm64-specific hack to check_stack()

This patch does b).

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 kernel/trace/trace_stack.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 3f34496..7086fc3 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -143,7 +143,11 @@ check_stack(unsigned long ip, unsigned long *stack)
 		p = start;
 
 		for (; p < top && i < max_stack_trace.nr_entries; p++) {
+#ifdef CONFIG_ARM64
+			if (*p == (stack_dump_trace[i] + 4)) {
+#else
 			if (*p == stack_dump_trace[i]) {
+#endif
 				this_size = stack_dump_index[i++] =
 					(top - p) * sizeof(unsigned long);
 				found = 1;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-13  5:29 [RFC 0/3] arm64: ftrace: fix incorrect output from stack tracer AKASHI Takahiro
  2015-07-13  5:29 ` [RFC 1/3] ftrace: adjust a function's pc to search for in check_stack() for arm64 AKASHI Takahiro
@ 2015-07-13  5:29 ` AKASHI Takahiro
  2015-07-14 12:47   ` Jungseok Lee
  2015-07-13  5:29 ` [RFC 3/3] arm64: ftrace: mcount() should not create a stack frame AKASHI Takahiro
  2015-07-13 15:01 ` [RFC 0/3] arm64: ftrace: fix incorrect output from stack tracer Jungseok Lee
  3 siblings, 1 reply; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-13  5:29 UTC (permalink / raw)
  To: linux-arm-kernel

Ftrace's stack tracer on arm64 returns wrong information about call stacks:

        Depth    Size   Location    (50 entries)
        -----    ----   --------
  0)     5256       0   notifier_call_chain+0x30/0x94
  1)     5256       0   ftrace_call+0x0/0x4
  2)     5256       0   notifier_call_chain+0x2c/0x94
  3)     5256       0   raw_notifier_call_chain+0x34/0x44
  4)     5256       0   timekeeping_update.constprop.9+0xb8/0x114
  5)     5256       0   update_wall_time+0x408/0x6dc

One of the tracer functions, ftrace_call (or mcount), is unexpectedly listed.
The *bare* stack dump returned by save_stack_trace() is:
	save_stack_trace_tsk()
	save_stack_trace()
	stack_trace_call()
	ftrace_ops_no_ops()
	ftrace_call()
	notifier_call_chain()
	raw_notifier_call_chain()
	...

On arm64, save_stack_trace() calls save_stack_trace_tsk(), and this results
in an additional stack frame being put in the returned list.
This behavior, however, conflicts with the stack tracer's assumption
that the number of functions to be skipped as part of the tracer is 4, from
save_stack_trace() to mcount(), if ftrace_ops_list_func() is used.
The value is hard-coded in check_stack().

This patch refactors save_stack_trace() and save_stack_trace_tsk() in order
to reduce the stack depth by inlining the common code.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/kernel/stacktrace.c |   31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 407991b..978c923 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -97,34 +97,49 @@ static int save_trace(struct stackframe *frame, void *d)
 	return trace->nr_entries >= trace->max_entries;
 }
 
-void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+static inline void __save_stack_trace(struct stackframe *frame,
+			struct stack_trace *trace, int no_sched)
 {
 	struct stack_trace_data data;
-	struct stackframe frame;
 
 	data.trace = trace;
 	data.skip = trace->skip;
+	data.no_sched_functions = no_sched;
+
+	walk_stackframe(frame, save_trace, &data);
+	if (trace->nr_entries < trace->max_entries)
+		trace->entries[trace->nr_entries++] = ULONG_MAX;
+}
+
+void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+{
+	struct stackframe frame;
+	int no_sched;
 
 	if (tsk != current) {
-		data.no_sched_functions = 1;
+		no_sched = 1;
 		frame.fp = thread_saved_fp(tsk);
 		frame.sp = thread_saved_sp(tsk);
 		frame.pc = thread_saved_pc(tsk);
 	} else {
-		data.no_sched_functions = 0;
+		no_sched = 0;
 		frame.fp = (unsigned long)__builtin_frame_address(0);
 		frame.sp = current_stack_pointer;
 		frame.pc = (unsigned long)save_stack_trace_tsk;
 	}
 
-	walk_stackframe(&frame, save_trace, &data);
-	if (trace->nr_entries < trace->max_entries)
-		trace->entries[trace->nr_entries++] = ULONG_MAX;
+	__save_stack_trace(&frame, trace, no_sched);
 }
 
 void save_stack_trace(struct stack_trace *trace)
 {
-	save_stack_trace_tsk(current, trace);
+	struct stackframe frame;
+
+	frame.fp = (unsigned long)__builtin_frame_address(0);
+	frame.sp = current_stack_pointer;
+	frame.pc = (unsigned long)save_stack_trace_tsk;
+
+	__save_stack_trace(&frame, trace, 0);
 }
 EXPORT_SYMBOL_GPL(save_stack_trace);
 #endif
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 3/3] arm64: ftrace: mcount() should not create a stack frame
  2015-07-13  5:29 [RFC 0/3] arm64: ftrace: fix incorrect output from stack tracer AKASHI Takahiro
  2015-07-13  5:29 ` [RFC 1/3] ftrace: adjust a function's pc to search for in check_stack() for arm64 AKASHI Takahiro
  2015-07-13  5:29 ` [RFC 2/3] arm64: refactor save_stack_trace() AKASHI Takahiro
@ 2015-07-13  5:29 ` AKASHI Takahiro
  2015-07-13 15:01 ` [RFC 0/3] arm64: ftrace: fix incorrect output from stack tracer Jungseok Lee
  3 siblings, 0 replies; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-13  5:29 UTC (permalink / raw)
  To: linux-arm-kernel

Ftrace's stack tracer on arm64 returns wrong information about call stacks:

        Depth    Size   Location    (50 entries)
        -----    ----   --------
  0)     5256       0   notifier_call_chain+0x30/0x94
  1)     5256       0   ftrace_call+0x0/0x4
  2)     5256       0   notifier_call_chain+0x2c/0x94
  3)     5256       0   raw_notifier_call_chain+0x34/0x44
  4)     5256       0   timekeeping_update.constprop.9+0xb8/0x114
  5)     5256       0   update_wall_time+0x408/0x6dc

The instrumented function, notifier_call_chain(), appears twice.

On x86 (and other architectures), mcount (or ftrace_call) does not create a
new stack frame. This eventually results in the instrumented function not
being listed by save_stack_trace(), because the function's return address
does not appear as a saved lr on the stack. The stack tracer, instead,
explicitly adds this skipped function at the top of the list later in
check_stack(). That is why it is listed twice on arm64 as shown above.
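
For reference, the code in check_stack() being referred to (quoted from
kernel/trace/trace_stack.c at the time of writing, and also quoted later
in this thread) is:

	/*
	 * Add the passed in ip from the function tracer.
	 * Searching for this on the stack will skip over
	 * most of the overhead from the stack tracer itself.
	 */
	stack_dump_trace[0] = ip;
	max_stack_trace.nr_entries++;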

This patch modifies the arm64 function tracer not to create a stack frame
at mcount() in order to fix this issue.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/kernel/entry-ftrace.S |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index 08cafc5..c74fa12 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -26,9 +26,11 @@
  * as long as the kernel is compiled without -fomit-frame-pointer.
  * (or CONFIG_FRAME_POINTER, this is forced on arm64)
  *
+ * We don't update the frame pointer here because ftrace (in particular, the
+ * stack tracer) assumes we don't. If we did, the function would be listed twice.
  * stack layout after mcount_enter in _mcount():
  *
- * current sp/fp =>  0:+-----+
+ * current sp    =>  0:+-----+
  * in _mcount()        | x29 | -> instrumented function's fp
  *                     +-----+
  *                     | x30 | -> _mcount()'s lr (= instrumented function's pc)
@@ -47,7 +49,6 @@
 
 	.macro mcount_enter
 	stp	x29, x30, [sp, #-16]!
-	mov	x29, sp
 	.endm
 
 	.macro mcount_exit
@@ -61,7 +62,7 @@
 
 	/* for instrumented function's parent */
 	.macro mcount_get_parent_fp reg
-	ldr	\reg, [x29]
+	ldr	\reg, [sp]
 	ldr	\reg, [\reg]
 	.endm
 
@@ -71,18 +72,18 @@
 	.endm
 
 	.macro mcount_get_pc reg
-	ldr	\reg, [x29, #8]
+	ldr	\reg, [sp, #8]
 	mcount_adjust_addr	\reg, \reg
 	.endm
 
 	.macro mcount_get_lr reg
-	ldr	\reg, [x29]
+	ldr	\reg, [sp]
 	ldr	\reg, [\reg, #8]
 	mcount_adjust_addr	\reg, \reg
 	.endm
 
 	.macro mcount_get_lr_addr reg
-	ldr	\reg, [x29]
+	ldr	\reg, [sp]
 	add	\reg, \reg, #8
 	.endm
 
@@ -205,7 +206,7 @@ ENDPROC(ftrace_graph_caller)
  */
 ENTRY(return_to_handler)
 	str	x0, [sp, #-16]!
-	mov	x0, x29			//     parent's fp
+	mov	x0, sp			//     parent's fp
 	bl	ftrace_return_to_handler// addr = ftrace_return_to_hander(fp);
 	mov	x30, x0			// restore the original return address
 	ldr	x0, [sp], #16
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 0/3] arm64: ftrace: fix incorrect output from stack tracer
  2015-07-13  5:29 [RFC 0/3] arm64: ftrace: fix incorrect output from stack tracer AKASHI Takahiro
                   ` (2 preceding siblings ...)
  2015-07-13  5:29 ` [RFC 3/3] arm64: ftrace: mcount() should not create a stack frame AKASHI Takahiro
@ 2015-07-13 15:01 ` Jungseok Lee
  3 siblings, 0 replies; 53+ messages in thread
From: Jungseok Lee @ 2015-07-13 15:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Jul 13, 2015, at 2:29 PM, AKASHI Takahiro wrote:

Hi, AKASHI

> As reported in the thread below[1], the output from stack tracer using
> ftrace on arm64 seems to be incorrect due to different reasons. Each
> problem is described and fixed respectively in the following patches.
> Please see the commit messages for the details.
> 
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/354126.html
> 
> If the patch[1/3], which adds "#ifdef CONFIG_ARM64" to generic ftrace code,
> is not acceptable, we will have to introduce an arch-dependent function,
> ie. arch_check_stack().
> 
> Even with those patches, we see another issue that the values in 'Size'
> field are *inaccurate*. This is simply because we have no way to estimate
> the value of stack pointer at each function from the content of stack.
> Thus the function's reported stack size does not include its own local
> variables, but includes *its child's* ones.
> See more details below.
> 
> In my opinion, we cannot fix this issue unless analyzing the function's
> first instruction, ie. "stp x29, x30, [sp, #xx]!".
> 
> * How does stack tracer identify each function's stack size?
> 
> Take an example, func-0 calls func-1 and func-1 calls func-2.
> The stack layout looks like the below:
> ("p" is a temp variable in check_stack().)
> 
> sp2     +-------+ <--------- func-2's stackframe
>         |       |
>         |       |
> fp2     +-------+
>         |  fp1  |
>         +-------+ <-- p1 (*p1 == stack_dump_trace[i] == lr1)
>         |  lr1  |
>         +-------+
>         |       |
>         |  func-2's local variables
>         |       |
> sp1     +-------+ <--------- func-1(lr1)'s stackframe
>         |       |             (stack_dump_index[i] = top - p1)
>         |  func-1's dynamic local variables
>         |       |
> fp1     +-------+
>         |  fp0  |
>         +-------+ <-- p0 (*p0 == stack_dump_trace[i+1] == lr0)
>         |  lr0  |
>         +-------+
>         |       |
>         |  func-1's local variables
>         |       |
> sp0     +-------+ <--------- func-0(lr0)'s stackframe
>         |       |             (stack_dump_index[i+1] = top - p0)
>         |       |
>         *-------+ top
> 
> Stack tracer records the stack height of func-1 (== stack_dump_trace[i]):
> 	stack_dump_index[i] = <stack top> - <estimated stack pointer>
> in check_stack() by searching for func-1's return address (lr1)
> and eventually calculates func-1's stack size by:
> 	stack_dump_index[i] - stack_dump_index[i+1]
> 		=> (top - p1) - (top - p0)
> 		=> p0 - p1
> 
> On x86, this calculation is correct because x86's call instruction pushes
> the return address to the stack and jump into the child(func-2) function,
> thus the func-1's stack pointer is "p1" where *p1 is equal to
> stack_dump_trace[i]. But this is not true on arm64.
> 
> AKASHI Takahiro (3):
>  ftrace: adjust a function's pc to search for in check_stack() for
>    arm64
>  arm64: refactor save_stack_trace()
>  arm64: ftrace: mcount() should not create a stack frame

One example of stack_trace output with this patch set is as follows.
*0* sizes do not appear any more now.

        Depth    Size   Location    (36 entries)
        -----    ----   --------
  0)     4504      96   put_cpu_partial+0x2c/0x1d0
  1)     4408      80   get_partial_node.isra.64+0x13c/0x344
  2)     4328     256   __slab_alloc.isra.65.constprop.67+0xd8/0x37c
  3)     4072      32   kmem_cache_alloc+0x258/0x294
  4)     4040     304   __alloc_skb+0x48/0x180
  5)     3736      96   alloc_skb_with_frags+0x74/0x234
  6)     3640     112   sock_alloc_send_pskb+0x1d0/0x294
  7)     3528     160   sock_alloc_send_skb+0x44/0x54
  8)     3368      64   __ip_append_data.isra.40+0x78c/0xb48
  9)     3304     224   ip_append_data.part.42+0x98/0xe8
 10)     3080     112   ip_append_data+0x68/0x7c
 11)     2968      96   icmp_push_reply+0x7c/0x144
 12)     2872      96   icmp_send+0x3c0/0x3c8
 13)     2776     192   __udp4_lib_rcv+0x5b8/0x684
 14)     2584      96   udp_rcv+0x2c/0x3c
 15)     2488      32   ip_local_deliver+0xa0/0x224
 16)     2456      48   ip_rcv+0x360/0x57c
 17)     2408      64   __netif_receive_skb_core+0x4d0/0x80c
 18)     2344     128   __netif_receive_skb+0x24/0x84
 19)     2216      32   process_backlog+0x9c/0x15c
 20)     2184      80   net_rx_action+0x1ec/0x32c
 21)     2104     160   __do_softirq+0x114/0x2f0
 22)     1944     128   do_softirq+0x60/0x68
 23)     1816      32   __local_bh_enable_ip+0xb0/0xd4
 24)     1784      32   ip_finish_output+0x1f4/0xabc
 25)     1752      96   ip_output+0xf0/0x120
 26)     1656      64   ip_local_out_sk+0x44/0x54
 27)     1592      32   ip_send_skb+0x24/0xbc
 28)     1560      48   udp_send_skb+0x1b4/0x2f4
 29)     1512      80   udp_sendmsg+0x2a8/0x7a0
 30)     1432     272   inet_sendmsg+0xa0/0xd0
 31)     1160      48   sock_sendmsg+0x30/0x78
 32)     1112      32   ___sys_sendmsg+0x15c/0x26c
 33)     1080     400   __sys_sendmmsg+0x94/0x180
 34)      680     320   SyS_sendmmsg+0x38/0x54
 35)      360     360   el0_svc_naked+0x20/0x28

I will leave more comments after reviewing and testing them carefully.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 1/3] ftrace: adjust a function's pc to search for in check_stack() for arm64
  2015-07-13  5:29 ` [RFC 1/3] ftrace: adjust a function's pc to search for in check_stack() for arm64 AKASHI Takahiro
@ 2015-07-13 15:24   ` Steven Rostedt
  2015-07-15  0:22     ` AKASHI Takahiro
  0 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-13 15:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 13 Jul 2015 14:29:33 +0900
AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:

> Ftrace's stack tracer on arm64 returns wrong information about call stacks:
> 
>         Depth    Size   Location    (50 entries)
>         -----    ----   --------
>   0)     5256       0   notifier_call_chain+0x30/0x94
>   1)     5256       0   ftrace_call+0x0/0x4
>   2)     5256       0   notifier_call_chain+0x2c/0x94
>   3)     5256       0   raw_notifier_call_chain+0x34/0x44
>   4)     5256       0   timekeeping_update.constprop.9+0xb8/0x114
>   5)     5256       0   update_wall_time+0x408/0x6dc
> 
> Most of 'Size' fields are unexpectedly zero.
> 
> This is because stack tracer fails to recognize each function's stack frame
> in check_stack(). Stack tracer searches for a function's pc in the stack
> based on the list returned by save_stack_trace(), but save_stack_trace() on
> arm64 does not return the exact return address saved in a stack frame, but
> a value decremented by 4 (which means a branch instruction's address).
> This behavior was introduced by
>     commit e306dfd06fcb ("ARM64: unwind: Fix PC calculation")
> 
> So the matching doesn't succeed in most cases.
> 
> This problem can be fixed either by
> a) reverting the commit above
> b) adding an arm64-specific hack to check_stack()
> 
> This patch does b).
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  kernel/trace/trace_stack.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 3f34496..7086fc3 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -143,7 +143,11 @@ check_stack(unsigned long ip, unsigned long *stack)
>  		p = start;
>  
>  		for (; p < top && i < max_stack_trace.nr_entries; p++) {
> +#ifdef CONFIG_ARM64
> +			if (*p == (stack_dump_trace[i] + 4)) {
> +#else
>  			if (*p == stack_dump_trace[i]) {
> +#endif

Instead of the ugly #ifdef in this code, please add a macro
FTRACE_STACK_FRAME_OFFSET

Then in include/linux/ftrace.h have:

#ifndef FTRACE_STACK_FRAME_OFFSET
# define FTRACE_STACK_FRAME_OFFSET 0
#endif

And in arch/arm64/include/asm/ftrace.h

#define FTRACE_STACK_FRAME_OFFSET 4

And then just do:

	if (*p == (stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET)) {

-- Steve

>  				this_size = stack_dump_index[i++] =
>  					(top - p) * sizeof(unsigned long);
>  				found = 1;

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-13  5:29 ` [RFC 2/3] arm64: refactor save_stack_trace() AKASHI Takahiro
@ 2015-07-14 12:47   ` Jungseok Lee
  2015-07-14 13:31     ` Steven Rostedt
  0 siblings, 1 reply; 53+ messages in thread
From: Jungseok Lee @ 2015-07-14 12:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Jul 13, 2015, at 2:29 PM, AKASHI Takahiro wrote:

Hi, AKASHI

> Ftrace's stack tracer on arm64 returns wrong information about call stacks:
> 
>        Depth    Size   Location    (50 entries)
>        -----    ----   --------
>  0)     5256       0   notifier_call_chain+0x30/0x94
>  1)     5256       0   ftrace_call+0x0/0x4
>  2)     5256       0   notifier_call_chain+0x2c/0x94
>  3)     5256       0   raw_notifier_call_chain+0x34/0x44
>  4)     5256       0   timekeeping_update.constprop.9+0xb8/0x114
>  5)     5256       0   update_wall_time+0x408/0x6dc
> 
> One of tracer functions, ftrace_call (or mcount), is unexpectedly listed.
> The *bare* stack dump returned by save_stack_trace() is:
> 	save_stack_trace_tsk()
> 	save_stack_trace()
> 	stack_trace_call()
> 	ftrace_ops_no_ops()
> 	ftrace_call()
> 	notifier_call_chain()
> 	raw_notifier_call_chain()
> 	...

Is the below example an unexpected result?
Entries 17 and 18 are ftrace_ops_no_ops and ftrace_call, respectively.

        Depth    Size   Location    (51 entries)
        -----    ----   --------
  0)     5352      96   _raw_spin_unlock_irqrestore+0x1c/0x60
  1)     5256      48   gic_raise_softirq+0xa0/0xbc
  2)     5208      80   smp_cross_call+0x40/0xbc
  3)     5128      48   smp_send_reschedule+0x38/0x48
  4)     5080      32   trigger_load_balance+0x184/0x29c
  5)     5048     112   scheduler_tick+0xac/0x104
  6)     4936      64   update_process_times+0x5c/0x74
  7)     4872      32   tick_sched_handle.isra.15+0x38/0x7c
  8)     4840      48   tick_sched_timer+0x48/0x90
  9)     4792      48   __run_hrtimer+0x60/0x258
 10)     4744      64   hrtimer_interrupt+0xe8/0x260
 11)     4680     128   arch_timer_handler_virt+0x38/0x48
 12)     4552      32   handle_percpu_devid_irq+0x84/0x188
 13)     4520      64   generic_handle_irq+0x38/0x54
 14)     4456      32   __handle_domain_irq+0x68/0xbc
 15)     4424      64   gic_handle_irq+0x38/0x88
 16)     4360     280   el1_irq+0x64/0xd8
 17)     4080     168   ftrace_ops_no_ops+0xb4/0x16c
 18)     3912      32   ftrace_call+0x0/0x4
 19)     3880     144   __alloc_skb+0x48/0x180
 20)     3736      96   alloc_skb_with_frags+0x74/0x234
 21)     3640     112   sock_alloc_send_pskb+0x1d0/0x294
 22)     3528     160   sock_alloc_send_skb+0x44/0x54
 23)     3368      64   __ip_append_data.isra.40+0x78c/0xb48
 24)     3304     224   ip_append_data.part.42+0x98/0xe8
 25)     3080     112   ip_append_data+0x68/0x7c
 26)     2968      96   icmp_push_reply+0x7c/0x144
 27)     2872      96   icmp_send+0x3c0/0x3c8
 28)     2776     192   __udp4_lib_rcv+0x5b8/0x684
 29)     2584      96   udp_rcv+0x2c/0x3c
 30)     2488      32   ip_local_deliver+0xa0/0x224
 31)     2456      48   ip_rcv+0x360/0x57c
 32)     2408      64   __netif_receive_skb_core+0x4d0/0x80c
 33)     2344     128   __netif_receive_skb+0x24/0x84
 34)     2216      32   process_backlog+0x9c/0x15c
 35)     2184      80   net_rx_action+0x1ec/0x32c
 36)     2104     160   __do_softirq+0x114/0x2f0
 37)     1944     128   do_softirq+0x60/0x68
 38)     1816      32   __local_bh_enable_ip+0xb0/0xd4
 39)     1784      32   ip_finish_output+0x1f4/0xabc
 40)     1752      96   ip_output+0xf0/0x120
 41)     1656      64   ip_local_out_sk+0x44/0x54
 42)     1592      32   ip_send_skb+0x24/0xbc
 43)     1560      48   udp_send_skb+0x1b4/0x2f4
 44)     1512      80   udp_sendmsg+0x2a8/0x7a0
 45)     1432     272   inet_sendmsg+0xa0/0xd0
 46)     1160      48   sock_sendmsg+0x30/0x78
 47)     1112      32   ___sys_sendmsg+0x15c/0x26c
 48)     1080     400   __sys_sendmmsg+0x94/0x180
 49)      680     320   SyS_sendmmsg+0x38/0x54
 50)      360     360   el0_svc_naked+0x20/0x28

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-14 12:47   ` Jungseok Lee
@ 2015-07-14 13:31     ` Steven Rostedt
  2015-07-15  0:20       ` AKASHI Takahiro
  0 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-14 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 14 Jul 2015 21:47:10 +0900
Jungseok Lee <jungseoklee85@gmail.com> wrote:

> Is the below example an unexpected result?
> Entry 17 and 18 are ftrace_call and ftrace_ops_no_ops, respectively.
> 
>         Depth    Size   Location    (51 entries)
>         -----    ----   --------
>   0)     5352      96   _raw_spin_unlock_irqrestore+0x1c/0x60
>   1)     5256      48   gic_raise_softirq+0xa0/0xbc
>   2)     5208      80   smp_cross_call+0x40/0xbc
>   3)     5128      48   smp_send_reschedule+0x38/0x48
>   4)     5080      32   trigger_load_balance+0x184/0x29c
>   5)     5048     112   scheduler_tick+0xac/0x104
>   6)     4936      64   update_process_times+0x5c/0x74
>   7)     4872      32   tick_sched_handle.isra.15+0x38/0x7c
>   8)     4840      48   tick_sched_timer+0x48/0x90
>   9)     4792      48   __run_hrtimer+0x60/0x258
>  10)     4744      64   hrtimer_interrupt+0xe8/0x260
>  11)     4680     128   arch_timer_handler_virt+0x38/0x48
>  12)     4552      32   handle_percpu_devid_irq+0x84/0x188
>  13)     4520      64   generic_handle_irq+0x38/0x54
>  14)     4456      32   __handle_domain_irq+0x68/0xbc
>  15)     4424      64   gic_handle_irq+0x38/0x88
>  16)     4360     280   el1_irq+0x64/0xd8

Note, function tracing does not disable interrupts. It looks like an
interrupt came in while __alloc_skb() was being traced.

-- Steve

>  17)     4080     168   ftrace_ops_no_ops+0xb4/0x16c
>  18)     3912      32   ftrace_call+0x0/0x4
>  19)     3880     144   __alloc_skb+0x48/0x180
>  20)     3736      96   alloc_skb_with_frags+0x74/0x234
>  21)     3640     112   sock_alloc_send_pskb+0x1d0/0x294
>  22)     3528     160   sock_alloc_send_skb+0x44/0x54
>  23)     3368      64   __ip_append_data.isra.40+0x78c/0xb48
>  24)     3304     224   ip_append_data.part.42+0x98/0xe8
>  25)     3080     112   ip_append_data+0x68/0x7c
>  26)     2968      96   icmp_push_reply+0x7c/0x144
>  27)     2872      96   icmp_send+0x3c0/0x3c8
>  28)     2776     192   __udp4_lib_rcv+0x5b8/0x684
>  29)     2584      96   udp_rcv+0x2c/0x3c
>  30)     2488      32   ip_local_deliver+0xa0/0x224
>  31)     2456      48   ip_rcv+0x360/0x57c
>  32)     2408      64   __netif_receive_skb_core+0x4d0/0x80c
>  33)     2344     128   __netif_receive_skb+0x24/0x84
>  34)     2216      32   process_backlog+0x9c/0x15c
>  35)     2184      80   net_rx_action+0x1ec/0x32c
>  36)     2104     160   __do_softirq+0x114/0x2f0
>  37)     1944     128   do_softirq+0x60/0x68
>  38)     1816      32   __local_bh_enable_ip+0xb0/0xd4
>  39)     1784      32   ip_finish_output+0x1f4/0xabc
>  40)     1752      96   ip_output+0xf0/0x120
>  41)     1656      64   ip_local_out_sk+0x44/0x54
>  42)     1592      32   ip_send_skb+0x24/0xbc
>  43)     1560      48   udp_send_skb+0x1b4/0x2f4
>  44)     1512      80   udp_sendmsg+0x2a8/0x7a0
>  45)     1432     272   inet_sendmsg+0xa0/0xd0
>  46)     1160      48   sock_sendmsg+0x30/0x78
>  47)     1112      32   ___sys_sendmsg+0x15c/0x26c
>  48)     1080     400   __sys_sendmmsg+0x94/0x180
>  49)      680     320   SyS_sendmmsg+0x38/0x54
>  50)      360     360   el0_svc_naked+0x20/0x28
> 
> Best Regards
> Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-14 13:31     ` Steven Rostedt
@ 2015-07-15  0:20       ` AKASHI Takahiro
  2015-07-15  2:51         ` Steven Rostedt
  0 siblings, 1 reply; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-15  0:20 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/14/2015 10:31 PM, Steven Rostedt wrote:
> On Tue, 14 Jul 2015 21:47:10 +0900
> Jungseok Lee <jungseoklee85@gmail.com> wrote:
>
>> Is the below example an unexpected result?
>> Entry 17 and 18 are ftrace_call and ftrace_ops_no_ops, respectively.

[snip]

> Note, function tracing does not disable interrupts. This looks to be
> that an interrupt came in while __alloc_skb() was being traced.

Yeah, I think so, too. But if my insight is correct, it's not __alloc_skb()
but one of the functions it calls. As I said in the commit log message
of patch[1/3], the exact traced function will not be listed by
save_stack_trace() because we don't create a stack frame at mcount().
I think this is a flaw in the current implementation (on x86).

What do you think, Steve?

-Takahiro AKASHI

>
> -- Steve
>
>>   17)     4080     168   ftrace_ops_no_ops+0xb4/0x16c
>>   18)     3912      32   ftrace_call+0x0/0x4
>>   19)     3880     144   __alloc_skb+0x48/0x180
>>   20)     3736      96   alloc_skb_with_frags+0x74/0x234
>>   21)     3640     112   sock_alloc_send_pskb+0x1d0/0x294
>>   22)     3528     160   sock_alloc_send_skb+0x44/0x54
>>   23)     3368      64   __ip_append_data.isra.40+0x78c/0xb48
>>   24)     3304     224   ip_append_data.part.42+0x98/0xe8
>>   25)     3080     112   ip_append_data+0x68/0x7c
>>   26)     2968      96   icmp_push_reply+0x7c/0x144
>>   27)     2872      96   icmp_send+0x3c0/0x3c8
>>   28)     2776     192   __udp4_lib_rcv+0x5b8/0x684
>>   29)     2584      96   udp_rcv+0x2c/0x3c
>>   30)     2488      32   ip_local_deliver+0xa0/0x224
>>   31)     2456      48   ip_rcv+0x360/0x57c
>>   32)     2408      64   __netif_receive_skb_core+0x4d0/0x80c
>>   33)     2344     128   __netif_receive_skb+0x24/0x84
>>   34)     2216      32   process_backlog+0x9c/0x15c
>>   35)     2184      80   net_rx_action+0x1ec/0x32c
>>   36)     2104     160   __do_softirq+0x114/0x2f0
>>   37)     1944     128   do_softirq+0x60/0x68
>>   38)     1816      32   __local_bh_enable_ip+0xb0/0xd4
>>   39)     1784      32   ip_finish_output+0x1f4/0xabc
>>   40)     1752      96   ip_output+0xf0/0x120
>>   41)     1656      64   ip_local_out_sk+0x44/0x54
>>   42)     1592      32   ip_send_skb+0x24/0xbc
>>   43)     1560      48   udp_send_skb+0x1b4/0x2f4
>>   44)     1512      80   udp_sendmsg+0x2a8/0x7a0
>>   45)     1432     272   inet_sendmsg+0xa0/0xd0
>>   46)     1160      48   sock_sendmsg+0x30/0x78
>>   47)     1112      32   ___sys_sendmsg+0x15c/0x26c
>>   48)     1080     400   __sys_sendmmsg+0x94/0x180
>>   49)      680     320   SyS_sendmmsg+0x38/0x54
>>   50)      360     360   el0_svc_naked+0x20/0x28
>>
>> Best Regards
>> Jungseok Lee
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 1/3] ftrace: adjust a function's pc to search for in check_stack() for arm64
  2015-07-13 15:24   ` Steven Rostedt
@ 2015-07-15  0:22     ` AKASHI Takahiro
  0 siblings, 0 replies; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-15  0:22 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/14/2015 12:24 AM, Steven Rostedt wrote:
> On Mon, 13 Jul 2015 14:29:33 +0900
> AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:
>
>> Ftace's stack tracer on arm64 returns wrong information about call stacks:
>>
>>          Depth    Size   Location    (50 entries)
>>          -----    ----   --------
>>    0)     5256       0   notifier_call_chain+0x30/0x94
>>    1)     5256       0   ftrace_call+0x0/0x4
>>    2)     5256       0   notifier_call_chain+0x2c/0x94
>>    3)     5256       0   raw_notifier_call_chain+0x34/0x44
>>    4)     5256       0   timekeeping_update.constprop.9+0xb8/0x114
>>    5)     5256       0   update_wall_time+0x408/0x6dc
>>
>> Most of 'Size' fields are unexpectedly zero.
>>
>> This is because stack tracer fails to recognize each function's stack frame
>> in check_stack(). Stack tracer searches for a function's pc in the stack
>> based on the list returned by save_stack_trace(), but save_stack_trace() on
>> arm64 does not return the exact return address saved in a stack frame, but
>> a value decrmented by 4 (which means a branch instruction's address).
>> This behavior was introduced by
>>      commit e306dfd06fcb ("ARM64: unwind: Fix PC calculation")
>>
>> So the matching doesn't succeed in most cases.
>>
>> This problem can be fixed either by
>> a) reverting the commit above
>> b) adding an arm64-specific hack to check_patch()
>>
>> This patch does b).
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   kernel/trace/trace_stack.c |    4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
>> index 3f34496..7086fc3 100644
>> --- a/kernel/trace/trace_stack.c
>> +++ b/kernel/trace/trace_stack.c
>> @@ -143,7 +143,11 @@ check_stack(unsigned long ip, unsigned long *stack)
>>   		p = start;
>>
>>   		for (; p < top && i < max_stack_trace.nr_entries; p++) {
>> +#ifdef CONFIG_ARM64
>> +			if (*p == (stack_dump_trace[i] + 4)) {
>> +#else
>>   			if (*p == stack_dump_trace[i]) {
>> +#endif
>
> Instead of the ugly #ifdef in this code, please add a macro
> FTRACE_STACK_FRAME_OFFSET
>
> Then in include/linux/ftrace.h have:
>
> #ifndef FTRACE_STACK_FRAME_OFFSET
> # define FTRACE_STACK_FRAME_OFFSET 0
> #endif
>
> And in arch/arm64/include/asm/ftrace.h
>
> #define FTRACE_STACK_FRAME_OFFSET 4
>
> And then just do:
>
> 	if (*p == (stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET)) {

Yes.

Thanks,
-Takahiro AKASHI

> -- Steve
>
>>   				this_size = stack_dump_index[i++] =
>>   					(top - p) * sizeof(unsigned long);
>>   				found = 1;
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-15  0:20       ` AKASHI Takahiro
@ 2015-07-15  2:51         ` Steven Rostedt
  2015-07-15 11:41           ` AKASHI Takahiro
  0 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-15  2:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 15 Jul 2015 09:20:42 +0900
AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:

> On 07/14/2015 10:31 PM, Steven Rostedt wrote:
> > On Tue, 14 Jul 2015 21:47:10 +0900
> > Jungseok Lee <jungseoklee85@gmail.com> wrote:
> >
> >> Is the below example an unexpected result?
> >> Entry 17 and 18 are ftrace_call and ftrace_ops_no_ops, respectively.
> 
> [snip]
> 
> > Note, function tracing does not disable interrupts. This looks to be
> > that an interrupt came in while __aloc_skb() was being traced.
> 
> Yeah, I think so, too. But if my insight is correct, it's not __alloc_skb()
> but one of functions that it calls. As I said in the commit log message
> of patch[1/3], the exact traced function will not be listed by
> save_stack_trace() because we don't create a stack frame at mcount().
> I think this is a flaw in the current implementation (on x86).
> 
> what do you think, Steve?
> 

mcount (well ftrace_call actually) does indeed create a stack frame for
itself *and* for what called it. At least on x86_64. See mcount_64.S.

With -pg -mfentry, it creates a stack frame. Without -mfentry, mcount
is called after the current function's frame is made so we don't need
to do much.

Here's what the -mfentry version does:

	pushq %rbp
	pushq 8*2(%rsp)  /* this is the parent pointer */
	pushq %rbp
	movq %rsp, %rbp
	pushq 8*3(%rsp)   /* Return address to ftrace_call */
	pushq %rbp
	movq %rsp, %rbp


Thus the stack looks like this:

                                         <---+
	|                              |     |
	+------------------------------+     |
	| return address for func      |     |
	| return address for func_call |     |
	| original %rbp                |     |
	+------------------------------+     |
	| return address for func      |     |
	| ptr to parent frame (%rbp)   | ----+
 	+------------------------------| <-----+
	| return address for func_call |       |
        | ptr to next frame (%rbp)     | ------+
	+------------------------------+ <---+
                                             |
                                             |
 Current %rbp points to func_call frame -----+

 The first box isn't used as a frame, but is used by ftrace_call to save
 information to restore everything properly.

Thus, __alloc_skb() is what is currently being traced.


-- Steve

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-15  2:51         ` Steven Rostedt
@ 2015-07-15 11:41           ` AKASHI Takahiro
  2015-07-15 14:55             ` Steven Rostedt
  0 siblings, 1 reply; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-15 11:41 UTC (permalink / raw)
  To: linux-arm-kernel

Steve,

On 07/15/2015 11:51 AM, Steven Rostedt wrote:
> On Wed, 15 Jul 2015 09:20:42 +0900
> AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:
>
>> On 07/14/2015 10:31 PM, Steven Rostedt wrote:
>>> On Tue, 14 Jul 2015 21:47:10 +0900
>>> Jungseok Lee <jungseoklee85@gmail.com> wrote:
>>>
>>>> Is the below example an unexpected result?
>>>> Entry 17 and 18 are ftrace_call and ftrace_ops_no_ops, respectively.
>>
>> [snip]
>>
>>> Note, function tracing does not disable interrupts. This looks to be
>>> that an interrupt came in while __aloc_skb() was being traced.
>>
>> Yeah, I think so, too. But if my insight is correct, it's not __alloc_skb()
>> but one of functions that it calls. As I said in the commit log message
>> of patch[1/3], the exact traced function will not be listed by

not patch[1/3], but patch[3/3]

>> save_stack_trace() because we don't create a stack frame at mcount().
>> I think this is a flaw in the current implementation (on x86).
>>
>> what do you think, Steve?
>>
>
> mcount (well ftrace_call actually) does indeed create a stack frame for
> itself *and* for what called it. At least on x86_64. See mcount_64.S.
>
> With -pg -mfentry, it creates a stack frame. Without -mfentry, mcount
> is called after the current function's frame is made so we don't need
> to do much.

Thank you for the explanation. But what I don't really understand here
is why we need to add the "current function" to the stack dump list
returned by save_stack_trace():

In check_stack(),
 >        /*
 >         * Add the passed in ip from the function tracer.
 >         * Searching for this on the stack will skip over
 >         * most of the overhead from the stack tracer itself.
 >         */
 >        stack_dump_trace[0] = ip;
 >        max_stack_trace.nr_entries++;

I think that "ip" here is the "return address for func" in your
ascii art, and it should already be in the list if a frame is made
by mcount (or func_call).

In fact, stack tracer on arm64 works OK even without the patch[3/3]
if the code quoted above is commented out.
Even on x86, before the following commit the code was conditional and
was not activated if the kernel was compiled without -mfentry:
     commit 4df297129f62 ("tracing: Remove most or all of stack tracer stack size from stack_max_size")

So what do I misunderstand here?

Thanks,
-Takahiro AKASHI

> Here's what the -mfentry version does:
>
> 	pushq %rbp
> 	pushq 8*2(%rsp)  /* this is the parent pointer */
> 	pushq %rbp
> 	movq %rsp, %rbp
> 	pushq 8*3(%rsp)   /* Return address to ftrace_call */
> 	pushq %rbp
> 	movq %rsp, %rbp
>
>
> Thus the stack looks like this:
>
>                                           <---+
> 	|                              |     |
> 	+------------------------------+     |
> 	| return address for func      |     |
> 	| return address for func_call |     |
> 	| original %rbp                |     |
> 	+------------------------------+     |
> 	| return address for func      |     |
> 	| ptr to parent frame (%rbp)   | ----+
>   	+------------------------------| <-----+
> 	| return address for func_call |       |
>          | ptr to next frame (%rbp)     | ------+
> 	+------------------------------+ <---+
>                                               |
>                                               |
>   Current %rbp points to func_call frame -----+
>
>   The first box isn't used as a frame, but is used by ftrace_call to save
>   information to restore everything properly.
>
> Thus, __alloc_skb() is what is currently being traced.
>
>
> -- Steve
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-15 11:41           ` AKASHI Takahiro
@ 2015-07-15 14:55             ` Steven Rostedt
  2015-07-15 16:13               ` Steven Rostedt
  0 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-15 14:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 15 Jul 2015 20:41:34 +0900
AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:


> Thank you for the explanation. But what I don't really understand here
> is why we need to add the "current function" to the stack dump list
> returned by save_stack_trace():
> 
> In check_stack(),
>  >        /*
>  >         * Add the passed in ip from the function tracer.
>  >         * Searching for this on the stack will skip over
>  >         * most of the overhead from the stack tracer itself.
>  >         */
>  >        stack_dump_trace[0] = ip;
>  >        max_stack_trace.nr_entries++;
> 
> I think that "ip" here is the "return address for func" in your

Ah, you are correct (for fentry).

> ascii art, and it should be already in the list if a frame is made
> by mcount (or func_call).
> 
> In fact, stack tracer on arm64 works OK even without the patch[3/3]
> if the code quoted above is commented out.
> Even on x86, the code is conditional and not activated if the kernel
> is compiled without -mfentry before the following commit:
>      commit 4df297129f62 ("tracing: Remove most or all of stack tracer stack size from stack_max_size")
> 
> So what do I misunderstand here?
> 

Hmm, I haven't touched the stack trace code in a while. With the added
stack frame for fentry, the hacks there are probably not needed.

I'll take a look at it and try to clean up the code.

Thanks,

-- Steve

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-15 14:55             ` Steven Rostedt
@ 2015-07-15 16:13               ` Steven Rostedt
  2015-07-16  0:27                 ` AKASHI Takahiro
  0 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-15 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 15 Jul 2015 10:55:36 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:


> I'll take a look at it and try to clean up the code.

Does the following patch make sense to you?

-- Steve

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 3f34496244e9..9384647d07c3 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -18,12 +18,6 @@
 
 #define STACK_TRACE_ENTRIES 500
 
-#ifdef CC_USING_FENTRY
-# define fentry		1
-#else
-# define fentry		0
-#endif
-
 static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES+1] =
 	 { [0 ... (STACK_TRACE_ENTRIES)] = ULONG_MAX };
 static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
@@ -35,7 +29,7 @@ static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
  */
 static struct stack_trace max_stack_trace = {
 	.max_entries		= STACK_TRACE_ENTRIES - 1,
-	.entries		= &stack_dump_trace[1],
+	.entries		= &stack_dump_trace[0],
 };
 
 static unsigned long max_stack_size;
@@ -77,7 +71,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 	unsigned long this_size, flags; unsigned long *p, *top, *start;
 	static int tracer_frame;
 	int frame_size = ACCESS_ONCE(tracer_frame);
-	int i;
+	int i, x;
 
 	this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
 	this_size = THREAD_SIZE - this_size;
@@ -105,26 +99,20 @@ check_stack(unsigned long ip, unsigned long *stack)
 	max_stack_size = this_size;
 
 	max_stack_trace.nr_entries = 0;
-
-	if (using_ftrace_ops_list_func())
-		max_stack_trace.skip = 4;
-	else
-		max_stack_trace.skip = 3;
+	max_stack_trace.skip = 0;
 
 	save_stack_trace(&max_stack_trace);
 
-	/*
-	 * Add the passed in ip from the function tracer.
-	 * Searching for this on the stack will skip over
-	 * most of the overhead from the stack tracer itself.
-	 */
-	stack_dump_trace[0] = ip;
-	max_stack_trace.nr_entries++;
+	/* Skip over the overhead of the stack tracer itself */
+	for (i = 0; i < max_stack_trace.nr_entries; i++) {
+		if (stack_dump_trace[i] == ip)
+			break;
+	}
 
 	/*
 	 * Now find where in the stack these are.
 	 */
-	i = 0;
+	x = 0;
 	start = stack;
 	top = (unsigned long *)
 		(((unsigned long)start & ~(THREAD_SIZE-1)) + THREAD_SIZE);
@@ -139,12 +127,13 @@ check_stack(unsigned long ip, unsigned long *stack)
 	while (i < max_stack_trace.nr_entries) {
 		int found = 0;
 
-		stack_dump_index[i] = this_size;
+		stack_dump_index[x] = this_size;
 		p = start;
 
 		for (; p < top && i < max_stack_trace.nr_entries; p++) {
 			if (*p == stack_dump_trace[i]) {
-				this_size = stack_dump_index[i++] =
+				stack_dump_trace[x] = stack_dump_trace[i++];
+				this_size = stack_dump_index[x++] =
 					(top - p) * sizeof(unsigned long);
 				found = 1;
 				/* Start the search from here */
@@ -168,6 +157,9 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
+	for (; x < max_stack_trace.nr_entries; x++)
+		stack_dump_trace[x] = ULONG_MAX;
+
 	if (task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
@@ -192,24 +184,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
 	if (per_cpu(trace_active, cpu)++ != 0)
 		goto out;
 
-	/*
-	 * When fentry is used, the traced function does not get
-	 * its stack frame set up, and we lose the parent.
-	 * The ip is pretty useless because the function tracer
-	 * was called before that function set up its stack frame.
-	 * In this case, we use the parent ip.
-	 *
-	 * By adding the return address of either the parent ip
-	 * or the current ip we can disregard most of the stack usage
-	 * caused by the stack tracer itself.
-	 *
-	 * The function tracer always reports the address of where the
-	 * mcount call was, but the stack will hold the return address.
-	 */
-	if (fentry)
-		ip = parent_ip;
-	else
-		ip += MCOUNT_INSN_SIZE;
+	ip += MCOUNT_INSN_SIZE;
 
 	check_stack(ip, &stack);
 

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-15 16:13               ` Steven Rostedt
@ 2015-07-16  0:27                 ` AKASHI Takahiro
  2015-07-16  1:08                   ` AKASHI Takahiro
  0 siblings, 1 reply; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-16  0:27 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/16/2015 01:13 AM, Steven Rostedt wrote:
> On Wed, 15 Jul 2015 10:55:36 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
>
>> I'll take a look at it and try to clean up the code.
>
> Does the  following patch make sense for you?

Looks nice. The patch greatly simplifies the changes on the arm64 side.

- Takahiro AKASHI

> -- Steve
>
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 3f34496244e9..9384647d07c3 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -18,12 +18,6 @@
>
>   #define STACK_TRACE_ENTRIES 500
>
> -#ifdef CC_USING_FENTRY
> -# define fentry		1
> -#else
> -# define fentry		0
> -#endif
> -
>   static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES+1] =
>   	 { [0 ... (STACK_TRACE_ENTRIES)] = ULONG_MAX };
>   static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
> @@ -35,7 +29,7 @@ static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
>    */
>   static struct stack_trace max_stack_trace = {
>   	.max_entries		= STACK_TRACE_ENTRIES - 1,
> -	.entries		= &stack_dump_trace[1],
> +	.entries		= &stack_dump_trace[0],
>   };
>
>   static unsigned long max_stack_size;
> @@ -77,7 +71,7 @@ check_stack(unsigned long ip, unsigned long *stack)
>   	unsigned long this_size, flags; unsigned long *p, *top, *start;
>   	static int tracer_frame;
>   	int frame_size = ACCESS_ONCE(tracer_frame);
> -	int i;
> +	int i, x;
>
>   	this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
>   	this_size = THREAD_SIZE - this_size;
> @@ -105,26 +99,20 @@ check_stack(unsigned long ip, unsigned long *stack)
>   	max_stack_size = this_size;
>
>   	max_stack_trace.nr_entries = 0;
> -
> -	if (using_ftrace_ops_list_func())
> -		max_stack_trace.skip = 4;
> -	else
> -		max_stack_trace.skip = 3;
> +	max_stack_trace.skip = 0;
>
>   	save_stack_trace(&max_stack_trace);
>
> -	/*
> -	 * Add the passed in ip from the function tracer.
> -	 * Searching for this on the stack will skip over
> -	 * most of the overhead from the stack tracer itself.
> -	 */
> -	stack_dump_trace[0] = ip;
> -	max_stack_trace.nr_entries++;
> +	/* Skip over the overhead of the stack tracer itself */
> +	for (i = 0; i < max_stack_trace.nr_entries; i++) {
> +		if (stack_dump_trace[i] == ip)
> +			break;
> +	}
>
>   	/*
>   	 * Now find where in the stack these are.
>   	 */
> -	i = 0;
> +	x = 0;
>   	start = stack;
>   	top = (unsigned long *)
>   		(((unsigned long)start & ~(THREAD_SIZE-1)) + THREAD_SIZE);
> @@ -139,12 +127,13 @@ check_stack(unsigned long ip, unsigned long *stack)
>   	while (i < max_stack_trace.nr_entries) {
>   		int found = 0;
>
> -		stack_dump_index[i] = this_size;
> +		stack_dump_index[x] = this_size;
>   		p = start;
>
>   		for (; p < top && i < max_stack_trace.nr_entries; p++) {
>   			if (*p == stack_dump_trace[i]) {
> -				this_size = stack_dump_index[i++] =
> +				stack_dump_trace[x] = stack_dump_trace[i++];
> +				this_size = stack_dump_index[x++] =
>   					(top - p) * sizeof(unsigned long);
>   				found = 1;
>   				/* Start the search from here */
> @@ -168,6 +157,9 @@ check_stack(unsigned long ip, unsigned long *stack)
>   			i++;
>   	}
>
> +	for (; x < max_stack_trace.nr_entries; x++)
> +		stack_dump_trace[x] = ULONG_MAX;
> +
>   	if (task_stack_end_corrupted(current)) {
>   		print_max_stack();
>   		BUG();
> @@ -192,24 +184,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
>   	if (per_cpu(trace_active, cpu)++ != 0)
>   		goto out;
>
> -	/*
> -	 * When fentry is used, the traced function does not get
> -	 * its stack frame set up, and we lose the parent.
> -	 * The ip is pretty useless because the function tracer
> -	 * was called before that function set up its stack frame.
> -	 * In this case, we use the parent ip.
> -	 *
> -	 * By adding the return address of either the parent ip
> -	 * or the current ip we can disregard most of the stack usage
> -	 * caused by the stack tracer itself.
> -	 *
> -	 * The function tracer always reports the address of where the
> -	 * mcount call was, but the stack will hold the return address.
> -	 */
> -	if (fentry)
> -		ip = parent_ip;
> -	else
> -		ip += MCOUNT_INSN_SIZE;
> +	ip += MCOUNT_INSN_SIZE;
>
>   	check_stack(ip, &stack);
>
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16  0:27                 ` AKASHI Takahiro
@ 2015-07-16  1:08                   ` AKASHI Takahiro
  2015-07-16  1:38                     ` Steven Rostedt
                                       ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-16  1:08 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/16/2015 09:27 AM, AKASHI Takahiro wrote:
> On 07/16/2015 01:13 AM, Steven Rostedt wrote:
>> On Wed, 15 Jul 2015 10:55:36 -0400
>> Steven Rostedt <rostedt@goodmis.org> wrote:
>>
>>
>>> I'll take a look at it and try to clean up the code.
>>
>> Does the  following patch make sense for you?
>
> Looks nice. The patch greatly simplifies changes on arm64 side.

As follows:

- Takahiro AKASHI

diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index c5534fa..868d6f1 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -15,6 +15,7 @@

  #define MCOUNT_ADDR		((unsigned long)_mcount)
  #define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
+#define FTRACE_STACK_FRAME_OFFSET 4 /* sync it up with stacktrace.c */

  #ifndef __ASSEMBLY__
  #include <linux/compat.h>
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 1da6029..2c1bf7d 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -260,6 +260,13 @@ static inline void ftrace_kill(void) { }
  #endif /* CONFIG_FUNCTION_TRACER */

  #ifdef CONFIG_STACK_TRACER
+/*
+ * the offset value to add to return address from save_stack_trace()
+ */
+#ifndef FTRACE_STACK_FRAME_OFFSET
+#define FTRACE_STACK_FRAME_OFFSET 0
+#endif
+
  extern int stack_tracer_enabled;
  int
  stack_trace_sysctl(struct ctl_table *table, int write,
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 9384647..c5b9748 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -105,7 +105,7 @@ check_stack(unsigned long ip, unsigned long *stack)

  	/* Skip over the overhead of the stack tracer itself */
  	for (i = 0; i < max_stack_trace.nr_entries; i++) {
-		if (stack_dump_trace[i] == ip)
+		if ((stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET) == ip)
  			break;
  	}

@@ -131,7 +131,8 @@ check_stack(unsigned long ip, unsigned long *stack)
  		p = start;

  		for (; p < top && i < max_stack_trace.nr_entries; p++) {
-			if (*p == stack_dump_trace[i]) {
+			if (*p == (stack_dump_trace[i]
+					+ FTRACE_STACK_FRAME_OFFSET)) {
  				stack_dump_trace[x] = stack_dump_trace[i++];
  				this_size = stack_dump_index[x++] =
  					(top - p) * sizeof(unsigned long);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16  1:08                   ` AKASHI Takahiro
@ 2015-07-16  1:38                     ` Steven Rostedt
  2015-07-17 10:46                       ` Will Deacon
  2015-07-16 13:29                     ` Jungseok Lee
  2015-07-16 14:28                     ` Mark Rutland
  2 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-16  1:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 16 Jul 2015 10:08:03 +0900
AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:

> On 07/16/2015 09:27 AM, AKASHI Takahiro wrote:
> > On 07/16/2015 01:13 AM, Steven Rostedt wrote:
> >> On Wed, 15 Jul 2015 10:55:36 -0400
> >> Steven Rostedt <rostedt@goodmis.org> wrote:
> >>
> >>
> >>> I'll take a look at it and try to clean up the code.
> >>
> >> Does the  following patch make sense for you?
> >
> > Looks nice. The patch greatly simplifies changes on arm64 side.
> 
> As follows:
> 
> - Takahiro AKASHI
> 

How should we go forward with this? Want me to apply my change to my
tree and pull this patch in with it? I can set that up for a 4.3 release.

Then, if I can get an Acked-by from one of the arm64 maintainers, can I
have this go through my tree?

-- Steve


> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index c5534fa..868d6f1 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -15,6 +15,7 @@
> 
>   #define MCOUNT_ADDR		((unsigned long)_mcount)
>   #define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
> +#define FTRACE_STACK_FRAME_OFFSET 4 /* sync it up with stacktrace.c */
> 
>   #ifndef __ASSEMBLY__
>   #include <linux/compat.h>
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 1da6029..2c1bf7d 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -260,6 +260,13 @@ static inline void ftrace_kill(void) { }
>   #endif /* CONFIG_FUNCTION_TRACER */
> 
>   #ifdef CONFIG_STACK_TRACER
> +/*
> + * the offset value to add to return address from save_stack_trace()
> + */
> +#ifndef FTRACE_STACK_FRAME_OFFSET
> +#define FTRACE_STACK_FRAME_OFFSET 0
> +#endif
> +
>   extern int stack_tracer_enabled;
>   int
>   stack_trace_sysctl(struct ctl_table *table, int write,
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 9384647..c5b9748 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -105,7 +105,7 @@ check_stack(unsigned long ip, unsigned long *stack)
> 
>   	/* Skip over the overhead of the stack tracer itself */
>   	for (i = 0; i < max_stack_trace.nr_entries; i++) {
> -		if (stack_dump_trace[i] == ip)
> +		if ((stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET) == ip)
>   			break;
>   	}
> 
> @@ -131,7 +131,8 @@ check_stack(unsigned long ip, unsigned long *stack)
>   		p = start;
> 
>   		for (; p < top && i < max_stack_trace.nr_entries; p++) {
> -			if (*p == stack_dump_trace[i]) {
> +			if (*p == (stack_dump_trace[i]
> +					+ FTRACE_STACK_FRAME_OFFSET)) {
>   				stack_dump_trace[x] = stack_dump_trace[i++];
>   				this_size = stack_dump_index[x++] =
>   					(top - p) * sizeof(unsigned long);

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16  1:08                   ` AKASHI Takahiro
  2015-07-16  1:38                     ` Steven Rostedt
@ 2015-07-16 13:29                     ` Jungseok Lee
  2015-07-16 13:54                       ` Jungseok Lee
                                         ` (2 more replies)
  2015-07-16 14:28                     ` Mark Rutland
  2 siblings, 3 replies; 53+ messages in thread
From: Jungseok Lee @ 2015-07-16 13:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Jul 16, 2015, at 10:08 AM, AKASHI Takahiro wrote:

Hi, AKASHI

> On 07/16/2015 09:27 AM, AKASHI Takahiro wrote:
>> On 07/16/2015 01:13 AM, Steven Rostedt wrote:
>>> On Wed, 15 Jul 2015 10:55:36 -0400
>>> Steven Rostedt <rostedt@goodmis.org> wrote:
>>> 
>>> 
>>>> I'll take a look at it and try to clean up the code.
>>> 
>>> Does the following patch make sense to you?
>> 
>> Looks nice. The patch greatly simplifies changes on arm64 side.
> 
> As follows:
> 
> - Takahiro AKASHI
> 
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index c5534fa..868d6f1 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -15,6 +15,7 @@
> 
> #define MCOUNT_ADDR		((unsigned long)_mcount)
> #define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
> +#define FTRACE_STACK_FRAME_OFFSET 4 /* sync it up with stacktrace.c */

How about binding it to the -4 adjustment in the unwind_frame function?
IMHO, it would help other developers who change the stack trace code stay
aware of this stack tracer dependency (see the sketch after the quoted
patch below).

> #ifndef __ASSEMBLY__
> #include <linux/compat.h>
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 1da6029..2c1bf7d 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -260,6 +260,13 @@ static inline void ftrace_kill(void) { }
> #endif /* CONFIG_FUNCTION_TRACER */
> 
> #ifdef CONFIG_STACK_TRACER
> +/*
> + * the offset value to add to return address from save_stack_trace()
> + */
> +#ifndef FTRACE_STACK_FRAME_OFFSET
> +#define FTRACE_STACK_FRAME_OFFSET 0
> +#endif
> +
> extern int stack_tracer_enabled;
> int
> stack_trace_sysctl(struct ctl_table *table, int write,
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 9384647..c5b9748 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -105,7 +105,7 @@ check_stack(unsigned long ip, unsigned long *stack)
> 
> 	/* Skip over the overhead of the stack tracer itself */
> 	for (i = 0; i < max_stack_trace.nr_entries; i++) {
> -		if (stack_dump_trace[i] == ip)
> +		if ((stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET) == ip)
> 			break;
> 	}
> 
> @@ -131,7 +131,8 @@ check_stack(unsigned long ip, unsigned long *stack)
> 		p = start;
> 
> 		for (; p < top && i < max_stack_trace.nr_entries; p++) {
> -			if (*p == stack_dump_trace[i]) {
> +			if (*p == (stack_dump_trace[i]
> +					+ FTRACE_STACK_FRAME_OFFSET)) {
> 				stack_dump_trace[x] = stack_dump_trace[i++];
> 				this_size = stack_dump_index[x++] =
> 					(top - p) * sizeof(unsigned long);
> -- 
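
To make the suggestion above concrete, here is a minimal sketch of how the
-4 adjustment could live in unwind_frame() itself. This is not the in-tree
arm64 code; it assumes the AAPCS64 frame record layout where fp points at a
saved {fp, lr} pair, and AARCH64_INSN_SIZE is redefined here only to keep
the sketch self-contained:

#define AARCH64_INSN_SIZE	4

struct stackframe {
	unsigned long fp;
	unsigned long sp;
	unsigned long pc;
};

/* Sketch only: simplified arm64-style unwinder (bounds/alignment checks
 * omitted) with the adjustment made explicit where the return address is
 * read. */
static int unwind_frame_sketch(struct stackframe *frame)
{
	unsigned long fp = frame->fp;

	frame->sp = fp + 0x10;
	frame->fp = *(unsigned long *)fp;
	/*
	 * The word on the stack is the saved lr (return address); reporting
	 * the call site (lr - 4) is what FTRACE_STACK_FRAME_OFFSET then has
	 * to add back when check_stack() compares entries against raw stack
	 * words.
	 */
	frame->pc = *(unsigned long *)(fp + 8) - AARCH64_INSN_SIZE;

	return 0;
}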

I've prepared a kernel with the following patches and reviewed them.

1) Steve's clean up patch 
2) This patch
3) [RFC 2/3]

AFAIU, [RFC 3/3] is not needed any more thanks to Steve's patch.

First of all, let's look at the following data.

1) stack_trace data
        Depth    Size   Location    (55 entries)
        -----    ----   --------
  0)     4808      16   notifier_call_chain+0x2c/0x94
  1)     4792      64   raw_notifier_call_chain+0x34/0x44
  2)     4728      48   timekeeping_update.constprop.9+0xb8/0x114
  3)     4680      48   update_wall_time+0x408/0x6dc
  4)     4632     128   tick_do_update_jiffies64+0xd8/0x154
  5)     4504      80   tick_sched_do_timer+0x50/0x60
  6)     4424      32   tick_sched_timer+0x34/0x90
  7)     4392      48   __run_hrtimer+0x60/0x258
  8)     4344      64   hrtimer_interrupt+0xe8/0x260
  9)     4280     128   arch_timer_handler_virt+0x38/0x48
 10)     4152      32   handle_percpu_devid_irq+0x84/0x188
 11)     4120      64   generic_handle_irq+0x38/0x54
 12)     4056      32   __handle_domain_irq+0x68/0xbc
 13)     4024      64   gic_handle_irq+0x38/0x88
 14)     3960     336   el1_irq+0x64/0xd8
 15)     3624      16   netif_rx_internal+0x14/0x198
 16)     3608      64   netif_rx+0x20/0xa4
 17)     3544      32   loopback_xmit+0x64/0xf4
 18)     3512      48   dev_hard_start_xmit+0x25c/0x3f8
 19)     3464     160   __dev_queue_xmit+0x440/0x4dc
 20)     3304      96   dev_queue_xmit_sk+0x20/0x30
 21)     3208      32   ip_finish_output+0x1e0/0xabc
 22)     3176      96   ip_output+0xf0/0x120
 23)     3080      64   ip_local_out_sk+0x44/0x54
 24)     3016      32   ip_send_skb+0x24/0xbc
 25)     2984      48   ip_push_pending_frames+0x40/0x60
 26)     2936      64   icmp_push_reply+0x104/0x144
 27)     2872      96   icmp_send+0x3c0/0x3c8
 28)     2776     192   __udp4_lib_rcv+0x5b8/0x684
 29)     2584      96   udp_rcv+0x2c/0x3c
 30)     2488      32   ip_local_deliver+0xa0/0x224
 31)     2456      48   ip_rcv+0x360/0x57c
 32)     2408      64   __netif_receive_skb_core+0x4d0/0x80c
 33)     2344     128   __netif_receive_skb+0x24/0x84
 34)     2216      32   process_backlog+0x9c/0x15c
 35)     2184      80   net_rx_action+0x1ec/0x32c
 36)     2104     160   __do_softirq+0x114/0x2f0
 37)     1944     128   do_softirq+0x60/0x68
 38)     1816      32   __local_bh_enable_ip+0xb0/0xd4
 39)     1784      32   ip_finish_output+0x1f4/0xabc
 40)     1752      96   ip_output+0xf0/0x120
 41)     1656      64   ip_local_out_sk+0x44/0x54
 42)     1592      32   ip_send_skb+0x24/0xbc
 43)     1560      48   udp_send_skb+0x1b4/0x2f4
 44)     1512      80   udp_sendmsg+0x2a8/0x7a0
 45)     1432     272   inet_sendmsg+0xa0/0xd0
 46)     1160      48   sock_sendmsg+0x30/0x78
 47)     1112      32   ___sys_sendmsg+0x15c/0x26c
 48)     1080     400   __sys_sendmmsg+0x94/0x180
 49)      680     320   SyS_sendmmsg+0x38/0x54
 50)      360     360   el0_svc_naked+0x20/0x28

2) stack_max_size data
4888

The data looks odd in two respects.
1) the number of entries
There is a mismatch between the entry count in the header and the actual
number of rows.

2) 80-byte gap
stack_max_size does not match the "Depth" field of the first entry of
stack_trace.

IMHO, two items are not yet considered in this series; digging into them:

1) skipped entries
Since the x variable is introduced in Steve's patch, we need to track how many
entries are actually recorded in both stack_dump_trace and stack_dump_index
(see the sketch below, after item 2).

2) max_stack_trace.skip value
max_stack_trace.skip is 0 after applying Steve's patch. The above gap is
observed unless that value is accounted for in the arch code. In the above
example, the 80-byte gap corresponds to the save_stack_trace function in
arch/arm64/kernel/stacktrace.c.
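
To make the first item concrete, here is a small self-contained simulation of
the i/x bookkeeping (toy user-space code, not the kernel implementation; the
"addresses" are invented and the matching against raw stack words is left
out):

#include <stdio.h>
#include <limits.h>

int main(void)
{
	/* entries 0-3 stand for the stack tracer's own frames */
	unsigned long trace[] = { 1, 2, 3, 4, 100, 200, 300, 400 };
	int nr_entries = 8;
	unsigned long ip = 100;	/* adjusted ip passed to check_stack() */
	int i, x = 0;

	/* skip over the overhead of the stack tracer itself */
	for (i = 0; i < nr_entries; i++)
		if (trace[i] == ip)
			break;

	/* compact the remaining entries down to the front of the array */
	while (i < nr_entries)
		trace[x++] = trace[i++];

	/* the fix below: do not report the skipped entries */
	nr_entries -= (i - x);
	for (; x < i; x++)
		trace[x] = ULONG_MAX;

	printf("%d entries reported\n", nr_entries);	/* prints 4, not 8 */
	return 0;
}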

After applying the following fix, stack_trace and stack_max_size look okay.
However, I'm not sure which code, arch or generic ftrace, should handle
trace->skip. Under the current implementation (that is, before Steve's
change) the generic ftrace code is responsible for it.

Please correct me if I am wrong.

Best Regards
Jungseok Lee

----8<----

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 978c923..5f6f4df 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -137,7 +137,9 @@ void save_stack_trace(struct stack_trace *trace)
 
 	frame.fp = (unsigned long)__builtin_frame_address(0);
 	frame.sp = current_stack_pointer;
-	frame.pc = (unsigned long)save_stack_trace_tsk;
+	frame.pc = (unsigned long)save_stack_trace;
+
+	trace->skip += 4;
 
 	__save_stack_trace(&frame, trace, 0);
 }
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index c5b9748..5635683 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -158,6 +158,9 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
+	/* We do not want to include skipped entries */
+	max_stack_trace.nr_entries -= (i - x);
+
 	for (; x < max_stack_trace.nr_entries; x++)
 		stack_dump_trace[x] = ULONG_MAX;
 
@@ -330,7 +333,7 @@ static int t_show(struct seq_file *m, void *v)
 		seq_printf(m, "        Depth    Size   Location"
 			   "    (%d entries)\n"
 			   "        -----    ----   --------\n",
-			   max_stack_trace.nr_entries - 1);
+			   max_stack_trace.nr_entries);
 
 		if (!stack_tracer_enabled && !max_stack_size)
 			print_disabled(m);

----8<----

1) stack_trace data
        Depth    Size   Location    (47 entries)
        -----    ----   --------
  0)     4648      16   notifier_call_chain+0x2c/0x94
  1)     4632      64   raw_notifier_call_chain+0x34/0x44
  2)     4568      48   timekeeping_update.constprop.9+0xb8/0x114
  3)     4520      48   update_wall_time+0x408/0x6dc
  4)     4472     128   tick_do_update_jiffies64+0xd8/0x154
  5)     4344      80   tick_sched_do_timer+0x50/0x60
  6)     4264      32   tick_sched_timer+0x34/0x90
  7)     4232      48   __run_hrtimer+0x60/0x258
  8)     4184      64   hrtimer_interrupt+0xe8/0x260
  9)     4120     128   arch_timer_handler_virt+0x38/0x48
 10)     3992      32   handle_percpu_devid_irq+0x84/0x188
 11)     3960      64   generic_handle_irq+0x38/0x54
 12)     3896      32   __handle_domain_irq+0x68/0xbc
 13)     3864      64   gic_handle_irq+0x38/0x88
 14)     3800     336   el1_irq+0x64/0xd8
 15)     3464      48   __kmalloc_track_caller+0x40/0x2f8
 16)     3416      16   __kmalloc_reserve.isra.46+0x44/0x9c
 17)     3400     144   __alloc_skb+0x70/0x180
 18)     3256      96   alloc_skb_with_frags+0x74/0x234
 19)     3160     112   sock_alloc_send_pskb+0x1d0/0x294
 20)     3048     160   sock_alloc_send_skb+0x44/0x54
 21)     2888      64   __ip_append_data.isra.40+0x78c/0xb48
 22)     2824     224   ip_append_data.part.42+0x98/0xe8
 23)     2600     112   ip_append_data+0x68/0x7c
 24)     2488      96   icmp_push_reply+0x7c/0x144
 25)     2392      96   icmp_send+0x3c0/0x3c8
 26)     2296     192   __udp4_lib_rcv+0x5b8/0x684
 27)     2104      96   udp_rcv+0x2c/0x3c
 28)     2008      32   ip_local_deliver+0xa0/0x224
 29)     1976      48   ip_rcv+0x360/0x57c
 30)     1928      64   __netif_receive_skb_core+0x4d0/0x80c
 31)     1864     128   __netif_receive_skb+0x24/0x84
 32)     1736      32   process_backlog+0x9c/0x15c
 33)     1704      80   net_rx_action+0x1ec/0x32c
 34)     1624     160   __do_softirq+0x114/0x2f0
 35)     1464     128   do_softirq+0x60/0x68
 36)     1336      32   __local_bh_enable_ip+0xb0/0xd4
 37)     1304      32   ip_finish_output+0x1f4/0xabc
 38)     1272      96   ip_output+0xf0/0x120
 39)     1176      64   ip_local_out_sk+0x44/0x54
 40)     1112      32   ip_send_skb+0x24/0xbc
 41)     1080      48   udp_send_skb+0x1b4/0x2f4
 42)     1032      80   udp_sendmsg+0x2a8/0x7a0
 43)      952     272   inet_sendmsg+0xa0/0xd0
 44)      680      48   sock_sendmsg+0x30/0x78
 45)      632      32   SyS_sendto+0xc4/0x108
 46)      600     600   el0_svc_naked+0x20/0x28

2) stack_max_size data
4648

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 13:29                     ` Jungseok Lee
@ 2015-07-16 13:54                       ` Jungseok Lee
  2015-07-16 14:24                       ` Steven Rostedt
  2015-07-17  2:04                       ` AKASHI Takahiro
  2 siblings, 0 replies; 53+ messages in thread
From: Jungseok Lee @ 2015-07-16 13:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Jul 16, 2015, at 10:29 PM, Jungseok Lee wrote:
> On Jul 16, 2015, at 10:08 AM, AKASHI Takahiro wrote:
> 
> Hi, AKASHI
> 
>> On 07/16/2015 09:27 AM, AKASHI Takahiro wrote:
>>> On 07/16/2015 01:13 AM, Steven Rostedt wrote:
>>>> On Wed, 15 Jul 2015 10:55:36 -0400
>>>> Steven Rostedt <rostedt@goodmis.org> wrote:
>>>> 
>>>> 
>>>>> I'll take a look at it and try to clean up the code.
>>>> 
>>>> Does the following patch make sense to you?
>>> 
>>> Looks nice. The patch greatly simplifies changes on arm64 side.
>> 
>> As follows:
>> 
>> - Takahiro AKASHI
>> 
>> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
>> index c5534fa..868d6f1 100644
>> --- a/arch/arm64/include/asm/ftrace.h
>> +++ b/arch/arm64/include/asm/ftrace.h
>> @@ -15,6 +15,7 @@
>> 
>> #define MCOUNT_ADDR		((unsigned long)_mcount)
>> #define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
>> +#define FTRACE_STACK_FRAME_OFFSET 4 /* sync it up with stacktrace.c */
> 
> How about binding it to -4 in unwind_frame function?
> IMHO, it would help other developers trying to change stack trace code
> be aware of this stack tracer feature.
> 
>> #ifndef __ASSEMBLY__
>> #include <linux/compat.h>
>> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
>> index 1da6029..2c1bf7d 100644
>> --- a/include/linux/ftrace.h
>> +++ b/include/linux/ftrace.h
>> @@ -260,6 +260,13 @@ static inline void ftrace_kill(void) { }
>> #endif /* CONFIG_FUNCTION_TRACER */
>> 
>> #ifdef CONFIG_STACK_TRACER
>> +/*
>> + * the offset value to add to return address from save_stack_trace()
>> + */
>> +#ifndef FTRACE_STACK_FRAME_OFFSET
>> +#define FTRACE_STACK_FRAME_OFFSET 0
>> +#endif
>> +
>> extern int stack_tracer_enabled;
>> int
>> stack_trace_sysctl(struct ctl_table *table, int write,
>> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
>> index 9384647..c5b9748 100644
>> --- a/kernel/trace/trace_stack.c
>> +++ b/kernel/trace/trace_stack.c
>> @@ -105,7 +105,7 @@ check_stack(unsigned long ip, unsigned long *stack)
>> 
>> 	/* Skip over the overhead of the stack tracer itself */
>> 	for (i = 0; i < max_stack_trace.nr_entries; i++) {
>> -		if (stack_dump_trace[i] == ip)
>> +		if ((stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET) == ip)
>> 			break;
>> 	}
>> 
>> @@ -131,7 +131,8 @@ check_stack(unsigned long ip, unsigned long *stack)
>> 		p = start;
>> 
>> 		for (; p < top && i < max_stack_trace.nr_entries; p++) {
>> -			if (*p == stack_dump_trace[i]) {
>> +			if (*p == (stack_dump_trace[i]
>> +					+ FTRACE_STACK_FRAME_OFFSET)) {
>> 				stack_dump_trace[x] = stack_dump_trace[i++];
>> 				this_size = stack_dump_index[x++] =
>> 					(top - p) * sizeof(unsigned long);
>> -- 
> 
> I've prepared a kernel with the following patches and reviewed them.
> 
> 1) Steve's clean up patch 
> 2) This patch
> 3) [RFC 2/3]
> 
> AFAIU, [RFC 3/3] is not needed any more thanks to Steve's patch.
> 
> First of all, let's look at the following data.
> 
> 1) stack_trace data
>        Depth    Size   Location    (55 entries)
>        -----    ----   --------
>  0)     4808      16   notifier_call_chain+0x2c/0x94
>  1)     4792      64   raw_notifier_call_chain+0x34/0x44
>  2)     4728      48   timekeeping_update.constprop.9+0xb8/0x114
>  3)     4680      48   update_wall_time+0x408/0x6dc
>  4)     4632     128   tick_do_update_jiffies64+0xd8/0x154
>  5)     4504      80   tick_sched_do_timer+0x50/0x60
>  6)     4424      32   tick_sched_timer+0x34/0x90
>  7)     4392      48   __run_hrtimer+0x60/0x258
>  8)     4344      64   hrtimer_interrupt+0xe8/0x260
>  9)     4280     128   arch_timer_handler_virt+0x38/0x48
> 10)     4152      32   handle_percpu_devid_irq+0x84/0x188
> 11)     4120      64   generic_handle_irq+0x38/0x54
> 12)     4056      32   __handle_domain_irq+0x68/0xbc
> 13)     4024      64   gic_handle_irq+0x38/0x88
> 14)     3960     336   el1_irq+0x64/0xd8
> 15)     3624      16   netif_rx_internal+0x14/0x198
> 16)     3608      64   netif_rx+0x20/0xa4
> 17)     3544      32   loopback_xmit+0x64/0xf4
> 18)     3512      48   dev_hard_start_xmit+0x25c/0x3f8
> 19)     3464     160   __dev_queue_xmit+0x440/0x4dc
> 20)     3304      96   dev_queue_xmit_sk+0x20/0x30
> 21)     3208      32   ip_finish_output+0x1e0/0xabc
> 22)     3176      96   ip_output+0xf0/0x120
> 23)     3080      64   ip_local_out_sk+0x44/0x54
> 24)     3016      32   ip_send_skb+0x24/0xbc
> 25)     2984      48   ip_push_pending_frames+0x40/0x60
> 26)     2936      64   icmp_push_reply+0x104/0x144
> 27)     2872      96   icmp_send+0x3c0/0x3c8
> 28)     2776     192   __udp4_lib_rcv+0x5b8/0x684
> 29)     2584      96   udp_rcv+0x2c/0x3c
> 30)     2488      32   ip_local_deliver+0xa0/0x224
> 31)     2456      48   ip_rcv+0x360/0x57c
> 32)     2408      64   __netif_receive_skb_core+0x4d0/0x80c
> 33)     2344     128   __netif_receive_skb+0x24/0x84
> 34)     2216      32   process_backlog+0x9c/0x15c
> 35)     2184      80   net_rx_action+0x1ec/0x32c
> 36)     2104     160   __do_softirq+0x114/0x2f0
> 37)     1944     128   do_softirq+0x60/0x68
> 38)     1816      32   __local_bh_enable_ip+0xb0/0xd4
> 39)     1784      32   ip_finish_output+0x1f4/0xabc
> 40)     1752      96   ip_output+0xf0/0x120
> 41)     1656      64   ip_local_out_sk+0x44/0x54
> 42)     1592      32   ip_send_skb+0x24/0xbc
> 43)     1560      48   udp_send_skb+0x1b4/0x2f4
> 44)     1512      80   udp_sendmsg+0x2a8/0x7a0
> 45)     1432     272   inet_sendmsg+0xa0/0xd0
> 46)     1160      48   sock_sendmsg+0x30/0x78
> 47)     1112      32   ___sys_sendmsg+0x15c/0x26c
> 48)     1080     400   __sys_sendmmsg+0x94/0x180
> 49)      680     320   SyS_sendmmsg+0x38/0x54
> 50)      360     360   el0_svc_naked+0x20/0x28
> 
> 2) stack_max_size data
> 4888
> 
> The data looks odd in two points.
> 1) the number of entry
> There is a mismatch between start token and real data
> 
> 2) 80-byte gap
> stack_max_size is not aligned with "Depth" field of the first entry of stack_trace.
> 
> IMHO, two items are not considered in this series as digging them out.
> 
> 1) skipped entries
> As x variable is introduced in Steve's patch, it is needed to track down
> how many entries are recorded in both stack_dump_trace and stack_dump_index.
> 
> 2) max_stack_trace.skip value
> max_stack_trace.skip is 0 as applying Steve's patch. The above gap could be
> observed unless the value is not considered in arch code. In the above example,
> 80-byte gap is save_stack_trace function in arch/arm64/kernel/stacktrace.c.
> 
> As applying the following fix, stack_trace and stack_max_size are okay.
> However, I'm not sure which code, arch or generic ftrace, should handle trace->skip.
> The latter one is responsible for it under current implementation, not Steve's change.
> 
> Please correct me if I am wrong.
> 
> Best Regards
> Jungseok Lee
> 
> ----8<----
> 
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index 978c923..5f6f4df 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -137,7 +137,9 @@ void save_stack_trace(struct stack_trace *trace)
> 
> 	frame.fp = (unsigned long)__builtin_frame_address(0);
> 	frame.sp = current_stack_pointer;
> -	frame.pc = (unsigned long)save_stack_trace_tsk;
> +	frame.pc = (unsigned long)save_stack_trace;
> +
> +	trace->skip += 4;
> 
> 	__save_stack_trace(&frame, trace, 0);
> }
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index c5b9748..5635683 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -158,6 +158,9 @@ check_stack(unsigned long ip, unsigned long *stack)
> 			i++;
> 	}
> 
> +	/* We do not want to include skipped entries */
> +	max_stack_trace.nr_entries -= (i - x);
> +
> 	for (; x < max_stack_trace.nr_entries; x++)

This line should be changed as follows:

for (; x < i; x++)

Sorry for the confusion.
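
Putting that together with the diff in my previous mail, the tail of
check_stack() would read (shown here for clarity, not as a new patch):

	/* We do not want to include skipped entries */
	max_stack_trace.nr_entries -= (i - x);

	for (; x < i; x++)
		stack_dump_trace[x] = ULONG_MAX;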

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 13:29                     ` Jungseok Lee
  2015-07-16 13:54                       ` Jungseok Lee
@ 2015-07-16 14:24                       ` Steven Rostedt
  2015-07-16 15:01                         ` Jungseok Lee
  2015-07-17  2:04                       ` AKASHI Takahiro
  2 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-16 14:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 16 Jul 2015 22:29:05 +0900
Jungseok Lee <jungseoklee85@gmail.com> wrote:


> First of all, let's look at the following data.
> 
> 1) stack_trace data
>         Depth    Size   Location    (55 entries)
>         -----    ----   --------
>   0)     4808      16   notifier_call_chain+0x2c/0x94
>   1)     4792      64   raw_notifier_call_chain+0x34/0x44
>   2)     4728      48   timekeeping_update.constprop.9+0xb8/0x114
>   3)     4680      48   update_wall_time+0x408/0x6dc
>   4)     4632     128   tick_do_update_jiffies64+0xd8/0x154
>   5)     4504      80   tick_sched_do_timer+0x50/0x60
>   6)     4424      32   tick_sched_timer+0x34/0x90
>   7)     4392      48   __run_hrtimer+0x60/0x258
>   8)     4344      64   hrtimer_interrupt+0xe8/0x260
>   9)     4280     128   arch_timer_handler_virt+0x38/0x48
>  10)     4152      32   handle_percpu_devid_irq+0x84/0x188
>  11)     4120      64   generic_handle_irq+0x38/0x54
>  12)     4056      32   __handle_domain_irq+0x68/0xbc
>  13)     4024      64   gic_handle_irq+0x38/0x88
>  14)     3960     336   el1_irq+0x64/0xd8
>  15)     3624      16   netif_rx_internal+0x14/0x198
>  16)     3608      64   netif_rx+0x20/0xa4
>  17)     3544      32   loopback_xmit+0x64/0xf4
>  18)     3512      48   dev_hard_start_xmit+0x25c/0x3f8
>  19)     3464     160   __dev_queue_xmit+0x440/0x4dc
>  20)     3304      96   dev_queue_xmit_sk+0x20/0x30
>  21)     3208      32   ip_finish_output+0x1e0/0xabc
>  22)     3176      96   ip_output+0xf0/0x120
>  23)     3080      64   ip_local_out_sk+0x44/0x54
>  24)     3016      32   ip_send_skb+0x24/0xbc
>  25)     2984      48   ip_push_pending_frames+0x40/0x60
>  26)     2936      64   icmp_push_reply+0x104/0x144
>  27)     2872      96   icmp_send+0x3c0/0x3c8
>  28)     2776     192   __udp4_lib_rcv+0x5b8/0x684
>  29)     2584      96   udp_rcv+0x2c/0x3c
>  30)     2488      32   ip_local_deliver+0xa0/0x224
>  31)     2456      48   ip_rcv+0x360/0x57c
>  32)     2408      64   __netif_receive_skb_core+0x4d0/0x80c
>  33)     2344     128   __netif_receive_skb+0x24/0x84
>  34)     2216      32   process_backlog+0x9c/0x15c
>  35)     2184      80   net_rx_action+0x1ec/0x32c
>  36)     2104     160   __do_softirq+0x114/0x2f0
>  37)     1944     128   do_softirq+0x60/0x68
>  38)     1816      32   __local_bh_enable_ip+0xb0/0xd4
>  39)     1784      32   ip_finish_output+0x1f4/0xabc
>  40)     1752      96   ip_output+0xf0/0x120
>  41)     1656      64   ip_local_out_sk+0x44/0x54
>  42)     1592      32   ip_send_skb+0x24/0xbc
>  43)     1560      48   udp_send_skb+0x1b4/0x2f4
>  44)     1512      80   udp_sendmsg+0x2a8/0x7a0
>  45)     1432     272   inet_sendmsg+0xa0/0xd0
>  46)     1160      48   sock_sendmsg+0x30/0x78
>  47)     1112      32   ___sys_sendmsg+0x15c/0x26c
>  48)     1080     400   __sys_sendmmsg+0x94/0x180
>  49)      680     320   SyS_sendmmsg+0x38/0x54
>  50)      360     360   el0_svc_naked+0x20/0x28
> 
> 2) stack_max_size data
> 4888
> 
> The data looks odd in two points.
> 1) the number of entry
> There is a mismatch between start token and real data

Yep, good catch. As soon as I read that, I realized exactly what the
issue was ;-)

> 
> 2) 80-byte gap
> stack_max_size is not aligned with "Depth" field of the first entry of stack_trace.
> 
> IMHO, two items are not considered in this series as digging them out.
> 
> 1) skipped entries
> As x variable is introduced in Steve's patch, it is needed to track down
> how many entries are recorded in both stack_dump_trace and stack_dump_index.

Yep.

> 
> 2) max_stack_trace.skip value
> max_stack_trace.skip is 0 as applying Steve's patch. The above gap could be
> observed unless the value is not considered in arch code. In the above example,
> 80-byte gap is save_stack_trace function in arch/arm64/kernel/stacktrace.c.
> 
> As applying the following fix, stack_trace and stack_max_size are okay.
> However, I'm not sure which code, arch or generic ftrace, should handle trace->skip.
> The latter one is responsible for it under current implementation, not Steve's change.
> 
> Please correct me if I am wrong.

No, it's a bug in my patch. I'll make an update.

Does this new patch fix it for you?

-- Steve

---
 kernel/trace/trace_stack.c |   58 +++++++++++++--------------------------------
 1 file changed, 17 insertions(+), 41 deletions(-)

Index: linux-trace.git/kernel/trace/trace_stack.c
===================================================================
--- linux-trace.git.orig/kernel/trace/trace_stack.c	2015-07-15 12:49:59.290063597 -0400
+++ linux-trace.git/kernel/trace/trace_stack.c	2015-07-16 09:55:37.936188197 -0400
@@ -18,12 +18,6 @@
 
 #define STACK_TRACE_ENTRIES 500
 
-#ifdef CC_USING_FENTRY
-# define fentry		1
-#else
-# define fentry		0
-#endif
-
 static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES+1] =
 	 { [0 ... (STACK_TRACE_ENTRIES)] = ULONG_MAX };
 static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
@@ -35,7 +29,7 @@ static unsigned stack_dump_index[STACK_T
  */
 static struct stack_trace max_stack_trace = {
 	.max_entries		= STACK_TRACE_ENTRIES - 1,
-	.entries		= &stack_dump_trace[1],
+	.entries		= &stack_dump_trace[0],
 };
 
 static unsigned long max_stack_size;
@@ -77,7 +71,7 @@ check_stack(unsigned long ip, unsigned l
 	unsigned long this_size, flags; unsigned long *p, *top, *start;
 	static int tracer_frame;
 	int frame_size = ACCESS_ONCE(tracer_frame);
-	int i;
+	int i, x;
 
 	this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
 	this_size = THREAD_SIZE - this_size;
@@ -105,26 +99,20 @@ check_stack(unsigned long ip, unsigned l
 	max_stack_size = this_size;
 
 	max_stack_trace.nr_entries = 0;
-
-	if (using_ftrace_ops_list_func())
-		max_stack_trace.skip = 4;
-	else
-		max_stack_trace.skip = 3;
+	max_stack_trace.skip = 3;
 
 	save_stack_trace(&max_stack_trace);
 
-	/*
-	 * Add the passed in ip from the function tracer.
-	 * Searching for this on the stack will skip over
-	 * most of the overhead from the stack tracer itself.
-	 */
-	stack_dump_trace[0] = ip;
-	max_stack_trace.nr_entries++;
+	/* Skip over the overhead of the stack tracer itself */
+	for (i = 0; i < max_stack_trace.nr_entries; i++) {
+		if (stack_dump_trace[i] == ip)
+			break;
+	}
 
 	/*
 	 * Now find where in the stack these are.
 	 */
-	i = 0;
+	x = 0;
 	start = stack;
 	top = (unsigned long *)
 		(((unsigned long)start & ~(THREAD_SIZE-1)) + THREAD_SIZE);
@@ -139,12 +127,13 @@ check_stack(unsigned long ip, unsigned l
 	while (i < max_stack_trace.nr_entries) {
 		int found = 0;
 
-		stack_dump_index[i] = this_size;
+		stack_dump_index[x] = this_size;
 		p = start;
 
 		for (; p < top && i < max_stack_trace.nr_entries; p++) {
 			if (*p == stack_dump_trace[i]) {
-				this_size = stack_dump_index[i++] =
+				stack_dump_trace[x] = stack_dump_trace[i++];
+				this_size = stack_dump_index[x++] =
 					(top - p) * sizeof(unsigned long);
 				found = 1;
 				/* Start the search from here */
@@ -168,6 +157,10 @@ check_stack(unsigned long ip, unsigned l
 			i++;
 	}
 
+	max_stack_trace.nr_entries = x;
+	for (; x < i; x++)
+		stack_dump_trace[x] = ULONG_MAX;
+
 	if (task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
@@ -192,24 +185,7 @@ stack_trace_call(unsigned long ip, unsig
 	if (per_cpu(trace_active, cpu)++ != 0)
 		goto out;
 
-	/*
-	 * When fentry is used, the traced function does not get
-	 * its stack frame set up, and we lose the parent.
-	 * The ip is pretty useless because the function tracer
-	 * was called before that function set up its stack frame.
-	 * In this case, we use the parent ip.
-	 *
-	 * By adding the return address of either the parent ip
-	 * or the current ip we can disregard most of the stack usage
-	 * caused by the stack tracer itself.
-	 *
-	 * The function tracer always reports the address of where the
-	 * mcount call was, but the stack will hold the return address.
-	 */
-	if (fentry)
-		ip = parent_ip;
-	else
-		ip += MCOUNT_INSN_SIZE;
+	ip += MCOUNT_INSN_SIZE;
 
 	check_stack(ip, &stack);
 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16  1:08                   ` AKASHI Takahiro
  2015-07-16  1:38                     ` Steven Rostedt
  2015-07-16 13:29                     ` Jungseok Lee
@ 2015-07-16 14:28                     ` Mark Rutland
  2015-07-16 14:34                       ` Steven Rostedt
  2 siblings, 1 reply; 53+ messages in thread
From: Mark Rutland @ 2015-07-16 14:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 16, 2015 at 02:08:03AM +0100, AKASHI Takahiro wrote:
> On 07/16/2015 09:27 AM, AKASHI Takahiro wrote:
> > On 07/16/2015 01:13 AM, Steven Rostedt wrote:
> >> On Wed, 15 Jul 2015 10:55:36 -0400
> >> Steven Rostedt <rostedt@goodmis.org> wrote:
> >>
> >>
> >>> I'll take a look at it and try to clean up the code.
> >>
> >> Does the following patch make sense to you?
> >
> > Looks nice. The patch greatly simplifies changes on arm64 side.
> 
> As follows:
> 
> - Takahiro AKASHI
> 
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index c5534fa..868d6f1 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -15,6 +15,7 @@
> 
>   #define MCOUNT_ADDR		((unsigned long)_mcount)
>   #define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
> +#define FTRACE_STACK_FRAME_OFFSET 4 /* sync it up with stacktrace.c */

Is there any reason we couldn't have the arch code dump the stack depth
for each function when it walks the stack to generate the stack trace?

That means we could provide a more precise result (because we know the
layout of our own stackframes), and we would only need to walk the stack
once to do so.

The downside is that we need a new function per-arch to do so.
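
Something along these lines, perhaps; arch_save_stack_sizes() and struct
stack_trace_sizes below are made-up names for illustration (not an existing
interface), and the sketch reuses arm64's existing struct stackframe and
unwind_frame():

/* Hypothetical per-arch helper (sketch only): walk the frame-pointer chain
 * once, recording both the return address and the stack used by each frame,
 * instead of rediscovering the sizes in check_stack() by scanning raw stack
 * words. */
struct stack_trace_sizes {
	unsigned int	nr_entries;
	unsigned int	max_entries;
	unsigned long	*entries;	/* return addresses            */
	unsigned long	*sizes;		/* bytes used by each function */
};

static void arch_save_stack_sizes(struct stackframe *frame,
				  struct stack_trace_sizes *trace,
				  unsigned long stack_top)
{
	while (trace->nr_entries < trace->max_entries) {
		unsigned long fp = frame->fp;
		unsigned int n = trace->nr_entries++;

		trace->entries[n] = frame->pc;

		if (unwind_frame(frame) < 0) {
			/* outermost frame: charge it everything up to the top */
			trace->sizes[n] = stack_top - fp;
			break;
		}
		/*
		 * Frame size as the gap between successive frame pointers;
		 * how that gap splits between caller and callee is exactly
		 * the inaccuracy discussed earlier in the thread.
		 */
		trace->sizes[n] = frame->fp - fp;
	}
}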

Mark.

> 
>   #ifndef __ASSEMBLY__
>   #include <linux/compat.h>
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 1da6029..2c1bf7d 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -260,6 +260,13 @@ static inline void ftrace_kill(void) { }
>   #endif /* CONFIG_FUNCTION_TRACER */
> 
>   #ifdef CONFIG_STACK_TRACER
> +/*
> + * the offset value to add to return address from save_stack_trace()
> + */
> +#ifndef FTRACE_STACK_FRAME_OFFSET
> +#define FTRACE_STACK_FRAME_OFFSET 0
> +#endif
> +
>   extern int stack_tracer_enabled;
>   int
>   stack_trace_sysctl(struct ctl_table *table, int write,
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 9384647..c5b9748 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -105,7 +105,7 @@ check_stack(unsigned long ip, unsigned long *stack)
> 
>   	/* Skip over the overhead of the stack tracer itself */
>   	for (i = 0; i < max_stack_trace.nr_entries; i++) {
> -		if (stack_dump_trace[i] == ip)
> +		if ((stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET) == ip)
>   			break;
>   	}
> 
> @@ -131,7 +131,8 @@ check_stack(unsigned long ip, unsigned long *stack)
>   		p = start;
> 
>   		for (; p < top && i < max_stack_trace.nr_entries; p++) {
> -			if (*p == stack_dump_trace[i]) {
> +			if (*p == (stack_dump_trace[i]
> +					+ FTRACE_STACK_FRAME_OFFSET)) {
>   				stack_dump_trace[x] = stack_dump_trace[i++];
>   				this_size = stack_dump_index[x++] =
>   					(top - p) * sizeof(unsigned long);
> -- 
> 1.7.9.5
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 14:28                     ` Mark Rutland
@ 2015-07-16 14:34                       ` Steven Rostedt
  2015-07-17  2:09                         ` AKASHI Takahiro
  0 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-16 14:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 16 Jul 2015 15:28:34 +0100
Mark Rutland <mark.rutland@arm.com> wrote:


> > diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> > index c5534fa..868d6f1 100644
> > --- a/arch/arm64/include/asm/ftrace.h
> > +++ b/arch/arm64/include/asm/ftrace.h
> > @@ -15,6 +15,7 @@
> > 
> >   #define MCOUNT_ADDR		((unsigned long)_mcount)
> >   #define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
> > +#define FTRACE_STACK_FRAME_OFFSET 4 /* sync it up with stacktrace.c */
> 
> Is there any reason we couldn't have the arch code dump the stack depth
> for each function when it walks the stack to generate the stack trace?
> 
> That means we can provide a more precise result (because we know the
> layour of our own stackframes), and we only need walk the stack once to
> do so.
> 
> The downside is that we need a new function per-arch to do so.

Or we make check_stack() a weak function and let archs override it. I
can possibly break up the code a bit to have helper functions where
things are the same.

I want x86 to be able to track irq stacks as well, but there's no way
to do that generically yet, so having arch-specific functions has been
on my todo list.
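
For illustration, a sketch of that shape; check_stack() is not weak today and
the bookkeeping arrays are static to trace_stack.c, so treat this as the
pattern only, not a working patch:

/* kernel/trace/trace_stack.c -- generic fallback */
void __weak check_stack(unsigned long ip, unsigned long *stack)
{
	/* today's behaviour: save_stack_trace() plus a scan of raw stack
	 * words to estimate each function's depth */
}

/* arch/arm64/kernel/stacktrace.c -- hypothetical override */
void check_stack(unsigned long ip, unsigned long *stack)
{
	/* walk the frame-pointer chain directly and record exact per-frame
	 * sizes, along the lines suggested above */
}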

-- Steve

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 14:24                       ` Steven Rostedt
@ 2015-07-16 15:01                         ` Jungseok Lee
  2015-07-16 15:31                           ` Steven Rostedt
  0 siblings, 1 reply; 53+ messages in thread
From: Jungseok Lee @ 2015-07-16 15:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Jul 16, 2015, at 11:24 PM, Steven Rostedt wrote:

Hi, Steve

> On Thu, 16 Jul 2015 22:29:05 +0900
> Jungseok Lee <jungseoklee85@gmail.com> wrote:

[ snip ]

>> The data looks odd in two points.
>> 1) the number of entry
>> There is a mismatch between start token and real data
> 
> Yep, good catch. As soon as I read that, I realized exactly what the
> issue was ;-)
> 
>> 
>> 2) 80-byte gap
>> stack_max_size is not aligned with "Depth" field of the first entry of stack_trace.
>> 
>> IMHO, two items are not considered in this series as digging them out.
>> 
>> 1) skipped entries
>> As x variable is introduced in Steve's patch, it is needed to track down
>> how many entries are recorded in both stack_dump_trace and stack_dump_index.
> 
> Yep.
> 
>> 
>> 2) max_stack_trace.skip value
>> max_stack_trace.skip is 0 as applying Steve's patch. The above gap could be
>> observed unless the value is not considered in arch code. In the above example,
>> 80-byte gap is save_stack_trace function in arch/arm64/kernel/stacktrace.c.
>> 
>> As applying the following fix, stack_trace and stack_max_size are okay.
>> However, I'm not sure which code, arch or generic ftrace, should handle trace->skip.
>> The latter one is responsible for it under current implementation, not Steve's change.
>> 
>> Please correct me if I am wrong.
> 
> No, it's a bug in my patch. I'll make an update.
> 
> Does this new patch fix it for you?

I've gathered stack tracer data with your update.

1) stack_trace
        Depth    Size   Location    (35 entries)
        -----    ----   --------
  0)     4424      16   put_cpu_partial+0x28/0x1d0
  1)     4408      80   get_partial_node.isra.64+0x13c/0x344
  2)     4328     256   __slab_alloc.isra.65.constprop.67+0xd8/0x37c
  3)     4072      32   kmem_cache_alloc+0x258/0x294
  4)     4040     304   __alloc_skb+0x48/0x180
  5)     3736      96   alloc_skb_with_frags+0x74/0x234
  6)     3640     112   sock_alloc_send_pskb+0x1d0/0x294
  7)     3528     160   sock_alloc_send_skb+0x44/0x54
  8)     3368      64   __ip_append_data.isra.40+0x78c/0xb48
  9)     3304     224   ip_append_data.part.42+0x98/0xe8
 10)     3080     112   ip_append_data+0x68/0x7c
 11)     2968      96   icmp_push_reply+0x7c/0x144
 12)     2872      96   icmp_send+0x3c0/0x3c8
 13)     2776     192   __udp4_lib_rcv+0x5b8/0x684
 14)     2584      96   udp_rcv+0x2c/0x3c
 15)     2488      32   ip_local_deliver+0xa0/0x224
 16)     2456      48   ip_rcv+0x360/0x57c
 17)     2408      64   __netif_receive_skb_core+0x4d0/0x80c
 18)     2344     128   __netif_receive_skb+0x24/0x84
 19)     2216      32   process_backlog+0x9c/0x15c
 20)     2184      80   net_rx_action+0x1ec/0x32c
 21)     2104     160   __do_softirq+0x114/0x2f0
 22)     1944     128   do_softirq+0x60/0x68
 23)     1816      32   __local_bh_enable_ip+0xb0/0xd4
 24)     1784      32   ip_finish_output+0x1f4/0xabc
 25)     1752      96   ip_output+0xf0/0x120
 26)     1656      64   ip_local_out_sk+0x44/0x54
 27)     1592      32   ip_send_skb+0x24/0xbc
 28)     1560      48   udp_send_skb+0x1b4/0x2f4
 29)     1512      80   udp_sendmsg+0x2a8/0x7a0
 30)     1432     272   inet_sendmsg+0xa0/0xd0
 31)     1160      48   sock_sendmsg+0x30/0x78
 32)     1112      32   ___sys_sendmsg+0x15c/0x26c
 33)     1080     400   __sys_sendmmsg+0x94/0x180
 34)      680     320   SyS_sendmmsg+0x38/0x54
 35)      360     360   el0_svc_naked+0x20/0x28

2) stack_max_size
4504

As for the number of entries, the following diff might be needed,
as I suggested in the previous reply. ;)

----8<----

@@ -330,7 +333,7 @@ static int t_show(struct seq_file *m, void *v)
 		seq_printf(m, "        Depth    Size   Location"
 			   "    (%d entries)\n"
 			   "        -----    ----   --------\n",
-			   max_stack_trace.nr_entries - 1);
+			   max_stack_trace.nr_entries);
 
 		if (!stack_tracer_enabled && !max_stack_size)
 			print_disabled(m);

----8<----

However, the 80-byte gap still appears.

Since max_stack_trace.skip is 3 in your update, save_stack_trace on arm64
should be refactored to align with that value.

max_stack_trace.skip would need to be set to 4 if AKASHI's [RFC 2/3] patch is
merged. However, the arch code is supposed to follow the generic framework's
rule in this case, isn't it?

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 15:01                         ` Jungseok Lee
@ 2015-07-16 15:31                           ` Steven Rostedt
  2015-07-16 15:52                             ` Jungseok Lee
  2015-07-16 16:16                             ` Steven Rostedt
  0 siblings, 2 replies; 53+ messages in thread
From: Steven Rostedt @ 2015-07-16 15:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 17 Jul 2015 00:01:25 +0900
Jungseok Lee <jungseoklee85@gmail.com> wrote:

> I've gathered stack tracer data with your update.
> 
> 1) stack_trace
>         Depth    Size   Location    (35 entries)
>         -----    ----   --------
>   0)     4424      16   put_cpu_partial+0x28/0x1d0
>   1)     4408      80   get_partial_node.isra.64+0x13c/0x344
>   2)     4328     256   __slab_alloc.isra.65.constprop.67+0xd8/0x37c
>   3)     4072      32   kmem_cache_alloc+0x258/0x294
>   4)     4040     304   __alloc_skb+0x48/0x180
>   5)     3736      96   alloc_skb_with_frags+0x74/0x234
>   6)     3640     112   sock_alloc_send_pskb+0x1d0/0x294
>   7)     3528     160   sock_alloc_send_skb+0x44/0x54
>   8)     3368      64   __ip_append_data.isra.40+0x78c/0xb48
>   9)     3304     224   ip_append_data.part.42+0x98/0xe8
>  10)     3080     112   ip_append_data+0x68/0x7c
>  11)     2968      96   icmp_push_reply+0x7c/0x144
>  12)     2872      96   icmp_send+0x3c0/0x3c8
>  13)     2776     192   __udp4_lib_rcv+0x5b8/0x684
>  14)     2584      96   udp_rcv+0x2c/0x3c
>  15)     2488      32   ip_local_deliver+0xa0/0x224
>  16)     2456      48   ip_rcv+0x360/0x57c
>  17)     2408      64   __netif_receive_skb_core+0x4d0/0x80c
>  18)     2344     128   __netif_receive_skb+0x24/0x84
>  19)     2216      32   process_backlog+0x9c/0x15c
>  20)     2184      80   net_rx_action+0x1ec/0x32c
>  21)     2104     160   __do_softirq+0x114/0x2f0
>  22)     1944     128   do_softirq+0x60/0x68
>  23)     1816      32   __local_bh_enable_ip+0xb0/0xd4
>  24)     1784      32   ip_finish_output+0x1f4/0xabc
>  25)     1752      96   ip_output+0xf0/0x120
>  26)     1656      64   ip_local_out_sk+0x44/0x54
>  27)     1592      32   ip_send_skb+0x24/0xbc
>  28)     1560      48   udp_send_skb+0x1b4/0x2f4
>  29)     1512      80   udp_sendmsg+0x2a8/0x7a0
>  30)     1432     272   inet_sendmsg+0xa0/0xd0
>  31)     1160      48   sock_sendmsg+0x30/0x78
>  32)     1112      32   ___sys_sendmsg+0x15c/0x26c
>  33)     1080     400   __sys_sendmmsg+0x94/0x180
>  34)      680     320   SyS_sendmmsg+0x38/0x54
>  35)      360     360   el0_svc_naked+0x20/0x28
> 
> 2) stack_max_size
> 4504

Strange, on x86 I have this (with my patch applied):

        Depth    Size   Location    (39 entries)
        -----    ----   --------
  0)     3704      64   _raw_spin_lock+0x5/0x30
  1)     3640     200   get_partial_node.isra.80+0x54/0x1da
  2)     3440     208   __slab_alloc.isra.82+0x199/0x3f7
  3)     3232      80   kmem_cache_alloc+0x151/0x160
  4)     3152      16   mempool_alloc_slab+0x15/0x20
  5)     3136     128   mempool_alloc+0x58/0x150
  6)     3008      16   scsi_sg_alloc+0x42/0x50
  7)     2992     112   __sg_alloc_table+0x10b/0x150
  8)     2880      48   scsi_alloc_sgtable+0x43/0x80
  9)     2832      32   scsi_init_sgtable+0x2b/0x70
 10)     2800      80   scsi_init_io+0x59/0x1e0
 11)     2720     128   sd_init_command+0x66/0xd80
 12)     2592      24   scsi_setup_cmnd+0xa9/0x160
 13)     2568      88   scsi_prep_fn+0x7d/0x160
 14)     2480      48   blk_peek_request+0x168/0x2a0
 15)     2432     112   scsi_request_fn+0x3f/0x610
 16)     2320       8   __blk_run_queue+0x37/0x50
 17)     2312     104   queue_unplugged+0x41/0xe0
 18)     2208     112   blk_flush_plug_list+0x1b7/0x1e0
 19)     2096      80   blk_queue_bio+0x257/0x340
 20)     2016      48   generic_make_request+0xb1/0xf0
 21)     1968      96   submit_bio+0x76/0x130
 22)     1872      48   submit_bh_wbc.isra.35+0x10b/0x140
 23)     1824     112   __block_write_full_page.constprop.40+0x188/0x310
 24)     1712      64   block_write_full_page+0xdd/0x130
 25)     1648      16   blkdev_writepage+0x18/0x20
 26)     1632       8   __writepage+0x17/0x40
 27)     1624     312   write_cache_pages+0x21e/0x480
 28)     1312      96   generic_writepages+0x4a/0x70
 29)     1216      16   do_writepages+0x20/0x30
 30)     1200      96   __writeback_single_inode+0x45/0x350
 31)     1104     176   writeback_sb_inodes+0x218/0x3d0
 32)      928      80   __writeback_inodes_wb+0x8c/0xc0
 33)      848     128   wb_writeback+0x239/0x2c0
 34)      720     192   wb_workfn+0x24b/0x460
 35)      528      80   process_one_work+0x14b/0x430
 36)      448     128   worker_thread+0x117/0x460
 37)      320     144   kthread+0xc9/0xe0
 38)      176     176   ret_from_fork+0x3f/0x70

 # cat /debug/tracing/stack_max_size 
3704


> 
> In case of the number of entries, the following diff might be needed
> as I suggested in the previous reply. ;)
> 
> ----8<----
> 
> @@ -330,7 +333,7 @@ static int t_show(struct seq_file *m, void *v)
>  		seq_printf(m, "        Depth    Size   Location"
>  			   "    (%d entries)\n"
>  			   "        -----    ----   --------\n",
> -			   max_stack_trace.nr_entries - 1);
> +			   max_stack_trace.nr_entries);

This would break x86.

>  
>  		if (!stack_tracer_enabled && !max_stack_size)
>  			print_disabled(m);
> 
> ----8<----
> 
> However, 80-byte gap still appears.

This seems to be specific to your arch.

> 
> Since max_stack_trace.skip is 3 in your update, save_stack_trace in arm64
> should be refactored to align with this value. 
> 
> max_stack_trace.skip should be set to 4 if AKASHI's [RFC 2/3] patch is merged.
> However, arch code is supposed to follow generic framework's rule in this case.
> Isn't it?
>

Yeah, you don't want to update the skip level. It should just work.

I'll run this on my powerpc box and see if it shows something
different. If I have to, I'll even boot up my arm (not arm64) board and
try it there.


-- Steve

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 15:31                           ` Steven Rostedt
@ 2015-07-16 15:52                             ` Jungseok Lee
  2015-07-16 20:22                               ` Steven Rostedt
  2015-07-16 16:16                             ` Steven Rostedt
  1 sibling, 1 reply; 53+ messages in thread
From: Jungseok Lee @ 2015-07-16 15:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Jul 17, 2015, at 12:31 AM, Steven Rostedt wrote:
> On Fri, 17 Jul 2015 00:01:25 +0900
> Jungseok Lee <jungseoklee85@gmail.com> wrote:
> 
>> I've gathered stack tracer data with your update.
>> 
>> 1) stack_trace
>>        Depth    Size   Location    (35 entries)
>>        -----    ----   --------
>>  0)     4424      16   put_cpu_partial+0x28/0x1d0
>>  1)     4408      80   get_partial_node.isra.64+0x13c/0x344
>>  2)     4328     256   __slab_alloc.isra.65.constprop.67+0xd8/0x37c
>>  3)     4072      32   kmem_cache_alloc+0x258/0x294
>>  4)     4040     304   __alloc_skb+0x48/0x180
>>  5)     3736      96   alloc_skb_with_frags+0x74/0x234
>>  6)     3640     112   sock_alloc_send_pskb+0x1d0/0x294
>>  7)     3528     160   sock_alloc_send_skb+0x44/0x54
>>  8)     3368      64   __ip_append_data.isra.40+0x78c/0xb48
>>  9)     3304     224   ip_append_data.part.42+0x98/0xe8
>> 10)     3080     112   ip_append_data+0x68/0x7c
>> 11)     2968      96   icmp_push_reply+0x7c/0x144
>> 12)     2872      96   icmp_send+0x3c0/0x3c8
>> 13)     2776     192   __udp4_lib_rcv+0x5b8/0x684
>> 14)     2584      96   udp_rcv+0x2c/0x3c
>> 15)     2488      32   ip_local_deliver+0xa0/0x224
>> 16)     2456      48   ip_rcv+0x360/0x57c
>> 17)     2408      64   __netif_receive_skb_core+0x4d0/0x80c
>> 18)     2344     128   __netif_receive_skb+0x24/0x84
>> 19)     2216      32   process_backlog+0x9c/0x15c
>> 20)     2184      80   net_rx_action+0x1ec/0x32c
>> 21)     2104     160   __do_softirq+0x114/0x2f0
>> 22)     1944     128   do_softirq+0x60/0x68
>> 23)     1816      32   __local_bh_enable_ip+0xb0/0xd4
>> 24)     1784      32   ip_finish_output+0x1f4/0xabc
>> 25)     1752      96   ip_output+0xf0/0x120
>> 26)     1656      64   ip_local_out_sk+0x44/0x54
>> 27)     1592      32   ip_send_skb+0x24/0xbc
>> 28)     1560      48   udp_send_skb+0x1b4/0x2f4
>> 29)     1512      80   udp_sendmsg+0x2a8/0x7a0
>> 30)     1432     272   inet_sendmsg+0xa0/0xd0
>> 31)     1160      48   sock_sendmsg+0x30/0x78
>> 32)     1112      32   ___sys_sendmsg+0x15c/0x26c
>> 33)     1080     400   __sys_sendmmsg+0x94/0x180
>> 34)      680     320   SyS_sendmmsg+0x38/0x54
>> 35)      360     360   el0_svc_naked+0x20/0x28
>> 
>> 2) stack_max_size
>> 4504
> 
> Strange, on x86 I have this (with my patch applied):
> 
>        Depth    Size   Location    (39 entries)
>        -----    ----   --------
>  0)     3704      64   _raw_spin_lock+0x5/0x30
>  1)     3640     200   get_partial_node.isra.80+0x54/0x1da
>  2)     3440     208   __slab_alloc.isra.82+0x199/0x3f7
>  3)     3232      80   kmem_cache_alloc+0x151/0x160
>  4)     3152      16   mempool_alloc_slab+0x15/0x20
>  5)     3136     128   mempool_alloc+0x58/0x150
>  6)     3008      16   scsi_sg_alloc+0x42/0x50
>  7)     2992     112   __sg_alloc_table+0x10b/0x150
>  8)     2880      48   scsi_alloc_sgtable+0x43/0x80
>  9)     2832      32   scsi_init_sgtable+0x2b/0x70
> 10)     2800      80   scsi_init_io+0x59/0x1e0
> 11)     2720     128   sd_init_command+0x66/0xd80
> 12)     2592      24   scsi_setup_cmnd+0xa9/0x160
> 13)     2568      88   scsi_prep_fn+0x7d/0x160
> 14)     2480      48   blk_peek_request+0x168/0x2a0
> 15)     2432     112   scsi_request_fn+0x3f/0x610
> 16)     2320       8   __blk_run_queue+0x37/0x50
> 17)     2312     104   queue_unplugged+0x41/0xe0
> 18)     2208     112   blk_flush_plug_list+0x1b7/0x1e0
> 19)     2096      80   blk_queue_bio+0x257/0x340
> 20)     2016      48   generic_make_request+0xb1/0xf0
> 21)     1968      96   submit_bio+0x76/0x130
> 22)     1872      48   submit_bh_wbc.isra.35+0x10b/0x140
> 23)     1824     112   __block_write_full_page.constprop.40+0x188/0x310
> 24)     1712      64   block_write_full_page+0xdd/0x130
> 25)     1648      16   blkdev_writepage+0x18/0x20
> 26)     1632       8   __writepage+0x17/0x40
> 27)     1624     312   write_cache_pages+0x21e/0x480
> 28)     1312      96   generic_writepages+0x4a/0x70
> 29)     1216      16   do_writepages+0x20/0x30
> 30)     1200      96   __writeback_single_inode+0x45/0x350
> 31)     1104     176   writeback_sb_inodes+0x218/0x3d0
> 32)      928      80   __writeback_inodes_wb+0x8c/0xc0
> 33)      848     128   wb_writeback+0x239/0x2c0
> 34)      720     192   wb_workfn+0x24b/0x460
> 35)      528      80   process_one_work+0x14b/0x430
> 36)      448     128   worker_thread+0x117/0x460
> 37)      320     144   kthread+0xc9/0xe0
> 38)      176     176   ret_from_fork+0x3f/0x70
> 
> # cat /debug/tracing/stack_max_size 
> 3704
> 
> 
>> 
>> In case of the number of entries, the following diff might be needed
>> as I suggested in the previous reply. ;)
>> 
>> ----8<----
>> 
>> @@ -330,7 +333,7 @@ static int t_show(struct seq_file *m, void *v)
>> 		seq_printf(m, "        Depth    Size   Location"
>> 			   "    (%d entries)\n"
>> 			   "        -----    ----   --------\n",
>> -			   max_stack_trace.nr_entries - 1);
>> +			   max_stack_trace.nr_entries);
> 
> This would break x86.

Thanks for the x86 data. It's really helpful!

> 
>> 
>> 		if (!stack_tracer_enabled && !max_stack_size)
>> 			print_disabled(m);
>> 
>> ----8<----
>> 
>> However, 80-byte gap still appears.
> 
> This seems to be specific to your arch.

Totally agree.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 15:31                           ` Steven Rostedt
  2015-07-16 15:52                             ` Jungseok Lee
@ 2015-07-16 16:16                             ` Steven Rostedt
  2015-07-17 12:40                               ` Mark Rutland
  1 sibling, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-16 16:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 16 Jul 2015 11:31:15 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> > 
> > In case of the number of entries, the following diff might be needed
> > as I suggested in the previous reply. ;)
> > 
> > ----8<----
> > 
> > @@ -330,7 +333,7 @@ static int t_show(struct seq_file *m, void *v)
> >  		seq_printf(m, "        Depth    Size   Location"
> >  			   "    (%d entries)\n"
> >  			   "        -----    ----   --------\n",
> > -			   max_stack_trace.nr_entries - 1);
> > +			   max_stack_trace.nr_entries);
> 
> This would break x86.

Nope, you were correct. But it was also missing another change:

-- Steve

Index: linux-trace.git/kernel/trace/trace_stack.c
===================================================================
--- linux-trace.git.orig/kernel/trace/trace_stack.c	2015-07-16 11:45:17.816249113 -0400
+++ linux-trace.git/kernel/trace/trace_stack.c	2015-07-16 12:01:51.051403333 -0400
@@ -284,7 +260,7 @@ __next(struct seq_file *m, loff_t *pos)
 {
 	long n = *pos - 1;
 
-	if (n >= max_stack_trace.nr_entries || stack_dump_trace[n] == ULONG_MAX)
+	if (n > max_stack_trace.nr_entries || stack_dump_trace[n] == ULONG_MAX)
 		return NULL;
 
 	m->private = (void *)n;
@@ -354,7 +330,7 @@ static int t_show(struct seq_file *m, vo
 		seq_printf(m, "        Depth    Size   Location"
 			   "    (%d entries)\n"
 			   "        -----    ----   --------\n",
-			   max_stack_trace.nr_entries - 1);
+			   max_stack_trace.nr_entries);
 
 		if (!stack_tracer_enabled && !max_stack_size)
 			print_disabled(m);

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 15:52                             ` Jungseok Lee
@ 2015-07-16 20:22                               ` Steven Rostedt
  2015-07-17  2:49                                 ` AKASHI Takahiro
  0 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-16 20:22 UTC (permalink / raw)
  To: linux-arm-kernel


Here's the patch I now have in my local repo, and plan on pushing to my
repo on korg.

-- Steve

From d21f02a45fa367beaf97b153aa29849c06ac5609 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
Date: Thu, 16 Jul 2015 13:24:54 -0400
Subject: [PATCH] tracing: Clean up stack tracing and fix fentry updates

Akashi Takahiro was porting the stack tracer to arm64 and found some
issues with it. One was that it repeats the top function, due to the
stack frame added by the mcount caller and added by itself. This
was added when fentry came in, and before fentry created its own stack
frame. But x86's fentry now creates its own stack frame, and there's
no need to insert the function again.

This also cleans up the code a bit, where it doesn't need to do something
special for fentry, and doesn't include insertion of a duplicate
entry for the called function being traced.

Link: http://lkml.kernel.org/r/55A646EE.6030402 at linaro.org

Some-Suggestions-by: Jungseok Lee <jungseoklee85@gmail.com>
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_stack.c | 62 ++++++++++++++--------------------------------
 1 file changed, 19 insertions(+), 43 deletions(-)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 3f34496244e9..e615cdc9e38a 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -18,12 +18,6 @@
 
 #define STACK_TRACE_ENTRIES 500
 
-#ifdef CC_USING_FENTRY
-# define fentry		1
-#else
-# define fentry		0
-#endif
-
 static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES+1] =
 	 { [0 ... (STACK_TRACE_ENTRIES)] = ULONG_MAX };
 static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
@@ -35,7 +29,7 @@ static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
  */
 static struct stack_trace max_stack_trace = {
 	.max_entries		= STACK_TRACE_ENTRIES - 1,
-	.entries		= &stack_dump_trace[1],
+	.entries		= &stack_dump_trace[0],
 };
 
 static unsigned long max_stack_size;
@@ -77,7 +71,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 	unsigned long this_size, flags; unsigned long *p, *top, *start;
 	static int tracer_frame;
 	int frame_size = ACCESS_ONCE(tracer_frame);
-	int i;
+	int i, x;
 
 	this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
 	this_size = THREAD_SIZE - this_size;
@@ -105,26 +99,20 @@ check_stack(unsigned long ip, unsigned long *stack)
 	max_stack_size = this_size;
 
 	max_stack_trace.nr_entries = 0;
-
-	if (using_ftrace_ops_list_func())
-		max_stack_trace.skip = 4;
-	else
-		max_stack_trace.skip = 3;
+	max_stack_trace.skip = 3;
 
 	save_stack_trace(&max_stack_trace);
 
-	/*
-	 * Add the passed in ip from the function tracer.
-	 * Searching for this on the stack will skip over
-	 * most of the overhead from the stack tracer itself.
-	 */
-	stack_dump_trace[0] = ip;
-	max_stack_trace.nr_entries++;
+	/* Skip over the overhead of the stack tracer itself */
+	for (i = 0; i < max_stack_trace.nr_entries; i++) {
+		if (stack_dump_trace[i] == ip)
+			break;
+	}
 
 	/*
 	 * Now find where in the stack these are.
 	 */
-	i = 0;
+	x = 0;
 	start = stack;
 	top = (unsigned long *)
 		(((unsigned long)start & ~(THREAD_SIZE-1)) + THREAD_SIZE);
@@ -139,12 +127,13 @@ check_stack(unsigned long ip, unsigned long *stack)
 	while (i < max_stack_trace.nr_entries) {
 		int found = 0;
 
-		stack_dump_index[i] = this_size;
+		stack_dump_index[x] = this_size;
 		p = start;
 
 		for (; p < top && i < max_stack_trace.nr_entries; p++) {
 			if (*p == stack_dump_trace[i]) {
-				this_size = stack_dump_index[i++] =
+				stack_dump_trace[x] = stack_dump_trace[i++];
+				this_size = stack_dump_index[x++] =
 					(top - p) * sizeof(unsigned long);
 				found = 1;
 				/* Start the search from here */
@@ -168,6 +157,10 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
+	max_stack_trace.nr_entries = x;
+	for (; x < i; x++)
+		stack_dump_trace[x] = ULONG_MAX;
+
 	if (task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
@@ -192,24 +185,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
 	if (per_cpu(trace_active, cpu)++ != 0)
 		goto out;
 
-	/*
-	 * When fentry is used, the traced function does not get
-	 * its stack frame set up, and we lose the parent.
-	 * The ip is pretty useless because the function tracer
-	 * was called before that function set up its stack frame.
-	 * In this case, we use the parent ip.
-	 *
-	 * By adding the return address of either the parent ip
-	 * or the current ip we can disregard most of the stack usage
-	 * caused by the stack tracer itself.
-	 *
-	 * The function tracer always reports the address of where the
-	 * mcount call was, but the stack will hold the return address.
-	 */
-	if (fentry)
-		ip = parent_ip;
-	else
-		ip += MCOUNT_INSN_SIZE;
+	ip += MCOUNT_INSN_SIZE;
 
 	check_stack(ip, &stack);
 
@@ -284,7 +260,7 @@ __next(struct seq_file *m, loff_t *pos)
 {
 	long n = *pos - 1;
 
-	if (n >= max_stack_trace.nr_entries || stack_dump_trace[n] == ULONG_MAX)
+	if (n > max_stack_trace.nr_entries || stack_dump_trace[n] == ULONG_MAX)
 		return NULL;
 
 	m->private = (void *)n;
@@ -354,7 +330,7 @@ static int t_show(struct seq_file *m, void *v)
 		seq_printf(m, "        Depth    Size   Location"
 			   "    (%d entries)\n"
 			   "        -----    ----   --------\n",
-			   max_stack_trace.nr_entries - 1);
+			   max_stack_trace.nr_entries);
 
 		if (!stack_tracer_enabled && !max_stack_size)
 			print_disabled(m);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 13:29                     ` Jungseok Lee
  2015-07-16 13:54                       ` Jungseok Lee
  2015-07-16 14:24                       ` Steven Rostedt
@ 2015-07-17  2:04                       ` AKASHI Takahiro
  2015-07-17 14:38                         ` Jungseok Lee
  2 siblings, 1 reply; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-17  2:04 UTC (permalink / raw)
  To: linux-arm-kernel

Jungseok,

Thank you for your testing and reviews.

On 07/16/2015 10:29 PM, Jungseok Lee wrote:
> On Jul 16, 2015, at 10:08 AM, AKASHI Takahiro wrote:
>
> Hi, AKASHI
>
>> On 07/16/2015 09:27 AM, AKASHI Takahiro wrote:
>>> On 07/16/2015 01:13 AM, Steven Rostedt wrote:
>>>> On Wed, 15 Jul 2015 10:55:36 -0400
>>>> Steven Rostedt <rostedt@goodmis.org> wrote:
>>>>
>>>>
>>>>> I'll take a look at it and try to clean up the code.
>>>>
>>>> Does the  following patch make sense for you?
>>>
>>> Looks nice. The patch greatly simplifies changes on arm64 side.
>>
>> As follows:
>>
>> - Takahiro AKASHI
>>
>> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
>> index c5534fa..868d6f1 100644
>> --- a/arch/arm64/include/asm/ftrace.h
>> +++ b/arch/arm64/include/asm/ftrace.h
>> @@ -15,6 +15,7 @@
>>
>> #define MCOUNT_ADDR		((unsigned long)_mcount)
>> #define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
>> +#define FTRACE_STACK_FRAME_OFFSET 4 /* sync it up with stacktrace.c */

Well,
   #define FTRACE_STACK_FRAME_OFFSET AARCH64_INSN_SIZE
might be better.

> How about binding it to -4 in unwind_frame function?

Do you mean like this?
In unwind_frame(),
   frame->pc = *(unsigned long*)(fp + 8) - AARCH64_INSN_SIZE;
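
Roughly, the whole unwinder would then look like this (just a sketch of
unwind_frame() in arch/arm64/kernel/stacktrace.c with the adjustment folded
in; the AArch64 frame record layout, fp at offset 0 and saved lr at offset 8,
is assumed, and the usual stack-bounds checks are elided):

    #include <asm/insn.h>
    #include <asm/stacktrace.h>

    int notrace unwind_frame(struct stackframe *frame)
    {
            unsigned long fp = frame->fp;

            /* ... validate fp against the current stack bounds here ... */

            frame->sp = fp + 0x10;
            frame->fp = *(unsigned long *)(fp);
            /*
             * The saved lr holds the return address, i.e. the instruction
             * after the bl; subtract one instruction so the recorded pc is
             * the bl itself, which is what the stack tracer matches against.
             */
            frame->pc = *(unsigned long *)(fp + 8) - AARCH64_INSN_SIZE;

            return 0;
    }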

> IMHO, it would help other developers trying to change stack trace code
> be aware of this stack tracer feature.
>
>> #ifndef __ASSEMBLY__
>> #include <linux/compat.h>
>> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
>> index 1da6029..2c1bf7d 100644
>> --- a/include/linux/ftrace.h
>> +++ b/include/linux/ftrace.h
>> @@ -260,6 +260,13 @@ static inline void ftrace_kill(void) { }
>> #endif /* CONFIG_FUNCTION_TRACER */
>>
>> #ifdef CONFIG_STACK_TRACER
>> +/*
>> + * the offset value to add to return address from save_stack_trace()
>> + */
>> +#ifndef FTRACE_STACK_FRAME_OFFSET
>> +#define FTRACE_STACK_FRAME_OFFSET 0
>> +#endif
>> +
>> extern int stack_tracer_enabled;
>> int
>> stack_trace_sysctl(struct ctl_table *table, int write,
>> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
>> index 9384647..c5b9748 100644
>> --- a/kernel/trace/trace_stack.c
>> +++ b/kernel/trace/trace_stack.c
>> @@ -105,7 +105,7 @@ check_stack(unsigned long ip, unsigned long *stack)
>>
>> 	/* Skip over the overhead of the stack tracer itself */
>> 	for (i = 0; i < max_stack_trace.nr_entries; i++) {
>> -		if (stack_dump_trace[i] == ip)
>> +		if ((stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET) == ip)
>> 			break;
>> 	}
>>
>> @@ -131,7 +131,8 @@ check_stack(unsigned long ip, unsigned long *stack)
>> 		p = start;
>>
>> 		for (; p < top && i < max_stack_trace.nr_entries; p++) {
>> -			if (*p == stack_dump_trace[i]) {
>> +			if (*p == (stack_dump_trace[i]
>> +					+ FTRACE_STACK_FRAME_OFFSET)) {
>> 				stack_dump_trace[x] = stack_dump_trace[i++];
>> 				this_size = stack_dump_index[x++] =
>> 					(top - p) * sizeof(unsigned long);
>> --
>
> I've prepared a kernel with the following patches and reviewed them.
>
> 1) Steve's clean up patch
> 2) This patch
> 3) [RFC 2/3]
>
> AFAIU, [RFC 3/3] is not needed any more thanks to Steve's patch.

We don't need [2/3].

> First of all, let's look at the following data.
>
> 1) stack_trace data
>          Depth    Size   Location    (55 entries)
>          -----    ----   --------
>    0)     4808      16   notifier_call_chain+0x2c/0x94
>    1)     4792      64   raw_notifier_call_chain+0x34/0x44
>    2)     4728      48   timekeeping_update.constprop.9+0xb8/0x114
>    3)     4680      48   update_wall_time+0x408/0x6dc
>    4)     4632     128   tick_do_update_jiffies64+0xd8/0x154
>    5)     4504      80   tick_sched_do_timer+0x50/0x60
>    6)     4424      32   tick_sched_timer+0x34/0x90
>    7)     4392      48   __run_hrtimer+0x60/0x258
>    8)     4344      64   hrtimer_interrupt+0xe8/0x260
>    9)     4280     128   arch_timer_handler_virt+0x38/0x48
>   10)     4152      32   handle_percpu_devid_irq+0x84/0x188
>   11)     4120      64   generic_handle_irq+0x38/0x54
>   12)     4056      32   __handle_domain_irq+0x68/0xbc
>   13)     4024      64   gic_handle_irq+0x38/0x88
>   14)     3960     336   el1_irq+0x64/0xd8
>   15)     3624      16   netif_rx_internal+0x14/0x198
>   16)     3608      64   netif_rx+0x20/0xa4
>   17)     3544      32   loopback_xmit+0x64/0xf4
>   18)     3512      48   dev_hard_start_xmit+0x25c/0x3f8
>   19)     3464     160   __dev_queue_xmit+0x440/0x4dc
>   20)     3304      96   dev_queue_xmit_sk+0x20/0x30
>   21)     3208      32   ip_finish_output+0x1e0/0xabc
>   22)     3176      96   ip_output+0xf0/0x120
>   23)     3080      64   ip_local_out_sk+0x44/0x54
>   24)     3016      32   ip_send_skb+0x24/0xbc
>   25)     2984      48   ip_push_pending_frames+0x40/0x60
>   26)     2936      64   icmp_push_reply+0x104/0x144
>   27)     2872      96   icmp_send+0x3c0/0x3c8
>   28)     2776     192   __udp4_lib_rcv+0x5b8/0x684
>   29)     2584      96   udp_rcv+0x2c/0x3c
>   30)     2488      32   ip_local_deliver+0xa0/0x224
>   31)     2456      48   ip_rcv+0x360/0x57c
>   32)     2408      64   __netif_receive_skb_core+0x4d0/0x80c
>   33)     2344     128   __netif_receive_skb+0x24/0x84
>   34)     2216      32   process_backlog+0x9c/0x15c
>   35)     2184      80   net_rx_action+0x1ec/0x32c
>   36)     2104     160   __do_softirq+0x114/0x2f0
>   37)     1944     128   do_softirq+0x60/0x68
>   38)     1816      32   __local_bh_enable_ip+0xb0/0xd4
>   39)     1784      32   ip_finish_output+0x1f4/0xabc
>   40)     1752      96   ip_output+0xf0/0x120
>   41)     1656      64   ip_local_out_sk+0x44/0x54
>   42)     1592      32   ip_send_skb+0x24/0xbc
>   43)     1560      48   udp_send_skb+0x1b4/0x2f4
>   44)     1512      80   udp_sendmsg+0x2a8/0x7a0
>   45)     1432     272   inet_sendmsg+0xa0/0xd0
>   46)     1160      48   sock_sendmsg+0x30/0x78
>   47)     1112      32   ___sys_sendmsg+0x15c/0x26c
>   48)     1080     400   __sys_sendmmsg+0x94/0x180
>   49)      680     320   SyS_sendmmsg+0x38/0x54
>   50)      360     360   el0_svc_naked+0x20/0x28
>
> 2) stack_max_size data
> 4888
>
> The data looks odd in two points.
> 1) the number of entry
> There is a mismatch between start token and real data
>
> 2) 80-byte gap
> stack_max_size is not aligned with "Depth" field of the first entry of stack_trace.
>
> IMHO, two items are not considered in this series as digging them out.
>
> 1) skipped entries
> As x variable is introduced in Steve's patch, it is needed to track down
> how many entries are recorded in both stack_dump_trace and stack_dump_index.
>
> 2) max_stack_trace.skip value
> max_stack_trace.skip is 0 after applying Steve's patch. The above gap can be
> observed unless the value is considered in arch code. In the above example,
> the 80-byte gap is the save_stack_trace function in arch/arm64/kernel/stacktrace.c.
>
> After applying the following fix, stack_trace and stack_max_size are okay.
> However, I'm not sure which code, arch or generic ftrace, should handle trace->skip.
> The latter is responsible for it under the current implementation, not Steve's change.
>
> Please correct me if I am wrong.

I will address these issues in my later replies.

Thanks,
-Takahiro AKASHI

> Best Regards
> Jungseok Lee
>
> ----8<----
>
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index 978c923..5f6f4df 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -137,7 +137,9 @@ void save_stack_trace(struct stack_trace *trace)
>
>   	frame.fp = (unsigned long)__builtin_frame_address(0);
>   	frame.sp = current_stack_pointer;
> -	frame.pc = (unsigned long)save_stack_trace_tsk;
> +	frame.pc = (unsigned long)save_stack_trace;
> +
> +	trace->skip += 4;
>
>   	__save_stack_trace(&frame, trace, 0);
>   }
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index c5b9748..5635683 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -158,6 +158,9 @@ check_stack(unsigned long ip, unsigned long *stack)
>   			i++;
>   	}
>
> +	/* We do not want to include skipped entries */
> +	max_stack_trace.nr_entries -= (i - x);
> +
>   	for (; x < max_stack_trace.nr_entries; x++)
>   		stack_dump_trace[x] = ULONG_MAX;
>
> @@ -330,7 +333,7 @@ static int t_show(struct seq_file *m, void *v)
>   		seq_printf(m, "        Depth    Size   Location"
>   			   "    (%d entries)\n"
>   			   "        -----    ----   --------\n",
> -			   max_stack_trace.nr_entries - 1);
> +			   max_stack_trace.nr_entries);
>
>   		if (!stack_tracer_enabled && !max_stack_size)
>   			print_disabled(m);
>
> ----8<----
>
> 1) stack_trace data
>          Depth    Size   Location    (47 entries)
>          -----    ----   --------
>    0)     4648      16   notifier_call_chain+0x2c/0x94
>    1)     4632      64   raw_notifier_call_chain+0x34/0x44
>    2)     4568      48   timekeeping_update.constprop.9+0xb8/0x114
>    3)     4520      48   update_wall_time+0x408/0x6dc
>    4)     4472     128   tick_do_update_jiffies64+0xd8/0x154
>    5)     4344      80   tick_sched_do_timer+0x50/0x60
>    6)     4264      32   tick_sched_timer+0x34/0x90
>    7)     4232      48   __run_hrtimer+0x60/0x258
>    8)     4184      64   hrtimer_interrupt+0xe8/0x260
>    9)     4120     128   arch_timer_handler_virt+0x38/0x48
>   10)     3992      32   handle_percpu_devid_irq+0x84/0x188
>   11)     3960      64   generic_handle_irq+0x38/0x54
>   12)     3896      32   __handle_domain_irq+0x68/0xbc
>   13)     3864      64   gic_handle_irq+0x38/0x88
>   14)     3800     336   el1_irq+0x64/0xd8
>   15)     3464      48   __kmalloc_track_caller+0x40/0x2f8
>   16)     3416      16   __kmalloc_reserve.isra.46+0x44/0x9c
>   17)     3400     144   __alloc_skb+0x70/0x180
>   18)     3256      96   alloc_skb_with_frags+0x74/0x234
>   19)     3160     112   sock_alloc_send_pskb+0x1d0/0x294
>   20)     3048     160   sock_alloc_send_skb+0x44/0x54
>   21)     2888      64   __ip_append_data.isra.40+0x78c/0xb48
>   22)     2824     224   ip_append_data.part.42+0x98/0xe8
>   23)     2600     112   ip_append_data+0x68/0x7c
>   24)     2488      96   icmp_push_reply+0x7c/0x144
>   25)     2392      96   icmp_send+0x3c0/0x3c8
>   26)     2296     192   __udp4_lib_rcv+0x5b8/0x684
>   27)     2104      96   udp_rcv+0x2c/0x3c
>   28)     2008      32   ip_local_deliver+0xa0/0x224
>   29)     1976      48   ip_rcv+0x360/0x57c
>   30)     1928      64   __netif_receive_skb_core+0x4d0/0x80c
>   31)     1864     128   __netif_receive_skb+0x24/0x84
>   32)     1736      32   process_backlog+0x9c/0x15c
>   33)     1704      80   net_rx_action+0x1ec/0x32c
>   34)     1624     160   __do_softirq+0x114/0x2f0
>   35)     1464     128   do_softirq+0x60/0x68
>   36)     1336      32   __local_bh_enable_ip+0xb0/0xd4
>   37)     1304      32   ip_finish_output+0x1f4/0xabc
>   38)     1272      96   ip_output+0xf0/0x120
>   39)     1176      64   ip_local_out_sk+0x44/0x54
>   40)     1112      32   ip_send_skb+0x24/0xbc
>   41)     1080      48   udp_send_skb+0x1b4/0x2f4
>   42)     1032      80   udp_sendmsg+0x2a8/0x7a0
>   43)      952     272   inet_sendmsg+0xa0/0xd0
>   44)      680      48   sock_sendmsg+0x30/0x78
>   45)      632      32   SyS_sendto+0xc4/0x108
>   46)      600     600   el0_svc_naked+0x20/0x28
>
> 2) stack_max_size data
> 4648
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 14:34                       ` Steven Rostedt
@ 2015-07-17  2:09                         ` AKASHI Takahiro
  0 siblings, 0 replies; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-17  2:09 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/16/2015 11:34 PM, Steven Rostedt wrote:
> On Thu, 16 Jul 2015 15:28:34 +0100
> Mark Rutland <mark.rutland@arm.com> wrote:
>
>
>>> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
>>> index c5534fa..868d6f1 100644
>>> --- a/arch/arm64/include/asm/ftrace.h
>>> +++ b/arch/arm64/include/asm/ftrace.h
>>> @@ -15,6 +15,7 @@
>>>
>>>    #define MCOUNT_ADDR		((unsigned long)_mcount)
>>>    #define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
>>> +#define FTRACE_STACK_FRAME_OFFSET 4 /* sync it up with stacktrace.c */
>>
>> Is there any reason we couldn't have the arch code dump the stack depth
>> for each function when it walks the stack to generate the stack trace?
>>
>> That means we can provide a more precise result (because we know the
>> layout of our own stackframes), and we only need to walk the stack once to
>> do so.
>>
>> The downside is that we need a new function per-arch to do so.
>
> Or we make check_stack() a weak function, and let archs override it. I
> can possibly break up the code a bit to have helper functions where
> things are the same.

Yeah, that is exactly what I meant in my cover letter [0/3], in case the series
of patches is not acceptable.
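
Something along these lines, I suppose (only a sketch of the weak-function
idea, not a worked-out patch; check_stack() would have to drop its static
qualifier, and the arm64 body below is only a placeholder):

    /* kernel/trace/trace_stack.c: the generic version becomes the fallback */
    void __weak check_stack(unsigned long ip, unsigned long *stack)
    {
            /* current generic search of the stack for return addresses */
    }

    /* arch/arm64/kernel/stacktrace.c: arch override */
    void check_stack(unsigned long ip, unsigned long *stack)
    {
            /*
             * Walk our own frame records directly; since the arch knows
             * its frame layout, each function's stack usage can be
             * computed precisely in a single pass.
             */
    }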

-Takahiro AKASHI

> I want x86 to be able to track irq stacks as well, but there's no way
> to do that generically yet, so having arch specific functions has been
> on my todo list.
>
> -- Steve
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 20:22                               ` Steven Rostedt
@ 2015-07-17  2:49                                 ` AKASHI Takahiro
  2015-07-17  3:21                                   ` Steven Rostedt
  0 siblings, 1 reply; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-17  2:49 UTC (permalink / raw)
  To: linux-arm-kernel

Steve,

On 07/17/2015 05:22 AM, Steven Rostedt wrote:
>
> Here's the patch I now have in my local repo, and plan on pushing to my
> repo on korg.
>
> -- Steve
>
>  From d21f02a45fa367beaf97b153aa29849c06ac5609 Mon Sep 17 00:00:00 2001
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> Date: Thu, 16 Jul 2015 13:24:54 -0400
> Subject: [PATCH] tracing: Clean up stack tracing and fix fentry updates
>
> Akashi Takahiro was porting the stack tracer to arm64 and found some
> issues with it. One was that it repeats the top function, due to the
> stack frame added by the mcount caller and added by itself. This
> was added when fentry came in, and before fentry created its own stack
> frame. But x86's fentry now creates its own stack frame, and there's
> no need to insert the function again.
>
> This also cleans up the code a bit, where it doesn't need to do something
> special for fentry, and doesn't include insertion of a duplicate
> entry for the called function being traced.
>
> Link: http://lkml.kernel.org/r/55A646EE.6030402@linaro.org
>
> Some-Suggestions-by: Jungseok Lee <jungseoklee85@gmail.com>
> Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>   kernel/trace/trace_stack.c | 62 ++++++++++++++--------------------------------
>   1 file changed, 19 insertions(+), 43 deletions(-)
>
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 3f34496244e9..e615cdc9e38a 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -18,12 +18,6 @@
>
>   #define STACK_TRACE_ENTRIES 500
>
> -#ifdef CC_USING_FENTRY
> -# define fentry		1
> -#else
> -# define fentry		0
> -#endif
> -
>   static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES+1] =
>   	 { [0 ... (STACK_TRACE_ENTRIES)] = ULONG_MAX };
>   static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
> @@ -35,7 +29,7 @@ static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
>    */
>   static struct stack_trace max_stack_trace = {
>   	.max_entries		= STACK_TRACE_ENTRIES - 1,
> -	.entries		= &stack_dump_trace[1],
> +	.entries		= &stack_dump_trace[0],
>   };
>
>   static unsigned long max_stack_size;
> @@ -77,7 +71,7 @@ check_stack(unsigned long ip, unsigned long *stack)
>   	unsigned long this_size, flags; unsigned long *p, *top, *start;
>   	static int tracer_frame;
>   	int frame_size = ACCESS_ONCE(tracer_frame);
> -	int i;
> +	int i, x;
>
>   	this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
>   	this_size = THREAD_SIZE - this_size;
> @@ -105,26 +99,20 @@ check_stack(unsigned long ip, unsigned long *stack)
>   	max_stack_size = this_size;
>
>   	max_stack_trace.nr_entries = 0;
> -
> -	if (using_ftrace_ops_list_func())
> -		max_stack_trace.skip = 4;
> -	else
> -		max_stack_trace.skip = 3;
> +	max_stack_trace.skip = 3;

I don't think this last line is necessary because we will skip all
the functions anyway below:

>   	save_stack_trace(&max_stack_trace);
>
> -	/*
> -	 * Add the passed in ip from the function tracer.
> -	 * Searching for this on the stack will skip over
> -	 * most of the overhead from the stack tracer itself.
> -	 */
> -	stack_dump_trace[0] = ip;
> -	max_stack_trace.nr_entries++;
> +	/* Skip over the overhead of the stack tracer itself */
> +	for (i = 0; i < max_stack_trace.nr_entries; i++) {
> +		if (stack_dump_trace[i] == ip)
> +			break;
> +	}

here. Now "i" indicates the start point, excepting tracer functions,
and "x" will eventually represent the exact number of functions
that we are interested in after searching the stack.
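
Putting that bookkeeping together, the intent is roughly this (a condensed
sketch of the logic inside check_stack(), where i and x are already declared,
not the literal code; the two helpers stand in for the inner loop that scans
the stack words):

    /* skip the stack tracer's own entries in the saved trace */
    for (i = 0; i < max_stack_trace.nr_entries; i++)
            if (stack_dump_trace[i] == ip)
                    break;

    /* compact: keep only the entries actually found on the stack */
    x = 0;
    while (i < max_stack_trace.nr_entries) {
            if (found_on_stack(stack_dump_trace[i])) {      /* hypothetical helper */
                    stack_dump_trace[x] = stack_dump_trace[i];
                    stack_dump_index[x++] = depth_of(stack_dump_trace[i]); /* hypothetical helper */
            }
            i++;
    }

    max_stack_trace.nr_entries = x;     /* only real entries remain */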

To calc "stack_max_size" correctly, we should change the line:
     if (unlikely(tracer_frame) && i == 1) {
to
     if (unlikely(tracer_frame)) {

With these two changes applied, the issues Jungseok mentioned will be
fixed.

Thanks,
-Takahiro AKASHI

>   	/*
>   	 * Now find where in the stack these are.
>   	 */
> -	i = 0;
> +	x = 0;
>   	start = stack;
>   	top = (unsigned long *)
>   		(((unsigned long)start & ~(THREAD_SIZE-1)) + THREAD_SIZE);
> @@ -139,12 +127,13 @@ check_stack(unsigned long ip, unsigned long *stack)
>   	while (i < max_stack_trace.nr_entries) {
>   		int found = 0;
>
> -		stack_dump_index[i] = this_size;
> +		stack_dump_index[x] = this_size;
>   		p = start;
>
>   		for (; p < top && i < max_stack_trace.nr_entries; p++) {
>   			if (*p == stack_dump_trace[i]) {
> -				this_size = stack_dump_index[i++] =
> +				stack_dump_trace[x] = stack_dump_trace[i++];
> +				this_size = stack_dump_index[x++] =
>   					(top - p) * sizeof(unsigned long);
>   				found = 1;
>   				/* Start the search from here */
> @@ -168,6 +157,10 @@ check_stack(unsigned long ip, unsigned long *stack)
>   			i++;
>   	}
>
> +	max_stack_trace.nr_entries = x;
> +	for (; x < i; x++)
> +		stack_dump_trace[x] = ULONG_MAX;
> +
>   	if (task_stack_end_corrupted(current)) {
>   		print_max_stack();
>   		BUG();
> @@ -192,24 +185,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
>   	if (per_cpu(trace_active, cpu)++ != 0)
>   		goto out;
>
> -	/*
> -	 * When fentry is used, the traced function does not get
> -	 * its stack frame set up, and we lose the parent.
> -	 * The ip is pretty useless because the function tracer
> -	 * was called before that function set up its stack frame.
> -	 * In this case, we use the parent ip.
> -	 *
> -	 * By adding the return address of either the parent ip
> -	 * or the current ip we can disregard most of the stack usage
> -	 * caused by the stack tracer itself.
> -	 *
> -	 * The function tracer always reports the address of where the
> -	 * mcount call was, but the stack will hold the return address.
> -	 */
> -	if (fentry)
> -		ip = parent_ip;
> -	else
> -		ip += MCOUNT_INSN_SIZE;
> +	ip += MCOUNT_INSN_SIZE;
>
>   	check_stack(ip, &stack);
>
> @@ -284,7 +260,7 @@ __next(struct seq_file *m, loff_t *pos)
>   {
>   	long n = *pos - 1;
>
> -	if (n >= max_stack_trace.nr_entries || stack_dump_trace[n] == ULONG_MAX)
> +	if (n > max_stack_trace.nr_entries || stack_dump_trace[n] == ULONG_MAX)
>   		return NULL;
>
>   	m->private = (void *)n;
> @@ -354,7 +330,7 @@ static int t_show(struct seq_file *m, void *v)
>   		seq_printf(m, "        Depth    Size   Location"
>   			   "    (%d entries)\n"
>   			   "        -----    ----   --------\n",
> -			   max_stack_trace.nr_entries - 1);
> +			   max_stack_trace.nr_entries);
>
>   		if (!stack_tracer_enabled && !max_stack_size)
>   			print_disabled(m);
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-17  2:49                                 ` AKASHI Takahiro
@ 2015-07-17  3:21                                   ` Steven Rostedt
  0 siblings, 0 replies; 53+ messages in thread
From: Steven Rostedt @ 2015-07-17  3:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 17 Jul 2015 11:49:52 +0900
AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:

> > -
> > -	if (using_ftrace_ops_list_func())
> > -		max_stack_trace.skip = 4;
> > -	else
> > -		max_stack_trace.skip = 3;
> > +	max_stack_trace.skip = 3;
> 
> I don't think this last line is necessary because we will skip all
> the functions anyway below:

I put this back more as an optimization as it is already known that it
takes at least three calls to get to this point. Unless of course gcc
decides to inline them. But currently I don't see that.

> 
> >   	save_stack_trace(&max_stack_trace);
> >
> > -	/*
> > -	 * Add the passed in ip from the function tracer.
> > -	 * Searching for this on the stack will skip over
> > -	 * most of the overhead from the stack tracer itself.
> > -	 */
> > -	stack_dump_trace[0] = ip;
> > -	max_stack_trace.nr_entries++;
> > +	/* Skip over the overhead of the stack tracer itself */
> > +	for (i = 0; i < max_stack_trace.nr_entries; i++) {
> > +		if (stack_dump_trace[i] == ip)
> > +			break;
> > +	}
> 
> here. Now "i" indicates the start point, excepting tracer functions,
> and "x" will eventually represent the exact number of functions
> that we are interested in after searching the stack.
> 
> To calc "stack_max_size" correctly, we should change the line:
>      if (unlikely(tracer_frame) && i == 1) {
> to
>      if (unlikely(tracer_frame)) {

Good catch, I'll fix that!

-- Steve

> 
> With these two changes applied, the issues Jungseok mentioned will be
> fixed.
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16  1:38                     ` Steven Rostedt
@ 2015-07-17 10:46                       ` Will Deacon
  0 siblings, 0 replies; 53+ messages in thread
From: Will Deacon @ 2015-07-17 10:46 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Steve,

On Thu, Jul 16, 2015 at 02:38:18AM +0100, Steven Rostedt wrote:
> On Thu, 16 Jul 2015 10:08:03 +0900
> AKASHI Takahiro <takahiro.akashi@linaro.org> wrote:
> 
> > On 07/16/2015 09:27 AM, AKASHI Takahiro wrote:
> > > On 07/16/2015 01:13 AM, Steven Rostedt wrote:
> > >> On Wed, 15 Jul 2015 10:55:36 -0400
> > >> Steven Rostedt <rostedt@goodmis.org> wrote:
> > >>
> > >>
> > >>> I'll take a look at it and try to clean up the code.
> > >>
> > >> Does the  following patch make sense for you?
> > >
> > > Looks nice. The patch greatly simplifies changes on arm64 side.
> > 
> > As follows:
> > 
> > - Takahiro AKASHI
> > 
> 
> How should we go forward with this. Want me to make my change to my
> tree, and pull this patch in with it? I can set up for a 4.3 release.
> 
> Then if I can get an Acked-by from one of the arm64 maintainers, I can
> have this go through my tree?

To be honest, I've completely lost track of this thread!

Once you and Akashi have settled on a way forward, I'm happy to take a look
at anything touching arch/arm64/, but it seems like you're still discussing
some aspects of the series atm.

Will

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-16 16:16                             ` Steven Rostedt
@ 2015-07-17 12:40                               ` Mark Rutland
  2015-07-17 12:51                                 ` Steven Rostedt
  2015-07-17 13:00                                 ` Steven Rostedt
  0 siblings, 2 replies; 53+ messages in thread
From: Mark Rutland @ 2015-07-17 12:40 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

> @@ -354,7 +330,7 @@ static int t_show(struct seq_file *m, void *v)
>  		seq_printf(m, "        Depth    Size   Location"
>  			   "    (%d entries)\n"
>  			   "        -----    ----   --------\n",
> -			   max_stack_trace.nr_entries - 1);
> +			   max_stack_trace.nr_entries);

I believe you'll want the same fix in print_max_stack, given it's a
carbon copy.

Mark

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-17 12:40                               ` Mark Rutland
@ 2015-07-17 12:51                                 ` Steven Rostedt
  2015-07-17 13:00                                 ` Steven Rostedt
  1 sibling, 0 replies; 53+ messages in thread
From: Steven Rostedt @ 2015-07-17 12:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 17 Jul 2015 13:40:54 +0100
Mark Rutland <mark.rutland@arm.com> wrote:

> Hi,
> 
> > @@ -354,7 +330,7 @@ static int t_show(struct seq_file *m, void *v)
> >  		seq_printf(m, "        Depth    Size   Location"
> >  			   "    (%d entries)\n"
> >  			   "        -----    ----   --------\n",
> > -			   max_stack_trace.nr_entries - 1);
> > +			   max_stack_trace.nr_entries);
> 
> I believe you'll want the same fix in print_max_stack, given it's a
> carbon copy.

Yep, thanks for pointing that out.

-- Steve

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-17 12:40                               ` Mark Rutland
  2015-07-17 12:51                                 ` Steven Rostedt
@ 2015-07-17 13:00                                 ` Steven Rostedt
  2015-07-17 14:28                                   ` Jungseok Lee
  1 sibling, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-17 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

Here's my latest version of the patch. I also added a fix for a case
where the recorded entries could end up off from the real number of
entries: the search loop now also stops on ULONG_MAX in
stack_dump_trace[i]. Otherwise, if for some reason nr_entries is one
off and points to ULONG_MAX, and a -1 happens to be on the stack, the
trace will include it in the count. The stack trace output tests
against both nr_entries and ULONG_MAX and stops on either, so the dump
and the count would end up different.

-- Steve

From 1c3697c3c4ce1f237466f76e40c91f66e2030bac Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
Date: Thu, 16 Jul 2015 13:24:54 -0400
Subject: [PATCH] tracing: Clean up stack tracing and fix fentry updates

Akashi Takahiro was porting the stack tracer to arm64 and found some
issues with it. One was that it repeats the top function, due to the
stack frame added by the mcount caller and added by itself. This
was added when fentry came in, and before fentry created its own stack
frame. But x86's fentry now creates its own stack frame, and there's
no need to insert the function again.

This also cleans up the code a bit, where it doesn't need to do something
special for fentry, and doesn't include insertion of a duplicate
entry for the called function being traced.

Link: http://lkml.kernel.org/r/55A646EE.6030402@linaro.org

Some-suggestions-by: Jungseok Lee <jungseoklee85@gmail.com>
Some-suggestions-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_stack.c | 68 ++++++++++++++++------------------------------
 1 file changed, 23 insertions(+), 45 deletions(-)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 3f34496244e9..b746399ab59c 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -18,12 +18,6 @@
 
 #define STACK_TRACE_ENTRIES 500
 
-#ifdef CC_USING_FENTRY
-# define fentry		1
-#else
-# define fentry		0
-#endif
-
 static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES+1] =
 	 { [0 ... (STACK_TRACE_ENTRIES)] = ULONG_MAX };
 static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
@@ -35,7 +29,7 @@ static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
  */
 static struct stack_trace max_stack_trace = {
 	.max_entries		= STACK_TRACE_ENTRIES - 1,
-	.entries		= &stack_dump_trace[1],
+	.entries		= &stack_dump_trace[0],
 };
 
 static unsigned long max_stack_size;
@@ -55,7 +49,7 @@ static inline void print_max_stack(void)
 
 	pr_emerg("        Depth    Size   Location    (%d entries)\n"
 			   "        -----    ----   --------\n",
-			   max_stack_trace.nr_entries - 1);
+			   max_stack_trace.nr_entries);
 
 	for (i = 0; i < max_stack_trace.nr_entries; i++) {
 		if (stack_dump_trace[i] == ULONG_MAX)
@@ -77,7 +71,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 	unsigned long this_size, flags; unsigned long *p, *top, *start;
 	static int tracer_frame;
 	int frame_size = ACCESS_ONCE(tracer_frame);
-	int i;
+	int i, x;
 
 	this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
 	this_size = THREAD_SIZE - this_size;
@@ -105,26 +99,20 @@ check_stack(unsigned long ip, unsigned long *stack)
 	max_stack_size = this_size;
 
 	max_stack_trace.nr_entries = 0;
-
-	if (using_ftrace_ops_list_func())
-		max_stack_trace.skip = 4;
-	else
-		max_stack_trace.skip = 3;
+	max_stack_trace.skip = 3;
 
 	save_stack_trace(&max_stack_trace);
 
-	/*
-	 * Add the passed in ip from the function tracer.
-	 * Searching for this on the stack will skip over
-	 * most of the overhead from the stack tracer itself.
-	 */
-	stack_dump_trace[0] = ip;
-	max_stack_trace.nr_entries++;
+	/* Skip over the overhead of the stack tracer itself */
+	for (i = 0; i < max_stack_trace.nr_entries; i++) {
+		if (stack_dump_trace[i] == ip)
+			break;
+	}
 
 	/*
 	 * Now find where in the stack these are.
 	 */
-	i = 0;
+	x = 0;
 	start = stack;
 	top = (unsigned long *)
 		(((unsigned long)start & ~(THREAD_SIZE-1)) + THREAD_SIZE);
@@ -139,12 +127,15 @@ check_stack(unsigned long ip, unsigned long *stack)
 	while (i < max_stack_trace.nr_entries) {
 		int found = 0;
 
-		stack_dump_index[i] = this_size;
+		stack_dump_index[x] = this_size;
 		p = start;
 
 		for (; p < top && i < max_stack_trace.nr_entries; p++) {
+			if (stack_dump_trace[i] == ULONG_MAX)
+				break;
 			if (*p == stack_dump_trace[i]) {
-				this_size = stack_dump_index[i++] =
+				stack_dump_trace[x] = stack_dump_trace[i++];
+				this_size = stack_dump_index[x++] =
 					(top - p) * sizeof(unsigned long);
 				found = 1;
 				/* Start the search from here */
@@ -156,7 +147,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 				 * out what that is, then figure it out
 				 * now.
 				 */
-				if (unlikely(!tracer_frame) && i == 1) {
+				if (unlikely(!tracer_frame)) {
 					tracer_frame = (p - stack) *
 						sizeof(unsigned long);
 					max_stack_size -= tracer_frame;
@@ -168,6 +159,10 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
+	max_stack_trace.nr_entries = x;
+	for (; x < i; x++)
+		stack_dump_trace[x] = ULONG_MAX;
+
 	if (task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
@@ -192,24 +187,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
 	if (per_cpu(trace_active, cpu)++ != 0)
 		goto out;
 
-	/*
-	 * When fentry is used, the traced function does not get
-	 * its stack frame set up, and we lose the parent.
-	 * The ip is pretty useless because the function tracer
-	 * was called before that function set up its stack frame.
-	 * In this case, we use the parent ip.
-	 *
-	 * By adding the return address of either the parent ip
-	 * or the current ip we can disregard most of the stack usage
-	 * caused by the stack tracer itself.
-	 *
-	 * The function tracer always reports the address of where the
-	 * mcount call was, but the stack will hold the return address.
-	 */
-	if (fentry)
-		ip = parent_ip;
-	else
-		ip += MCOUNT_INSN_SIZE;
+	ip += MCOUNT_INSN_SIZE;
 
 	check_stack(ip, &stack);
 
@@ -284,7 +262,7 @@ __next(struct seq_file *m, loff_t *pos)
 {
 	long n = *pos - 1;
 
-	if (n >= max_stack_trace.nr_entries || stack_dump_trace[n] == ULONG_MAX)
+	if (n > max_stack_trace.nr_entries || stack_dump_trace[n] == ULONG_MAX)
 		return NULL;
 
 	m->private = (void *)n;
@@ -354,7 +332,7 @@ static int t_show(struct seq_file *m, void *v)
 		seq_printf(m, "        Depth    Size   Location"
 			   "    (%d entries)\n"
 			   "        -----    ----   --------\n",
-			   max_stack_trace.nr_entries - 1);
+			   max_stack_trace.nr_entries);
 
 		if (!stack_tracer_enabled && !max_stack_size)
 			print_disabled(m);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-17 13:00                                 ` Steven Rostedt
@ 2015-07-17 14:28                                   ` Jungseok Lee
  2015-07-17 14:41                                     ` Steven Rostedt
  0 siblings, 1 reply; 53+ messages in thread
From: Jungseok Lee @ 2015-07-17 14:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Jul 17, 2015, at 10:00 PM, Steven Rostedt wrote:

Hi Steve,

> Here's my latest version of the patch. I also added a fix for a case
> where the recorded entries could end up off from the real number of
> entries: the search loop now also stops on ULONG_MAX in
> stack_dump_trace[i]. Otherwise, if for some reason nr_entries is one
> off and points to ULONG_MAX, and a -1 happens to be on the stack, the
> trace will include it in the count. The stack trace output tests
> against both nr_entries and ULONG_MAX and stops on either, so the dump
> and the count would end up different.
> 
> -- Steve
> 
> From 1c3697c3c4ce1f237466f76e40c91f66e2030bac Mon Sep 17 00:00:00 2001
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> Date: Thu, 16 Jul 2015 13:24:54 -0400
> Subject: [PATCH] tracing: Clean up stack tracing and fix fentry updates
> 
> Akashi Takahiro was porting the stack tracer to arm64 and found some
> issues with it. One was that it repeats the top function, due to the
> stack frame added by the mcount caller and added by itself. This
> was added when fentry came in, and before fentry created its own stack
> frame. But x86's fentry now creates its own stack frame, and there's
> no need to insert the function again.
> 
> This also cleans up the code a bit, where it doesn't need to do something
> special for fentry, and doesn't include insertion of a duplicate
> entry for the called function being traced.
> 
> Link: http://lkml.kernel.org/r/55A646EE.6030402@linaro.org
> 
> Some-suggestions-by: Jungseok Lee <jungseoklee85@gmail.com>
> Some-suggestions-by: Mark Rutland <mark.rutland@arm.com>
> Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
> kernel/trace/trace_stack.c | 68 ++++++++++++++++------------------------------
> 1 file changed, 23 insertions(+), 45 deletions(-)
> 
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 3f34496244e9..b746399ab59c 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -18,12 +18,6 @@
> 
> #define STACK_TRACE_ENTRIES 500
> 
> -#ifdef CC_USING_FENTRY
> -# define fentry		1
> -#else
> -# define fentry		0
> -#endif
> -
> static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES+1] =
> 	 { [0 ... (STACK_TRACE_ENTRIES)] = ULONG_MAX };
> static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
> @@ -35,7 +29,7 @@ static unsigned stack_dump_index[STACK_TRACE_ENTRIES];
>  */
> static struct stack_trace max_stack_trace = {
> 	.max_entries		= STACK_TRACE_ENTRIES - 1,
> -	.entries		= &stack_dump_trace[1],
> +	.entries		= &stack_dump_trace[0],
> };
> 
> static unsigned long max_stack_size;
> @@ -55,7 +49,7 @@ static inline void print_max_stack(void)
> 
> 	pr_emerg("        Depth    Size   Location    (%d entries)\n"
> 			   "        -----    ----   --------\n",
> -			   max_stack_trace.nr_entries - 1);
> +			   max_stack_trace.nr_entries);
> 
> 	for (i = 0; i < max_stack_trace.nr_entries; i++) {
> 		if (stack_dump_trace[i] == ULONG_MAX)
> @@ -77,7 +71,7 @@ check_stack(unsigned long ip, unsigned long *stack)
> 	unsigned long this_size, flags; unsigned long *p, *top, *start;
> 	static int tracer_frame;
> 	int frame_size = ACCESS_ONCE(tracer_frame);
> -	int i;
> +	int i, x;
> 
> 	this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
> 	this_size = THREAD_SIZE - this_size;
> @@ -105,26 +99,20 @@ check_stack(unsigned long ip, unsigned long *stack)
> 	max_stack_size = this_size;
> 
> 	max_stack_trace.nr_entries = 0;
> -
> -	if (using_ftrace_ops_list_func())
> -		max_stack_trace.skip = 4;
> -	else
> -		max_stack_trace.skip = 3;
> +	max_stack_trace.skip = 3;
> 
> 	save_stack_trace(&max_stack_trace);
> 
> -	/*
> -	 * Add the passed in ip from the function tracer.
> -	 * Searching for this on the stack will skip over
> -	 * most of the overhead from the stack tracer itself.
> -	 */
> -	stack_dump_trace[0] = ip;
> -	max_stack_trace.nr_entries++;
> +	/* Skip over the overhead of the stack tracer itself */
> +	for (i = 0; i < max_stack_trace.nr_entries; i++) {
> +		if (stack_dump_trace[i] == ip)
> +			break;
> +	}
> 
> 	/*
> 	 * Now find where in the stack these are.
> 	 */
> -	i = 0;
> +	x = 0;
> 	start = stack;
> 	top = (unsigned long *)
> 		(((unsigned long)start & ~(THREAD_SIZE-1)) + THREAD_SIZE);
> @@ -139,12 +127,15 @@ check_stack(unsigned long ip, unsigned long *stack)
> 	while (i < max_stack_trace.nr_entries) {
> 		int found = 0;
> 
> -		stack_dump_index[i] = this_size;
> +		stack_dump_index[x] = this_size;
> 		p = start;
> 
> 		for (; p < top && i < max_stack_trace.nr_entries; p++) {
> +			if (stack_dump_trace[i] == ULONG_MAX)
> +				break;
> 			if (*p == stack_dump_trace[i]) {
> -				this_size = stack_dump_index[i++] =
> +				stack_dump_trace[x] = stack_dump_trace[i++];
> +				this_size = stack_dump_index[x++] =
> 					(top - p) * sizeof(unsigned long);
> 				found = 1;
> 				/* Start the search from here */
> @@ -156,7 +147,7 @@ check_stack(unsigned long ip, unsigned long *stack)
> 				 * out what that is, then figure it out
> 				 * now.
> 				 */
> -				if (unlikely(!tracer_frame) && i == 1) {
> +				if (unlikely(!tracer_frame)) {
> 					tracer_frame = (p - stack) *
> 						sizeof(unsigned long);
> 					max_stack_size -= tracer_frame;
> @@ -168,6 +159,10 @@ check_stack(unsigned long ip, unsigned long *stack)
> 			i++;
> 	}
> 
> +	max_stack_trace.nr_entries = x;
> +	for (; x < i; x++)
> +		stack_dump_trace[x] = ULONG_MAX;
> +
> 	if (task_stack_end_corrupted(current)) {
> 		print_max_stack();
> 		BUG();
> @@ -192,24 +187,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
> 	if (per_cpu(trace_active, cpu)++ != 0)
> 		goto out;
> 
> -	/*
> -	 * When fentry is used, the traced function does not get
> -	 * its stack frame set up, and we lose the parent.
> -	 * The ip is pretty useless because the function tracer
> -	 * was called before that function set up its stack frame.
> -	 * In this case, we use the parent ip.
> -	 *
> -	 * By adding the return address of either the parent ip
> -	 * or the current ip we can disregard most of the stack usage
> -	 * caused by the stack tracer itself.
> -	 *
> -	 * The function tracer always reports the address of where the
> -	 * mcount call was, but the stack will hold the return address.
> -	 */
> -	if (fentry)
> -		ip = parent_ip;
> -	else
> -		ip += MCOUNT_INSN_SIZE;
> +	ip += MCOUNT_INSN_SIZE;
> 
> 	check_stack(ip, &stack);
> 
> @@ -284,7 +262,7 @@ __next(struct seq_file *m, loff_t *pos)
> {
> 	long n = *pos - 1;
> 
> -	if (n >= max_stack_trace.nr_entries || stack_dump_trace[n] == ULONG_MAX)
> +	if (n > max_stack_trace.nr_entries || stack_dump_trace[n] == ULONG_MAX)
> 		return NULL;
> 
> 	m->private = (void *)n;
> @@ -354,7 +332,7 @@ static int t_show(struct seq_file *m, void *v)
> 		seq_printf(m, "        Depth    Size   Location"
> 			   "    (%d entries)\n"
> 			   "        -----    ----   --------\n",
> -			   max_stack_trace.nr_entries - 1);
> +			   max_stack_trace.nr_entries);
> 
> 		if (!stack_tracer_enabled && !max_stack_size)
> 			print_disabled(m);
> -- 


I have reviewed and tested the kernel including this patch and only [RFC 1/3].

Now, the number of entries and max_stack_size are always okay, but unexpected functions,
such as ftrace_ops_no_ops and ftrace_call, are *sometimes* listed as follows.

$ cat /sys/kernel/debug/tracing/stack_trace 

        Depth    Size   Location    (49 entries)
        -----    ----   --------
  0)     4456      16   arch_counter_read+0xc/0x24
  1)     4440      16   ktime_get+0x44/0xb4
  2)     4424      48   get_drm_timestamp+0x30/0x40
  3)     4376      16   drm_get_last_vbltimestamp+0x94/0xb4
  4)     4360      80   drm_handle_vblank+0x84/0x3c0
  5)     4280     144   mdp5_irq+0x118/0x130
  6)     4136      80   msm_irq+0x2c/0x68
  7)     4056      32   handle_irq_event_percpu+0x60/0x210
  8)     4024      96   handle_irq_event+0x50/0x80
  9)     3928      64   handle_fasteoi_irq+0xb0/0x178
 10)     3864      48   generic_handle_irq+0x38/0x54
 11)     3816      32   __handle_domain_irq+0x68/0xbc
 12)     3784      64   gic_handle_irq+0x38/0x88
 13)     3720     280   el1_irq+0x64/0xd8
 14)     3440     168   ftrace_ops_no_ops+0xb4/0x16c
 15)     3272      64   ftrace_call+0x0/0x4
 16)     3208      16   _raw_spin_lock_irqsave+0x14/0x70
 17)     3192      32   msm_gpio_set+0x44/0xb4
 18)     3160      48   _gpiod_set_raw_value+0x68/0x148
 19)     3112      64   gpiod_set_value+0x40/0x70
 20)     3048      32   gpio_led_set+0x3c/0x94
 21)     3016      48   led_set_brightness+0x50/0xa4
 22)     2968      32   led_trigger_event+0x4c/0x78
 23)     2936      48   mmc_request_done+0x38/0x84
 24)     2888      32   sdhci_tasklet_finish+0xcc/0x12c
 25)     2856      48   tasklet_action+0x64/0x120
 26)     2808      48   __do_softirq+0x114/0x2f0
 27)     2760     128   irq_exit+0x98/0xd8
 28)     2632      32   __handle_domain_irq+0x6c/0xbc
 29)     2600      64   gic_handle_irq+0x38/0x88
 30)     2536     280   el1_irq+0x64/0xd8
 31)     2256     168   ftrace_ops_no_ops+0xb4/0x16c
 32)     2088      64   ftrace_call+0x0/0x4
 33)     2024      16   __schedule+0x1c/0x748
 34)     2008      80   schedule+0x38/0x94
 35)     1928      32   schedule_timeout+0x1a8/0x200
 36)     1896     128   wait_for_common+0xa8/0x150
 37)     1768     128   wait_for_completion+0x24/0x34
 38)     1640      32   mmc_wait_for_req_done+0x3c/0x104
 39)     1608      64   mmc_wait_for_cmd+0x68/0x94
 40)     1544     128   get_card_status.isra.25+0x6c/0x88
 41)     1416     112   card_busy_detect.isra.31+0x7c/0x154
 42)     1304     128   mmc_blk_err_check+0xd0/0x4f8
 43)     1176     192   mmc_start_req+0xe4/0x3a8
 44)      984     160   mmc_blk_issue_rw_rq+0xc4/0x9c0
 45)      824     176   mmc_blk_issue_rq+0x19c/0x450
 46)      648     112   mmc_queue_thread+0x134/0x17c
 47)      536      80   kthread+0xe0/0xf8
 48)      456     456   ret_from_fork+0xc/0x50

$ cat /sys/kernel/debug/tracing/stack_max_size
4456

This issue might be related to arch code, not this change.
IMHO, this patch looks settled now.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-17  2:04                       ` AKASHI Takahiro
@ 2015-07-17 14:38                         ` Jungseok Lee
  0 siblings, 0 replies; 53+ messages in thread
From: Jungseok Lee @ 2015-07-17 14:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Jul 17, 2015, at 11:04 AM, AKASHI Takahiro wrote:
> Jungseok,
> 
> Thank you for your testing and reviews.

You're welcome.

> On 07/16/2015 10:29 PM, Jungseok Lee wrote:
>> On Jul 16, 2015, at 10:08 AM, AKASHI Takahiro wrote:
>> 
>> Hi, AKASHI
>> 
>>> On 07/16/2015 09:27 AM, AKASHI Takahiro wrote:
>>>> On 07/16/2015 01:13 AM, Steven Rostedt wrote:
>>>>> On Wed, 15 Jul 2015 10:55:36 -0400
>>>>> Steven Rostedt <rostedt@goodmis.org> wrote:
>>>>> 
>>>>> 
>>>>>> I'll take a look at it and try to clean up the code.
>>>>> 
>>>>> Does the  following patch make sense for you?
>>>> 
>>>> Looks nice. The patch greatly simplifies changes on arm64 side.
>>> 
>>> As follows:
>>> 
>>> - Takahiro AKASHI
>>> 
>>> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
>>> index c5534fa..868d6f1 100644
>>> --- a/arch/arm64/include/asm/ftrace.h
>>> +++ b/arch/arm64/include/asm/ftrace.h
>>> @@ -15,6 +15,7 @@
>>> 
>>> #define MCOUNT_ADDR		((unsigned long)_mcount)
>>> #define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
>>> +#define FTRACE_STACK_FRAME_OFFSET 4 /* sync it up with stacktrace.c */
> 
> Well,
>  #define FTRACE_STACK_FRAME_OFFSET AARCH64_INSN_SIZE
> might be better.

Agree.

> 
>> How about binding it to -4 in unwind_frame function?
> 
> Do you mean like this?
> In unwind_frame(),
>  frame->pc = *(unsigned long*)(fp + 8) - AARCH64_INSN_SIZE;

Exactly.

> 
>> IMHO, it would help other developers trying to change stack trace code
>> be aware of this stack tracer feature.
>> 
>>> #ifndef __ASSEMBLY__
>>> #include <linux/compat.h>
>>> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
>>> index 1da6029..2c1bf7d 100644
>>> --- a/include/linux/ftrace.h
>>> +++ b/include/linux/ftrace.h
>>> @@ -260,6 +260,13 @@ static inline void ftrace_kill(void) { }
>>> #endif /* CONFIG_FUNCTION_TRACER */
>>> 
>>> #ifdef CONFIG_STACK_TRACER
>>> +/*
>>> + * the offset value to add to return address from save_stack_trace()
>>> + */
>>> +#ifndef FTRACE_STACK_FRAME_OFFSET
>>> +#define FTRACE_STACK_FRAME_OFFSET 0
>>> +#endif
>>> +
>>> extern int stack_tracer_enabled;
>>> int
>>> stack_trace_sysctl(struct ctl_table *table, int write,
>>> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
>>> index 9384647..c5b9748 100644
>>> --- a/kernel/trace/trace_stack.c
>>> +++ b/kernel/trace/trace_stack.c
>>> @@ -105,7 +105,7 @@ check_stack(unsigned long ip, unsigned long *stack)
>>> 
>>> 	/* Skip over the overhead of the stack tracer itself */
>>> 	for (i = 0; i < max_stack_trace.nr_entries; i++) {
>>> -		if (stack_dump_trace[i] == ip)
>>> +		if ((stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET) == ip)
>>> 			break;
>>> 	}
>>> 
>>> @@ -131,7 +131,8 @@ check_stack(unsigned long ip, unsigned long *stack)
>>> 		p = start;
>>> 
>>> 		for (; p < top && i < max_stack_trace.nr_entries; p++) {
>>> -			if (*p == stack_dump_trace[i]) {
>>> +			if (*p == (stack_dump_trace[i]
>>> +					+ FTRACE_STACK_FRAME_OFFSET)) {
>>> 				stack_dump_trace[x] = stack_dump_trace[i++];
>>> 				this_size = stack_dump_index[x++] =
>>> 					(top - p) * sizeof(unsigned long);
>>> --
>> 
>> I've prepared a kernel with the following patches and reviewed them.
>> 
>> 1) Steve's clean up patch
>> 2) This patch
>> 3) [RFC 2/3]
>> 
>> AFAIU, [RFC 3/3] is not needed any more thanks to Steve's patch.
> 
> We don't need [2/3].

Okay. So, I've played with a kernel including Steve's latest patch and only [1/3].

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-17 14:28                                   ` Jungseok Lee
@ 2015-07-17 14:41                                     ` Steven Rostedt
  2015-07-17 14:59                                       ` Jungseok Lee
  0 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-17 14:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 17 Jul 2015 23:28:13 +0900
Jungseok Lee <jungseoklee85@gmail.com> wrote:


> 
> I have reviewed and tested the kernel including this patch and only [RFC 1/3].

Thanks! Can you repost patch 1 with the changes I recommended, so that
I can get an Acked-by from the arm64 maintainers and pull all the
changes in together. This is fine for a 4.3 release, right? That is, it
doesn't need to go into 4.2-rcs.

> 
> Now, the number of entries and max_stack_size are always okay, but unexpected functions,
> such as ftrace_ops_no_ops and ftrace_call, are *sometimes* listed as follows.
> 
> $ cat /sys/kernel/debug/tracing/stack_trace 
> 
>         Depth    Size   Location    (49 entries)
>         -----    ----   --------
>   0)     4456      16   arch_counter_read+0xc/0x24
>   1)     4440      16   ktime_get+0x44/0xb4
>   2)     4424      48   get_drm_timestamp+0x30/0x40
>   3)     4376      16   drm_get_last_vbltimestamp+0x94/0xb4
>   4)     4360      80   drm_handle_vblank+0x84/0x3c0
>   5)     4280     144   mdp5_irq+0x118/0x130
>   6)     4136      80   msm_irq+0x2c/0x68
>   7)     4056      32   handle_irq_event_percpu+0x60/0x210
>   8)     4024      96   handle_irq_event+0x50/0x80
>   9)     3928      64   handle_fasteoi_irq+0xb0/0x178
>  10)     3864      48   generic_handle_irq+0x38/0x54
>  11)     3816      32   __handle_domain_irq+0x68/0xbc
>  12)     3784      64   gic_handle_irq+0x38/0x88
>  13)     3720     280   el1_irq+0x64/0xd8
>  14)     3440     168   ftrace_ops_no_ops+0xb4/0x16c
>  15)     3272      64   ftrace_call+0x0/0x4
>  16)     3208      16   _raw_spin_lock_irqsave+0x14/0x70
>  17)     3192      32   msm_gpio_set+0x44/0xb4
>  18)     3160      48   _gpiod_set_raw_value+0x68/0x148
>  19)     3112      64   gpiod_set_value+0x40/0x70
>  20)     3048      32   gpio_led_set+0x3c/0x94
>  21)     3016      48   led_set_brightness+0x50/0xa4
>  22)     2968      32   led_trigger_event+0x4c/0x78
>  23)     2936      48   mmc_request_done+0x38/0x84
>  24)     2888      32   sdhci_tasklet_finish+0xcc/0x12c
>  25)     2856      48   tasklet_action+0x64/0x120
>  26)     2808      48   __do_softirq+0x114/0x2f0
>  27)     2760     128   irq_exit+0x98/0xd8
>  28)     2632      32   __handle_domain_irq+0x6c/0xbc
>  29)     2600      64   gic_handle_irq+0x38/0x88
>  30)     2536     280   el1_irq+0x64/0xd8
>  31)     2256     168   ftrace_ops_no_ops+0xb4/0x16c
>  32)     2088      64   ftrace_call+0x0/0x4

Like I stated before, the above looks to be an interrupt coming in
while the tracing was happening. This looks legitimate to me. I'm
guessing that arm64 uses one stack for normal context and interrupt
context, whereas x86 uses a separate stack for interrupt context.

-- Steve


>  33)     2024      16   __schedule+0x1c/0x748
>  34)     2008      80   schedule+0x38/0x94
>  35)     1928      32   schedule_timeout+0x1a8/0x200
>  36)     1896     128   wait_for_common+0xa8/0x150
>  37)     1768     128   wait_for_completion+0x24/0x34
>  38)     1640      32   mmc_wait_for_req_done+0x3c/0x104
>  39)     1608      64   mmc_wait_for_cmd+0x68/0x94
>  40)     1544     128   get_card_status.isra.25+0x6c/0x88
>  41)     1416     112   card_busy_detect.isra.31+0x7c/0x154
>  42)     1304     128   mmc_blk_err_check+0xd0/0x4f8
>  43)     1176     192   mmc_start_req+0xe4/0x3a8
>  44)      984     160   mmc_blk_issue_rw_rq+0xc4/0x9c0
>  45)      824     176   mmc_blk_issue_rq+0x19c/0x450
>  46)      648     112   mmc_queue_thread+0x134/0x17c
>  47)      536      80   kthread+0xe0/0xf8
>  48)      456     456   ret_from_fork+0xc/0x50
> 
> $ cat /sys/kernel/debug/tracing/stack_max_size
> 4456
> 
> This issue might be related to arch code, not this change.
> IMHO, this patch look settled now.
> 
> Best Regards
> Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-17 14:41                                     ` Steven Rostedt
@ 2015-07-17 14:59                                       ` Jungseok Lee
  2015-07-17 15:34                                         ` Jungseok Lee
  0 siblings, 1 reply; 53+ messages in thread
From: Jungseok Lee @ 2015-07-17 14:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Jul 17, 2015, at 11:41 PM, Steven Rostedt wrote:
> On Fri, 17 Jul 2015 23:28:13 +0900
> Jungseok Lee <jungseoklee85@gmail.com> wrote:
> 
> 
>> 
>> I have reviewed and tested the kernel including this patch and only [RFC 1/3].
> 
> Thanks! Can you repost patch 1 with the changes I recommended, so that
> I can get an Acked-by from the arm64 maintainers and pull all the
> changes in together. This is fine for a 4.3 release, right? That is, it
> doesn't need to go into 4.2-rcs.
> 

Welcome!

It's not hard to repost a patch, but I feel like we have to wait for Akashi's response.
Also, we might need to consider Mark's comment on the arch part.

If they are okay, I will proceed.

>> 
>> Now, the number of entries and max_stack_size are always okay, but unexpected functions,
>> such as ftrace_ops_no_ops and ftrace_call, are *sometimes* listed as follows.
>> 
>> $ cat /sys/kernel/debug/tracing/stack_trace 
>> 
>>        Depth    Size   Location    (49 entries)
>>        -----    ----   --------
>>  0)     4456      16   arch_counter_read+0xc/0x24
>>  1)     4440      16   ktime_get+0x44/0xb4
>>  2)     4424      48   get_drm_timestamp+0x30/0x40
>>  3)     4376      16   drm_get_last_vbltimestamp+0x94/0xb4
>>  4)     4360      80   drm_handle_vblank+0x84/0x3c0
>>  5)     4280     144   mdp5_irq+0x118/0x130
>>  6)     4136      80   msm_irq+0x2c/0x68
>>  7)     4056      32   handle_irq_event_percpu+0x60/0x210
>>  8)     4024      96   handle_irq_event+0x50/0x80
>>  9)     3928      64   handle_fasteoi_irq+0xb0/0x178
>> 10)     3864      48   generic_handle_irq+0x38/0x54
>> 11)     3816      32   __handle_domain_irq+0x68/0xbc
>> 12)     3784      64   gic_handle_irq+0x38/0x88
>> 13)     3720     280   el1_irq+0x64/0xd8
>> 14)     3440     168   ftrace_ops_no_ops+0xb4/0x16c
>> 15)     3272      64   ftrace_call+0x0/0x4
>> 16)     3208      16   _raw_spin_lock_irqsave+0x14/0x70
>> 17)     3192      32   msm_gpio_set+0x44/0xb4
>> 18)     3160      48   _gpiod_set_raw_value+0x68/0x148
>> 19)     3112      64   gpiod_set_value+0x40/0x70
>> 20)     3048      32   gpio_led_set+0x3c/0x94
>> 21)     3016      48   led_set_brightness+0x50/0xa4
>> 22)     2968      32   led_trigger_event+0x4c/0x78
>> 23)     2936      48   mmc_request_done+0x38/0x84
>> 24)     2888      32   sdhci_tasklet_finish+0xcc/0x12c
>> 25)     2856      48   tasklet_action+0x64/0x120
>> 26)     2808      48   __do_softirq+0x114/0x2f0
>> 27)     2760     128   irq_exit+0x98/0xd8
>> 28)     2632      32   __handle_domain_irq+0x6c/0xbc
>> 29)     2600      64   gic_handle_irq+0x38/0x88
>> 30)     2536     280   el1_irq+0x64/0xd8
>> 31)     2256     168   ftrace_ops_no_ops+0xb4/0x16c
>> 32)     2088      64   ftrace_call+0x0/0x4
> 
> Like I stated before, the above looks to be an interrupt coming in
> while the tracing was happening. This looks legitimate to me. I'm
> guessing that arm64 uses one stack for normal context and interrupt
> context, whereas x86 uses a separate stack for interrupt context.

Okay. Thanks for a kind explanation.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-17 14:59                                       ` Jungseok Lee
@ 2015-07-17 15:34                                         ` Jungseok Lee
  2015-07-17 16:01                                           ` Steven Rostedt
  2015-07-20 16:20                                           ` Will Deacon
  0 siblings, 2 replies; 53+ messages in thread
From: Jungseok Lee @ 2015-07-17 15:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Jul 17, 2015, at 11:59 PM, Jungseok Lee wrote:
> On Jul 17, 2015, at 11:41 PM, Steven Rostedt wrote:
>> On Fri, 17 Jul 2015 23:28:13 +0900
>> Jungseok Lee <jungseoklee85@gmail.com> wrote:
>> 
>> 
>>> 
>>> I have reviewed and tested the kernel including this patch and only [RFC 1/3].
>> 
>> Thanks! Can you repost patch 1 with the changes I recommended, so that
>> I can get an Acked-by from the arm64 maintainers and pull all the
>> changes in together. This is fine for a 4.3 release, right? That is, it
>> doesn't need to go into 4.2-rcs.
>> 
> 
> Welcome!
> 
> It's not hard to repost a patch, but I feel like we have to wait for Akashi's response.
> Also, it might be needed to consider Mark's comment on arch part.
> 
> If they are okay, I will proceed.

Akashi and Mark,

The [RFC 1/3] patch used in my environment is shaped as follows.
I leave the hunk here *only* for clear synchronization. This is why I chose this format
instead of reposting the patch. I hope it helps to keep track of this thread.

Steve,

I think this is my best at this point.

Best Regards
Jungseok Lee

----8<----
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index c5534fa..2b43e20 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -13,8 +13,9 @@
 
 #include <asm/insn.h>
 
-#define MCOUNT_ADDR		((unsigned long)_mcount)
-#define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
+#define MCOUNT_ADDR			((unsigned long)_mcount)
+#define MCOUNT_INSN_SIZE		AARCH64_INSN_SIZE
+#define FTRACE_STACK_FRAME_OFFSET	AARCH64_INSN_SIZE
 
 #ifndef __ASSEMBLY__
 #include <linux/compat.h>
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 407991b..9ab67af 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -20,6 +20,7 @@
 #include <linux/sched.h>
 #include <linux/stacktrace.h>
 
+#include <asm/insn.h>
 #include <asm/stacktrace.h>
 
 /*
@@ -52,7 +53,7 @@ int notrace unwind_frame(struct stackframe *frame)
 	 * -4 here because we care about the PC at time of bl,
 	 * not where the return will go.
 	 */
-	frame->pc = *(unsigned long *)(fp + 8) - 4;
+	frame->pc = *(unsigned long *)(fp + 8) - AARCH64_INSN_SIZE;
 
 	return 0;
 }
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 1da6029..6566201 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -260,6 +260,9 @@ static inline void ftrace_kill(void) { }
 #endif /* CONFIG_FUNCTION_TRACER */
 
 #ifdef CONFIG_STACK_TRACER
+#ifndef FTRACE_STACK_FRAME_OFFSET
+#define FTRACE_STACK_FRAME_OFFSET 0
+#endif
 extern int stack_tracer_enabled;
 int
 stack_trace_sysctl(struct ctl_table *table, int write,
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index b746399..30521ea 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -105,7 +105,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 
 	/* Skip over the overhead of the stack tracer itself */
 	for (i = 0; i < max_stack_trace.nr_entries; i++) {
-		if (stack_dump_trace[i] == ip)
+		if ((stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET) == ip)
 			break;
 	}
 
@@ -133,7 +133,8 @@ check_stack(unsigned long ip, unsigned long *stack)
 		for (; p < top && i < max_stack_trace.nr_entries; p++) {
 			if (stack_dump_trace[i] == ULONG_MAX)
 				break;
-			if (*p == stack_dump_trace[i]) {
+			if (*p == (stack_dump_trace[i]
+					+ FTRACE_STACK_FRAME_OFFSET)) {
 				stack_dump_trace[x] = stack_dump_trace[i++];
 				this_size = stack_dump_index[x++] =
 					(top - p) * sizeof(unsigned long);
----8<----
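
To make the intent of FTRACE_STACK_FRAME_OFFSET concrete, here is an
illustration only (it is not part of the patch above; "fp" stands for the
frame pointer as used in unwind_frame() and "p" for the scan cursor in
check_stack()):

   /* What unwind_frame() sees for one frame record: */
   unsigned long saved_lr = *(unsigned long *)(fp + 8);    /* address of the bl + 4 */
   unsigned long trace_pc = saved_lr - AARCH64_INSN_SIZE;  /* what gets recorded */

   /*
    * check_stack() scans raw stack words, which hold saved_lr, so it has
    * to add the offset back before comparing:
    */
   if (*p == trace_pc + FTRACE_STACK_FRAME_OFFSET)         /* i.e. *p == saved_lr */
           /* this word is the return address recorded for this frame */;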

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-17 15:34                                         ` Jungseok Lee
@ 2015-07-17 16:01                                           ` Steven Rostedt
  2015-07-20 16:20                                           ` Will Deacon
  1 sibling, 0 replies; 53+ messages in thread
From: Steven Rostedt @ 2015-07-17 16:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, 18 Jul 2015 00:34:21 +0900
Jungseok Lee <jungseoklee85@gmail.com> wrote:


> 
> The [RFC 1/3] patch used in my environment is shaped as follows.
> I leave the hunk for *only* clear synchronization. This is why I choose this format
> instead of reposting a patch. I hope it would help to track down this thread.

You could just add:

 Link: http://lkml.kernel.org/r/1436765375-7119-3-git-send-email-takahiro.akashi at linaro.org

to your change log, and that will point to this thread. That's what the
Link tag was made for.

> 
> Steve,
> 
> I think this is my best at this point.
> 

OK, I'll just wait for the others to give their ack.

-- Steve

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-17 15:34                                         ` Jungseok Lee
  2015-07-17 16:01                                           ` Steven Rostedt
@ 2015-07-20 16:20                                           ` Will Deacon
  2015-07-20 23:53                                             ` AKASHI Takahiro
  2015-08-03  9:09                                             ` Will Deacon
  1 sibling, 2 replies; 53+ messages in thread
From: Will Deacon @ 2015-07-20 16:20 UTC (permalink / raw)
  To: linux-arm-kernel

Hi all,

On Fri, Jul 17, 2015 at 04:34:21PM +0100, Jungseok Lee wrote:
> On Jul 17, 2015, at 11:59 PM, Jungseok Lee wrote:
> > On Jul 17, 2015, at 11:41 PM, Steven Rostedt wrote:
> >> Thanks! Can you repost patch 1 with the changes I recommended, so that
> >> I can get an Acked-by from the arm64 maintainers and pull all the
> >> changes in together. This is fine for a 4.3 release, right? That is, it
> >> doesn't need to go into 4.2-rcs.
> >> 
> > 
> > It's not hard to repost a patch, but I feel like we have to wait for Akashi's response.
> > Also, it might be needed to consider Mark's comment on arch part.
> > 
> > If they are okay, I will proceed.
> 
> The [RFC 1/3] patch used in my environment is shaped as follows.
> I leave the hunk for *only* clear synchronization. This is why I choose this format
> instead of reposting a patch. I hope it would help to track down this thread.
> 
> I think this is my best at this point.
> 
> ----8<----
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index c5534fa..2b43e20 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -13,8 +13,9 @@
>  
>  #include <asm/insn.h>
>  
> -#define MCOUNT_ADDR		((unsigned long)_mcount)
> -#define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
> +#define MCOUNT_ADDR			((unsigned long)_mcount)
> +#define MCOUNT_INSN_SIZE		AARCH64_INSN_SIZE
> +#define FTRACE_STACK_FRAME_OFFSET	AARCH64_INSN_SIZE
>  
>  #ifndef __ASSEMBLY__
>  #include <linux/compat.h>
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index 407991b..9ab67af 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -20,6 +20,7 @@
>  #include <linux/sched.h>
>  #include <linux/stacktrace.h>
>  
> +#include <asm/insn.h>
>  #include <asm/stacktrace.h>
>  
>  /*
> @@ -52,7 +53,7 @@ int notrace unwind_frame(struct stackframe *frame)
>  	 * -4 here because we care about the PC at time of bl,
>  	 * not where the return will go.
>  	 */
> -	frame->pc = *(unsigned long *)(fp + 8) - 4;
> +	frame->pc = *(unsigned long *)(fp + 8) - AARCH64_INSN_SIZE;
>  
>  	return 0;
>  }

The arm64 bits look fine to me:

  Acked-by: Will Deacon <will.deacon@arm.com>

Steve: feel free to take this along with the other ftrace changes. I don't
anticipate any conflicts, but if anything crops up in -next we can sort
it out then.

Thanks!

Will

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-20 16:20                                           ` Will Deacon
@ 2015-07-20 23:53                                             ` AKASHI Takahiro
  2015-07-21 10:26                                               ` AKASHI Takahiro
  2015-08-03  9:09                                             ` Will Deacon
  1 sibling, 1 reply; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-20 23:53 UTC (permalink / raw)
  To: linux-arm-kernel

Hi

So I don't have to repost my patch here. Please use the original
commit log message [1/3] as is.
But please keep in mind that there is still an issue that I mentioned
in the cover letter: the 'Size' field is inaccurate.
  <reported size> = <its own dynamic local variables>
                         + <child's local variables>
and
  <real size> = <reported size> + <its local variables>
                                - <child's local variables>
where "dynamic" means, for example, a variable allocated like the one below:
   int foo(int num) {
     int array[num];
     ...
   }
(See more details in my ascii art.)

Such usage is seldom seen in the kernel, so <reported size> is
likely equal to <child's local variables>. In other words, we will
see a one-line *displacement* in most cases.

(We'd better mention it explicitly in the commit log?)
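
As a concrete, made-up example: if func-1 has 48 bytes of ordinary
(non-dynamic) locals and no VLAs, and its child func-2 has 32 bytes of
locals, then
   <reported size for func-1> = 0 + 32 = 32
   <real size for func-1>     = 32 + 48 - 32 = 48
i.e. func-1 gets charged with its child's 32 bytes instead of its own 48,
which is exactly the one-line displacement mentioned above.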

Thanks,
-Takahiro AKASHI


On 07/21/2015 01:20 AM, Will Deacon wrote:
> Hi all,
>
> On Fri, Jul 17, 2015 at 04:34:21PM +0100, Jungseok Lee wrote:
>> On Jul 17, 2015, at 11:59 PM, Jungseok Lee wrote:
>>> On Jul 17, 2015, at 11:41 PM, Steven Rostedt wrote:
>>>> Thanks! Can you repost patch 1 with the changes I recommended, so that
>>>> I can get an Acked-by from the arm64 maintainers and pull all the
>>>> changes in together. This is fine for a 4.3 release, right? That is, it
>>>> doesn't need to go into 4.2-rcs.
>>>>
>>>
>>> It's not hard to repost a patch, but I feel like we have to wait for Akashi's response.
>>> Also, it might be needed to consider Mark's comment on arch part.
>>>
>>> If they are okay, I will proceed.
>>
>> The [RFC 1/3] patch used in my environment is shaped as follows.
>> I leave the hunk for *only* clear synchronization. This is why I choose this format
>> instead of reposting a patch. I hope it would help to track down this thread.
>>
>> I think this is my best at this point.
>>
>> ----8<----
>> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
>> index c5534fa..2b43e20 100644
>> --- a/arch/arm64/include/asm/ftrace.h
>> +++ b/arch/arm64/include/asm/ftrace.h
>> @@ -13,8 +13,9 @@
>>
>>   #include <asm/insn.h>
>>
>> -#define MCOUNT_ADDR		((unsigned long)_mcount)
>> -#define MCOUNT_INSN_SIZE	AARCH64_INSN_SIZE
>> +#define MCOUNT_ADDR			((unsigned long)_mcount)
>> +#define MCOUNT_INSN_SIZE		AARCH64_INSN_SIZE
>> +#define FTRACE_STACK_FRAME_OFFSET	AARCH64_INSN_SIZE
>>
>>   #ifndef __ASSEMBLY__
>>   #include <linux/compat.h>
>> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
>> index 407991b..9ab67af 100644
>> --- a/arch/arm64/kernel/stacktrace.c
>> +++ b/arch/arm64/kernel/stacktrace.c
>> @@ -20,6 +20,7 @@
>>   #include <linux/sched.h>
>>   #include <linux/stacktrace.h>
>>
>> +#include <asm/insn.h>
>>   #include <asm/stacktrace.h>
>>
>>   /*
>> @@ -52,7 +53,7 @@ int notrace unwind_frame(struct stackframe *frame)
>>   	 * -4 here because we care about the PC at time of bl,
>>   	 * not where the return will go.
>>   	 */
>> -	frame->pc = *(unsigned long *)(fp + 8) - 4;
>> +	frame->pc = *(unsigned long *)(fp + 8) - AARCH64_INSN_SIZE;
>>
>>   	return 0;
>>   }
>
> The arm64 bits look fine to me:
>
>    Acked-by: Will Deacon <will.deacon@arm.com>
>
> Steve: feel free to take this along with the other ftrace changes. I don't
> anticipate any conflicts, but if anything crops up in -next we can sort
> it out then.
>
> Thanks!
>
> Will
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-20 23:53                                             ` AKASHI Takahiro
@ 2015-07-21 10:26                                               ` AKASHI Takahiro
  2015-07-21 14:34                                                 ` Jungseok Lee
  0 siblings, 1 reply; 53+ messages in thread
From: AKASHI Takahiro @ 2015-07-21 10:26 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/21/2015 08:53 AM, AKASHI Takahiro wrote:
> Hi
>
> So i don't have to repost my patch here. Please use the original
> commit log message[1/3] as is.
> But please keep in mind that there is still an issue that I mentioned
> in the cover letter; 'Size' field is inaccurate.
>   <reported size> = <its own dynamic local variables>
>                          + <child's local variables>
> and
>   <real size> = <reported size> + <its local variables>
>                                 - <child's local variables>
> where "dynamic" means, for example, a variable allocated like the below:
>    int foo(int num) {
>      int array[num];
>      ...
>    }
> (See more details in my ascii art.)
>
> Such usage is seldom seen in the kernel, and <reported size> is
> likely equal to <child's local variables>. In other words, we will
> see one-line *displacement* in most cases.

Well, I have a quick fix now, but it looks ugly.

In addition, I found another issue: with the function_graph tracer,
the output looks like this:
# cat /sys/kernel/tracing/stack_trace
         Depth    Size   Location    (78 entries)
         -----    ----   --------
   0)     6184      32   update_min_vruntime+0x14/0x74
   1)     6152      48   update_curr+0x6c/0x150
   2)     6104     128   enqueue_task_fair+0x2f4/0xb9c
   3)     5976      48   enqueue_task+0x48/0x90
   ...
  18)     5160     112   hrtimer_interrupt+0xa0/0x214
  19)     5048      32   arch_timer_handler_phys+0x38/0x48
  20)     5016       0   ftrace_graph_caller+0x2c/0x30
  21)     5016      64   ftrace_graph_caller+0x2c/0x30
  22)     4952      32   ftrace_graph_caller+0x2c/0x30
  23)     4920      64   ftrace_graph_caller+0x2c/0x30
   ...

Since, with the function_graph tracer, we modify the LR register in a
stack frame when we enter a function, we have to handle such special
cases in save_stack_trace() or check_stack(), as x86 does in
print_ftrace_graph_addr().
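
A rough, untested sketch of the kind of fix-up I have in mind is below.
The function name is made up and the ret_stack handling is simply modeled
on the x86 code, so treat it as a sketch, not a final patch:

   #ifdef CONFIG_FUNCTION_GRAPH_TRACER
   /*
    * Sketch only: if the unwound pc is the ftrace trampoline
    * (return_to_handler), look up the real return address in the task's
    * ftrace return stack.  "graph" counts how many trampoline entries
    * have already been resolved while walking this stack.
    */
   static unsigned long
   fixup_graph_addr(struct task_struct *tsk, unsigned long pc, int *graph)
   {
           int index;

           if (pc != (unsigned long)return_to_handler)
                   return pc;

           index = tsk->curr_ret_stack;
           if (!tsk->ret_stack || index < *graph)
                   return pc;

           pc = tsk->ret_stack[index - *graph].ret;
           (*graph)++;

           return pc;
   }
   #endif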

Thanks,
-Takahiro AKASHI


> (We'd better mention it explicitly in the commmit?)
>
> Thanks,
> -Takahiro AKASHI
>
>
> On 07/21/2015 01:20 AM, Will Deacon wrote:
>> Hi all,
>>
>> On Fri, Jul 17, 2015 at 04:34:21PM +0100, Jungseok Lee wrote:
>>> On Jul 17, 2015, at 11:59 PM, Jungseok Lee wrote:
>>>> On Jul 17, 2015, at 11:41 PM, Steven Rostedt wrote:
>>>>> Thanks! Can you repost patch 1 with the changes I recommended, so that
>>>>> I can get an Acked-by from the arm64 maintainers and pull all the
>>>>> changes in together. This is fine for a 4.3 release, right? That is, it
>>>>> doesn't need to go into 4.2-rcs.
>>>>>
>>>>
>>>> It's not hard to repost a patch, but I feel like we have to wait for Akashi's response.
>>>> Also, it might be needed to consider Mark's comment on arch part.
>>>>
>>>> If they are okay, I will proceed.
>>>
>>> The [RFC 1/3] patch used in my environment is shaped as follows.
>>> I leave the hunk for *only* clear synchronization. This is why I choose this format
>>> instead of reposting a patch. I hope it would help to track down this thread.
>>>
>>> I think this is my best at this point.
>>>
>>> ----8<----
>>> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
>>> index c5534fa..2b43e20 100644
>>> --- a/arch/arm64/include/asm/ftrace.h
>>> +++ b/arch/arm64/include/asm/ftrace.h
>>> @@ -13,8 +13,9 @@
>>>
>>>   #include <asm/insn.h>
>>>
>>> -#define MCOUNT_ADDR        ((unsigned long)_mcount)
>>> -#define MCOUNT_INSN_SIZE    AARCH64_INSN_SIZE
>>> +#define MCOUNT_ADDR            ((unsigned long)_mcount)
>>> +#define MCOUNT_INSN_SIZE        AARCH64_INSN_SIZE
>>> +#define FTRACE_STACK_FRAME_OFFSET    AARCH64_INSN_SIZE
>>>
>>>   #ifndef __ASSEMBLY__
>>>   #include <linux/compat.h>
>>> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
>>> index 407991b..9ab67af 100644
>>> --- a/arch/arm64/kernel/stacktrace.c
>>> +++ b/arch/arm64/kernel/stacktrace.c
>>> @@ -20,6 +20,7 @@
>>>   #include <linux/sched.h>
>>>   #include <linux/stacktrace.h>
>>>
>>> +#include <asm/insn.h>
>>>   #include <asm/stacktrace.h>
>>>
>>>   /*
>>> @@ -52,7 +53,7 @@ int notrace unwind_frame(struct stackframe *frame)
>>>        * -4 here because we care about the PC at time of bl,
>>>        * not where the return will go.
>>>        */
>>> -    frame->pc = *(unsigned long *)(fp + 8) - 4;
>>> +    frame->pc = *(unsigned long *)(fp + 8) - AARCH64_INSN_SIZE;
>>>
>>>       return 0;
>>>   }
>>
>> The arm64 bits look fine to me:
>>
>>    Acked-by: Will Deacon <will.deacon@arm.com>
>>
>> Steve: feel free to take this along with the other ftrace changes. I don't
>> anticipate any conflicts, but if anything crops up in -next we can sort
>> it out then.
>>
>> Thanks!
>>
>> Will
>>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-21 10:26                                               ` AKASHI Takahiro
@ 2015-07-21 14:34                                                 ` Jungseok Lee
  0 siblings, 0 replies; 53+ messages in thread
From: Jungseok Lee @ 2015-07-21 14:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Jul 21, 2015, at 7:26 PM, AKASHI Takahiro wrote:
> On 07/21/2015 08:53 AM, AKASHI Takahiro wrote:
>> Hi
>> 
>> So i don't have to repost my patch here. Please use the original
>> commit log message[1/3] as is.
>> But please keep in mind that there is still an issue that I mentioned
>> in the cover letter; 'Size' field is inaccurate.
>>  <reported size> = <its own dynamic local variables>
>>                         + <child's local variables>
>> and
>>  <real size> = <reported size> + <its local variables>
>>                                - <child's local variables>
>> where "dynamic" means, for example, a variable allocated like the below:
>>   int foo(int num) {
>>     int array[num];
>>     ...
>>   }
>> (See more details in my ascii art.)
>> 
>> Such usage is seldom seen in the kernel, and <reported size> is
>> likely equal to <child's local variables>. In other words, we will
>> see one-line *displacement* in most cases.
> 
> Well, I have a quick fix now, but it looks ugly.

AFAIU, stack_max_size would be more accurate if a separate stack were
introduced for interrupt context. However, it might be unnecessary at
this point due to the added complexity.

> In addition, I found another issue; With function_graph tracer,
> the output is like:
> # cat /sys/kernel/tracing/stack_trace
>        Depth    Size   Location    (78 entries)
>        -----    ----   --------
>  0)     6184      32   update_min_vruntime+0x14/0x74
>  1)     6152      48   update_curr+0x6c/0x150
>  2)     6104     128   enqueue_task_fair+0x2f4/0xb9c
>  3)     5976      48   enqueue_task+0x48/0x90
>  ...
> 18)     5160     112   hrtimer_interrupt+0xa0/0x214
> 19)     5048      32   arch_timer_handler_phys+0x38/0x48
> 20)     5016       0   ftrace_graph_caller+0x2c/0x30
> 21)     5016      64   ftrace_graph_caller+0x2c/0x30
> 22)     4952      32   ftrace_graph_caller+0x2c/0x30
> 23)     4920      64   ftrace_graph_caller+0x2c/0x30
>  ...
> 
> Since, with function_graph tracer, we modify LR register in a stack frame
> when we enter into a function, we have to manage such special cases
> in save_stack_trace() or check_stack() as x86 does in
> print_ftrace_graph_addr().

I should have run it with function_graph. The issue is easily reproduced
in my environment. I haven't seen other issues yet when enabling other tracers.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-07-20 16:20                                           ` Will Deacon
  2015-07-20 23:53                                             ` AKASHI Takahiro
@ 2015-08-03  9:09                                             ` Will Deacon
  2015-08-03 14:01                                               ` Steven Rostedt
  2015-08-03 16:30                                               ` Jungseok Lee
  1 sibling, 2 replies; 53+ messages in thread
From: Will Deacon @ 2015-08-03  9:09 UTC (permalink / raw)
  To: linux-arm-kernel

Hi guys,

On Mon, Jul 20, 2015 at 05:20:04PM +0100, Will Deacon wrote:
> On Fri, Jul 17, 2015 at 04:34:21PM +0100, Jungseok Lee wrote:
> > On Jul 17, 2015, at 11:59 PM, Jungseok Lee wrote:
> > > On Jul 17, 2015, at 11:41 PM, Steven Rostedt wrote:
> > >> Thanks! Can you repost patch 1 with the changes I recommended, so that
> > >> I can get an Acked-by from the arm64 maintainers and pull all the
> > >> changes in together. This is fine for a 4.3 release, right? That is, it
> > >> doesn't need to go into 4.2-rcs.
> > >> 
> > > 
> > > It's not hard to repost a patch, but I feel like we have to wait for Akashi's response.
> > > Also, it might be needed to consider Mark's comment on arch part.
> > > 
> > > If they are okay, I will proceed.
> > 
> > The [RFC 1/3] patch used in my environment is shaped as follows.
> > I leave the hunk for *only* clear synchronization. This is why I choose this format
> > instead of reposting a patch. I hope it would help to track down this thread.
> > 
> > I think this is my best at this point.

[...]

> The arm64 bits look fine to me:
> 
>   Acked-by: Will Deacon <will.deacon@arm.com>

What happened to this? Is it queued someplace, or are we waiting for a
new version?

Will

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-08-03  9:09                                             ` Will Deacon
@ 2015-08-03 14:01                                               ` Steven Rostedt
  2015-08-03 14:04                                                 ` Will Deacon
  2015-08-03 16:30                                               ` Jungseok Lee
  1 sibling, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-08-03 14:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 3 Aug 2015 10:09:51 +0100
Will Deacon <will.deacon@arm.com> wrote:

 
> > The arm64 bits look fine to me:
> > 
> >   Acked-by: Will Deacon <will.deacon@arm.com>
> 
> What happened to this? Is it queued someplace, or are we waiting for a
> new version?

I went on vacation :-)  I'm back and trying to catch up. I'll be
pulling in a bunch of things today. Hopefully I can get to this.

-- Steve

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-08-03 14:01                                               ` Steven Rostedt
@ 2015-08-03 14:04                                                 ` Will Deacon
  0 siblings, 0 replies; 53+ messages in thread
From: Will Deacon @ 2015-08-03 14:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 03, 2015 at 03:01:40PM +0100, Steven Rostedt wrote:
> On Mon, 3 Aug 2015 10:09:51 +0100
> Will Deacon <will.deacon@arm.com> wrote:
> 
>  
> > > The arm64 bits look fine to me:
> > > 
> > >   Acked-by: Will Deacon <will.deacon@arm.com>
> > 
> > What happened to this? Is it queued someplace, or are we waiting for a
> > new version?
> 
> I went on vacation :-)  I'm back and trying to catch up. I'll be
> pulling in a bunch of things today. Hopefully I can get to this.

Cheers, Steve. Just wanted to make sure it didn't get lost (I'd briefly
forgotten about it myself!).

I hope you had a good holiday.

Will

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-08-03  9:09                                             ` Will Deacon
  2015-08-03 14:01                                               ` Steven Rostedt
@ 2015-08-03 16:30                                               ` Jungseok Lee
  2015-08-03 16:57                                                 ` Steven Rostedt
  1 sibling, 1 reply; 53+ messages in thread
From: Jungseok Lee @ 2015-08-03 16:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Aug 3, 2015, at 6:09 PM, Will Deacon wrote:
> Hi guys,

Hi Will,

> On Mon, Jul 20, 2015 at 05:20:04PM +0100, Will Deacon wrote:
>> On Fri, Jul 17, 2015 at 04:34:21PM +0100, Jungseok Lee wrote:
>>> On Jul 17, 2015, at 11:59 PM, Jungseok Lee wrote:
>>>> On Jul 17, 2015, at 11:41 PM, Steven Rostedt wrote:
>>>>> Thanks! Can you repost patch 1 with the changes I recommended, so that
>>>>> I can get an Acked-by from the arm64 maintainers and pull all the
>>>>> changes in together. This is fine for a 4.3 release, right? That is, it
>>>>> doesn't need to go into 4.2-rcs.
>>>>> 
>>>> 
>>>> It's not hard to repost a patch, but I feel like we have to wait for Akashi's response.
>>>> Also, it might be needed to consider Mark's comment on arch part.
>>>> 
>>>> If they are okay, I will proceed.
>>> 
>>> The [RFC 1/3] patch used in my environment is shaped as follows.
>>> I leave the hunk for *only* clear synchronization. This is why I choose this format
>>> instead of reposting a patch. I hope it would help to track down this thread.
>>> 
>>> I think this is my best at this point.
> 
> [...]
> 
>> The arm64 bits look fine to me:
>> 
>>  Acked-by: Will Deacon <will.deacon@arm.com>
> 
> What happened to this? Is it queued someplace, or are we waiting for a
> new version?

There are two issues in the current version.
1) The change does not work correctly when the function_graph feature is enabled.
2) Akashi has raised an issue that the Size field of the stack tracer is inaccurate.

So, I think this patch set is not ready yet.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-08-03 16:30                                               ` Jungseok Lee
@ 2015-08-03 16:57                                                 ` Steven Rostedt
  2015-08-03 17:22                                                   ` Jungseok Lee
  0 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-08-03 16:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 4 Aug 2015 01:30:50 +0900
Jungseok Lee <jungseoklee85@gmail.com> wrote:


> There are two issues in the current version.
> 1) The change does not work correctly when function_graph feature is enabled.
> 2) Akashi have raised an issue that size field of stack tracer is inaccurate.
> 
> So, I think this patch set is not ready yet.

Do you still want me to add code that does:

 if (*p == (stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET)) {

?

If you expect to need that, I can get it into the next merge window and
you can base your code on that in the merge window after that.

-- Steve

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-08-03 16:57                                                 ` Steven Rostedt
@ 2015-08-03 17:22                                                   ` Jungseok Lee
  2015-08-03 17:32                                                     ` Steven Rostedt
  0 siblings, 1 reply; 53+ messages in thread
From: Jungseok Lee @ 2015-08-03 17:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Aug 4, 2015, at 1:57 AM, Steven Rostedt wrote:
> On Tue, 4 Aug 2015 01:30:50 +0900
> Jungseok Lee <jungseoklee85@gmail.com> wrote:
> 
> 
>> There are two issues in the current version.
>> 1) The change does not work correctly when function_graph feature is enabled.
>> 2) Akashi have raised an issue that size field of stack tracer is inaccurate.
>> 
>> So, I think this patch set is not ready yet.
> 
> Do you still want me to add code that does:
> 
> if (*p == (stack_dump_trace[i] + FTRACE_STACK_FRAME_OFFSET)) {
> 
> ?
> 
> If you expect to need that, I can get it into the next merge window and
> you can base the code of that in the merge window after that.

It would be better to add the snippet when a new version is ready.
That way, it would be easier to figure out why the macro is introduced
and how it can be used in architecture code.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-08-03 17:22                                                   ` Jungseok Lee
@ 2015-08-03 17:32                                                     ` Steven Rostedt
  2015-08-04  7:41                                                       ` AKASHI Takahiro
  0 siblings, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-08-03 17:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 4 Aug 2015 02:22:16 +0900
Jungseok Lee <jungseoklee85@gmail.com> wrote:


> It would be better to add the snippet when a new version is ready.
> That way might help to figure out easily why the macro is introduced and how
> it can be used in architecture code.


OK, whichever.

-- Steve

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 2/3] arm64: refactor save_stack_trace()
  2015-08-03 17:32                                                     ` Steven Rostedt
@ 2015-08-04  7:41                                                       ` AKASHI Takahiro
  0 siblings, 0 replies; 53+ messages in thread
From: AKASHI Takahiro @ 2015-08-04  7:41 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/04/2015 02:32 AM, Steven Rostedt wrote:
> On Tue, 4 Aug 2015 02:22:16 +0900
> Jungseok Lee <jungseoklee85@gmail.com> wrote:
>
>
>> It would be better to add the snippet when a new version is ready.
>> That way might help to figure out easily why the macro is introduced and how
>> it can be used in architecture code.
>
>
> OK, which ever.

I'm making my best effort to fix the problem, but it seems to me that the
current check_stack() doesn't fit arm64 due to the differences in stack
frame usage across architectures.
Although my current patch doesn't solve all the issues I've noticed so far,
I'd like to get comments about whether I'm doing it the right way.
Please take a look at my new version (v2) of the RFC.

Thanks,
-Takahiro AKASHI


> -- Steve
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2015-08-04  7:41 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-13  5:29 [RFC 0/3] arm64: ftrace: fix incorrect output from stack tracer AKASHI Takahiro
2015-07-13  5:29 ` [RFC 1/3] ftrace: adjust a function's pc to search for in check_stack() for arm64 AKASHI Takahiro
2015-07-13 15:24   ` Steven Rostedt
2015-07-15  0:22     ` AKASHI Takahiro
2015-07-13  5:29 ` [RFC 2/3] arm64: refactor save_stack_trace() AKASHI Takahiro
2015-07-14 12:47   ` Jungseok Lee
2015-07-14 13:31     ` Steven Rostedt
2015-07-15  0:20       ` AKASHI Takahiro
2015-07-15  2:51         ` Steven Rostedt
2015-07-15 11:41           ` AKASHI Takahiro
2015-07-15 14:55             ` Steven Rostedt
2015-07-15 16:13               ` Steven Rostedt
2015-07-16  0:27                 ` AKASHI Takahiro
2015-07-16  1:08                   ` AKASHI Takahiro
2015-07-16  1:38                     ` Steven Rostedt
2015-07-17 10:46                       ` Will Deacon
2015-07-16 13:29                     ` Jungseok Lee
2015-07-16 13:54                       ` Jungseok Lee
2015-07-16 14:24                       ` Steven Rostedt
2015-07-16 15:01                         ` Jungseok Lee
2015-07-16 15:31                           ` Steven Rostedt
2015-07-16 15:52                             ` Jungseok Lee
2015-07-16 20:22                               ` Steven Rostedt
2015-07-17  2:49                                 ` AKASHI Takahiro
2015-07-17  3:21                                   ` Steven Rostedt
2015-07-16 16:16                             ` Steven Rostedt
2015-07-17 12:40                               ` Mark Rutland
2015-07-17 12:51                                 ` Steven Rostedt
2015-07-17 13:00                                 ` Steven Rostedt
2015-07-17 14:28                                   ` Jungseok Lee
2015-07-17 14:41                                     ` Steven Rostedt
2015-07-17 14:59                                       ` Jungseok Lee
2015-07-17 15:34                                         ` Jungseok Lee
2015-07-17 16:01                                           ` Steven Rostedt
2015-07-20 16:20                                           ` Will Deacon
2015-07-20 23:53                                             ` AKASHI Takahiro
2015-07-21 10:26                                               ` AKASHI Takahiro
2015-07-21 14:34                                                 ` Jungseok Lee
2015-08-03  9:09                                             ` Will Deacon
2015-08-03 14:01                                               ` Steven Rostedt
2015-08-03 14:04                                                 ` Will Deacon
2015-08-03 16:30                                               ` Jungseok Lee
2015-08-03 16:57                                                 ` Steven Rostedt
2015-08-03 17:22                                                   ` Jungseok Lee
2015-08-03 17:32                                                     ` Steven Rostedt
2015-08-04  7:41                                                       ` AKASHI Takahiro
2015-07-17  2:04                       ` AKASHI Takahiro
2015-07-17 14:38                         ` Jungseok Lee
2015-07-16 14:28                     ` Mark Rutland
2015-07-16 14:34                       ` Steven Rostedt
2015-07-17  2:09                         ` AKASHI Takahiro
2015-07-13  5:29 ` [RFC 3/3] arm64: ftrace: mcount() should not create a stack frame AKASHI Takahiro
2015-07-13 15:01 ` [RFC 0/3] arm64: ftrace: fix incorrect output from stack tracer Jungseok Lee
