All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
@ 2021-10-27 13:25 Mark Rutland
  2021-10-28  8:52 ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Mark Rutland @ 2021-10-27 13:25 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: broonie, catalin.marinas, madvenka, mark.rutland, will

When CONFIG_FUNCTION_GRAPH_TRACER is selected, and the function graph:
tracer is in use, unwind_frame() may erroneously asscociate a traced
function with an incorrect return address. This can happen when starting
an unwind from a pt_regs, or when unwinding across an exception
boundary.

The underlying problem is that ftrace_graph_get_ret_stack() takes an
index offset from the most recent entry added to the fgraph return
stack. We start an unwind at offset 0, and increment the offset each
time we encounter a rewritten return address (i.e. when we see
`return_to_handler`). This is broken in two cases:

1) Between creating a pt_regs and starting the unwind, function calls
   may place entries on the stack, leaving an abitrary offset which we
   can only determine by performing a full unwind from the caller of the
   unwind code (and relying on none of the unwind code being
   instrumented). While this initial unwind is open-coded in
   dump_backtrace(), this is not performed for other unwinders such as
   perf_callchain_kernel().

2) When unwinding across an exception boundary (whether continuing an
   unwind or starting a new unwind from regs), we currently always skip
   the LR of the interrupted context. Where this was live and contained
   a rewritten address, we won't consume the corresponding fgraph ret
   stack entry, leaving subsequent entries off-by-one.

To fix this, we need to be able to uniquely identify each rewritten
return address such that we can map this back to the original reeturn
address. We can do so using HAVE_FUNCTION_GRAPH_RET_ADDR_PTR, and
associating each rewritten return address with a unique location on the
stack. As the return address is passed in the LR (and so is not
guaranteed a unique location in memory), we use the FP upon entry to the
function (i.e. the address of the caller's frame record) as the return
address pointer. Any nested call will have a different FP value as the
caller must create its own frame record and update FP to point to this.

This doesn't fix the issue of skipping a live LR at an exception
boundary, which is a more general problem and requires more substantial
rework. Note that were we to consume the LR in all cases this would in
warnings where the interrupted context's LR contains
`return_to_handler`, but the FP has been altered, e.g.

| func:
|	<--- ftrace entry ---> 	// logs FP & LR, rewrites FP
| 	STP	FP, LR, [SP, #-16]!
| 	MOV	FP, SP
| 	<--- INTERRUPT --->

... as ftrace_graph_get_ret_stack() fill not find a matching entry,
triggering the WARN_ON_ONCE() in unwind_frame().

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/ftrace.h     | 11 +++++++++++
 arch/arm64/include/asm/stacktrace.h |  6 ------
 arch/arm64/kernel/ftrace.c          |  6 +++---
 arch/arm64/kernel/stacktrace.c      | 12 +++++-------
 4 files changed, 19 insertions(+), 16 deletions(-)

I spotted this latent issue when reviewing Madhavan's stack tracing rework:

  https://lore.kernel.org/r/20211015025847.17694-1-madvenka@linux.microsoft.com

This supercedes the initial version I posted as an inline reply in:

  https://lore.kernel.org/r/20211025164925.GB2001@C02TD0UTHF1T.local

I believe we will need this as a prerequisite for other cleanups in that
series. As it fixes an existing bug I'm posting it on its own.

Mark.

diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 91fa4baa1a93..c96d47cb8f46 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -12,6 +12,17 @@
 
 #define HAVE_FUNCTION_GRAPH_FP_TEST
 
+/*
+ * HAVE_FUNCTION_GRAPH_RET_ADDR_PTR means that the architecture can provide a
+ * "return address pointer" which can be used to uniquely identify a return
+ * address which has been overwritten.
+ *
+ * On arm64 we use the address of the caller's frame record, which remains the
+ * same for the lifetime of the instrumented function, unlike the return
+ * address in the LR.
+ */
+#define HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
+
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
 #define ARCH_SUPPORTS_FTRACE_OPS 1
 #else
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 8aebc00c1718..9a319eca5cab 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -46,9 +46,6 @@ struct stack_info {
  * @prev_type:   The type of stack this frame record was on, or a synthetic
  *               value of STACK_TYPE_UNKNOWN. This is used to detect a
  *               transition from one stack to another.
- *
- * @graph:       When FUNCTION_GRAPH_TRACER is selected, holds the index of a
- *               replacement lr value in the ftrace graph stack.
  */
 struct stackframe {
 	unsigned long fp;
@@ -56,9 +53,6 @@ struct stackframe {
 	DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
 	unsigned long prev_fp;
 	enum stack_type prev_type;
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	int graph;
-#endif
 };
 
 extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 7f467bd9db7a..3e5d0ed63eb7 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -249,8 +249,6 @@ int __init ftrace_dyn_arch_init(void)
  * on the way back to parent. For this purpose, this function is called
  * in _mcount() or ftrace_caller() to replace return address (*parent) on
  * the call stack to return_to_handler.
- *
- * Note that @frame_pointer is used only for sanity check later.
  */
 void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
 			   unsigned long frame_pointer)
@@ -268,8 +266,10 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
 	 */
 	old = *parent;
 
-	if (!function_graph_enter(old, self_addr, frame_pointer, NULL))
+	if (!function_graph_enter(old, self_addr, frame_pointer,
+	    (void *)frame_pointer)) {
 		*parent = return_hooker;
+	}
 }
 
 #ifdef CONFIG_DYNAMIC_FTRACE
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 8982a2b78acf..749b680b4999 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -38,9 +38,6 @@ void start_backtrace(struct stackframe *frame, unsigned long fp,
 {
 	frame->fp = fp;
 	frame->pc = pc;
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	frame->graph = 0;
-#endif
 
 	/*
 	 * Prime the first unwind.
@@ -116,17 +113,18 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	if (tsk->ret_stack &&
 		(ptrauth_strip_insn_pac(frame->pc) == (unsigned long)return_to_handler)) {
-		struct ftrace_ret_stack *ret_stack;
+		unsigned long orig_pc;
 		/*
 		 * This is a case where function graph tracer has
 		 * modified a return address (LR) in a stack frame
 		 * to hook a function return.
 		 * So replace it to an original value.
 		 */
-		ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph++);
-		if (WARN_ON_ONCE(!ret_stack))
+		orig_pc = ftrace_graph_ret_addr(tsk, NULL, frame->pc,
+						(void *)frame->fp);
+		if (WARN_ON_ONCE(frame->pc == orig_pc))
 			return -EINVAL;
-		frame->pc = ret_stack->ret;
+		frame->pc = orig_pc;
 	}
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
 
-- 
2.11.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
  2021-10-27 13:25 [PATCH] arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR Mark Rutland
@ 2021-10-28  8:52 ` Will Deacon
  2021-10-28 13:45   ` Mark Rutland
  0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2021-10-28  8:52 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, broonie, catalin.marinas, madvenka, rostedt

On Wed, Oct 27, 2021 at 02:25:28PM +0100, Mark Rutland wrote:
> When CONFIG_FUNCTION_GRAPH_TRACER is selected, and the function graph:

Weird ':' on the end of "graph"

> tracer is in use, unwind_frame() may erroneously asscociate a traced

associate

> function with an incorrect return address. This can happen when starting
> an unwind from a pt_regs, or when unwinding across an exception
> boundary.
> 
> The underlying problem is that ftrace_graph_get_ret_stack() takes an
> index offset from the most recent entry added to the fgraph return
> stack. We start an unwind at offset 0, and increment the offset each
> time we encounter a rewritten return address (i.e. when we see
> `return_to_handler`). This is broken in two cases:
> 
> 1) Between creating a pt_regs and starting the unwind, function calls
>    may place entries on the stack, leaving an abitrary offset which we
>    can only determine by performing a full unwind from the caller of the
>    unwind code (and relying on none of the unwind code being
>    instrumented). While this initial unwind is open-coded in
>    dump_backtrace(), this is not performed for other unwinders such as
>    perf_callchain_kernel().

What is the user-visible impact of this?

> 2) When unwinding across an exception boundary (whether continuing an
>    unwind or starting a new unwind from regs), we currently always skip
>    the LR of the interrupted context. Where this was live and contained
>    a rewritten address, we won't consume the corresponding fgraph ret
>    stack entry, leaving subsequent entries off-by-one.
> 
> To fix this, we need to be able to uniquely identify each rewritten
> return address such that we can map this back to the original reeturn

return

> address. We can do so using HAVE_FUNCTION_GRAPH_RET_ADDR_PTR, and
> associating each rewritten return address with a unique location on the
> stack. As the return address is passed in the LR (and so is not
> guaranteed a unique location in memory), we use the FP upon entry to the
> function (i.e. the address of the caller's frame record) as the return
> address pointer. Any nested call will have a different FP value as the
> caller must create its own frame record and update FP to point to this.
> 
> This doesn't fix the issue of skipping a live LR at an exception
> boundary, which is a more general problem and requires more substantial
> rework. Note that were we to consume the LR in all cases this would in

would in?

> warnings where the interrupted context's LR contains
> `return_to_handler`, but the FP has been altered, e.g.
> 
> | func:
> |	<--- ftrace entry ---> 	// logs FP & LR, rewrites FP
> | 	STP	FP, LR, [SP, #-16]!
> | 	MOV	FP, SP
> | 	<--- INTERRUPT --->
> 
> ... as ftrace_graph_get_ret_stack() fill not find a matching entry,
> triggering the WARN_ON_ONCE() in unwind_frame().

Did you run the ftrace selftests with this? If not, please can you do it?

> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Will Deacon <will@kernel.org>

(Adding rostedt FYI)

> ---
>  arch/arm64/include/asm/ftrace.h     | 11 +++++++++++
>  arch/arm64/include/asm/stacktrace.h |  6 ------
>  arch/arm64/kernel/ftrace.c          |  6 +++---
>  arch/arm64/kernel/stacktrace.c      | 12 +++++-------
>  4 files changed, 19 insertions(+), 16 deletions(-)
> 
> I spotted this latent issue when reviewing Madhavan's stack tracing rework:
> 
>   https://lore.kernel.org/r/20211015025847.17694-1-madvenka@linux.microsoft.com
> 
> This supercedes the initial version I posted as an inline reply in:
> 
>   https://lore.kernel.org/r/20211025164925.GB2001@C02TD0UTHF1T.local

Please add a Link: tag referring to this.

> I believe we will need this as a prerequisite for other cleanups in that
> series. As it fixes an existing bug I'm posting it on its own.
> 
> Mark.
> 
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index 91fa4baa1a93..c96d47cb8f46 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -12,6 +12,17 @@
>  
>  #define HAVE_FUNCTION_GRAPH_FP_TEST
>  
> +/*
> + * HAVE_FUNCTION_GRAPH_RET_ADDR_PTR means that the architecture can provide a
> + * "return address pointer" which can be used to uniquely identify a return
> + * address which has been overwritten.
> + *
> + * On arm64 we use the address of the caller's frame record, which remains the
> + * same for the lifetime of the instrumented function, unlike the return
> + * address in the LR.
> + */
> +#define HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
> +
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
>  #define ARCH_SUPPORTS_FTRACE_OPS 1
>  #else
> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> index 8aebc00c1718..9a319eca5cab 100644
> --- a/arch/arm64/include/asm/stacktrace.h
> +++ b/arch/arm64/include/asm/stacktrace.h
> @@ -46,9 +46,6 @@ struct stack_info {
>   * @prev_type:   The type of stack this frame record was on, or a synthetic
>   *               value of STACK_TYPE_UNKNOWN. This is used to detect a
>   *               transition from one stack to another.
> - *
> - * @graph:       When FUNCTION_GRAPH_TRACER is selected, holds the index of a
> - *               replacement lr value in the ftrace graph stack.
>   */
>  struct stackframe {
>  	unsigned long fp;
> @@ -56,9 +53,6 @@ struct stackframe {
>  	DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
>  	unsigned long prev_fp;
>  	enum stack_type prev_type;
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -	int graph;
> -#endif
>  };
>  
>  extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> index 7f467bd9db7a..3e5d0ed63eb7 100644
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -249,8 +249,6 @@ int __init ftrace_dyn_arch_init(void)
>   * on the way back to parent. For this purpose, this function is called
>   * in _mcount() or ftrace_caller() to replace return address (*parent) on
>   * the call stack to return_to_handler.
> - *
> - * Note that @frame_pointer is used only for sanity check later.
>   */
>  void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
>  			   unsigned long frame_pointer)
> @@ -268,8 +266,10 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
>  	 */
>  	old = *parent;
>  
> -	if (!function_graph_enter(old, self_addr, frame_pointer, NULL))
> +	if (!function_graph_enter(old, self_addr, frame_pointer,
> +	    (void *)frame_pointer)) {
>  		*parent = return_hooker;
> +	}
>  }
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index 8982a2b78acf..749b680b4999 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -38,9 +38,6 @@ void start_backtrace(struct stackframe *frame, unsigned long fp,
>  {
>  	frame->fp = fp;
>  	frame->pc = pc;
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -	frame->graph = 0;
> -#endif
>  
>  	/*
>  	 * Prime the first unwind.
> @@ -116,17 +113,18 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  	if (tsk->ret_stack &&
>  		(ptrauth_strip_insn_pac(frame->pc) == (unsigned long)return_to_handler)) {
> -		struct ftrace_ret_stack *ret_stack;
> +		unsigned long orig_pc;
>  		/*
>  		 * This is a case where function graph tracer has
>  		 * modified a return address (LR) in a stack frame
>  		 * to hook a function return.
>  		 * So replace it to an original value.
>  		 */
> -		ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph++);
> -		if (WARN_ON_ONCE(!ret_stack))
> +		orig_pc = ftrace_graph_ret_addr(tsk, NULL, frame->pc,
> +						(void *)frame->fp);

Should we be using ptrauth_strip_insn_pac(frame->pc) again here?
ftrace_graph_ret_addr() appears to repeat the comparison against
return_to_handler.

> +		if (WARN_ON_ONCE(frame->pc == orig_pc))
>  			return -EINVAL;
> -		frame->pc = ret_stack->ret;
> +		frame->pc = orig_pc;
>  	}
>  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */

This will conflict (trivially) with the kretprobe stuff in -next but might
be worth routing via the tracing tree anyway.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
  2021-10-28  8:52 ` Will Deacon
@ 2021-10-28 13:45   ` Mark Rutland
  2021-10-28 14:04     ` Mark Rutland
                       ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Mark Rutland @ 2021-10-28 13:45 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-arm-kernel, broonie, catalin.marinas, madvenka, rostedt

Hi Will,

Looks like there are a bunch of things that warrant a v2, so I'll go
prepare that. It also seems there are some latent issue hit by the
ftrace selftests, which I'm taking a look at now.

On Thu, Oct 28, 2021 at 09:52:25AM +0100, Will Deacon wrote:
> On Wed, Oct 27, 2021 at 02:25:28PM +0100, Mark Rutland wrote:
> > When CONFIG_FUNCTION_GRAPH_TRACER is selected, and the function graph:
> 
> Weird ':' on the end of "graph"

Whoops; I'll go fix that with the other typos.

[...]

> > The underlying problem is that ftrace_graph_get_ret_stack() takes an
> > index offset from the most recent entry added to the fgraph return
> > stack. We start an unwind at offset 0, and increment the offset each
> > time we encounter a rewritten return address (i.e. when we see
> > `return_to_handler`). This is broken in two cases:
> > 
> > 1) Between creating a pt_regs and starting the unwind, function calls
> >    may place entries on the stack, leaving an abitrary offset which we
> >    can only determine by performing a full unwind from the caller of the
> >    unwind code (and relying on none of the unwind code being
> >    instrumented). While this initial unwind is open-coded in
> >    dump_backtrace(), this is not performed for other unwinders such as
> >    perf_callchain_kernel().
> 
> What is the user-visible impact of this?

Kernel callchains reported by perf may contain erroneous entries.

Case 2 means that dump_backtrace() may report entries in the wrong location in
some cases.

I'll see if I can get a clear example to put in the commit message, and I'll
explicitly mention those cases.

[...]

> > This doesn't fix the issue of skipping a live LR at an exception
> > boundary, which is a more general problem and requires more substantial
> > rework. Note that were we to consume the LR in all cases this would in
> 
> would in?
> 
> > warnings where the interrupted context's LR contains
> > `return_to_handler`, but the FP has been altered, e.g.
> > 
> > | func:
> > |	<--- ftrace entry ---> 	// logs FP & LR, rewrites FP
> > | 	STP	FP, LR, [SP, #-16]!
> > | 	MOV	FP, SP
> > | 	<--- INTERRUPT --->
> > 
> > ... as ftrace_graph_get_ret_stack() fill not find a matching entry,
> > triggering the WARN_ON_ONCE() in unwind_frame().
> 
> Did you run the ftrace selftests with this? If not, please can you do it?

I had not; I'd done a bunch of ad-hoc testing to deliberately trigger the
failing cases with a magic-sysrq backtrace (with that unwinder patched to start
from the regs).

I had a go with the selftests and there are some latent issues which show up on
a pristine v5.15-rc4 (or v5.15-rc7), using defconfig + the ftrace selftest
config fragment, including an OOM that could be a memory leak.

I see the following warning:

| [15] Generic dynamic event - check if duplicate events are caught
| ------------[ cut here ]------------
| WARNING: CPU: 24 PID: 912 at kernel/trace/trace_eprobe.c:853 __trace_eprobe_create+0x3ec/0x71c
| Modules linked in:
| CPU: 24 PID: 912 Comm: ftracetest Not tainted 5.15.0-rc4 #1
| Hardware name: linux,dummy-virt (DT)
| pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : __trace_eprobe_create+0x3ec/0x71c
| lr : __trace_eprobe_create+0x3ec/0x71c
| sp : ffff80001370bb60
| x29: ffff80001370bb60 x28: ffff5ee1944cc299 x27: ffff80001370bc28
| x26: 00000000000001c8 x25: ffffa8db50eece78 x24: 0000000000000010
| x23: ffffa8db50eece48 x22: 00000000ffffffed x21: 0000000000000003
| x20: ffff5ee1944cc008 x19: ffffa8db4fd2e398 x18: 0000000000000000
| x17: 7473753a29656d61 x16: 6e656c6966242830 x15: 2b3d656c69662074
| x14: 616e65706f5f7265 x13: 676e69727473753a x12: 29656d616e656c69
| x11: 0000000000000003 x10: 0101010101010101 x9 : ffffa8db4edda690
| x8 : 7f7f7f7f7f7f7f7f x7 : 726b6b6062727872 x6 : 1a171a17020d001f
| x5 : 1f000d02171a171a x4 : ffff5ee18d2b3900 x3 : ffffa8db50eece48
| x2 : 0000000000000000 x1 : ffff5ee18d2b3900 x0 : ffff5ee18d2b3900
| Call trace:
|  __trace_eprobe_create+0x3ec/0x71c
|  trace_probe_create+0x90/0xb0
|  eprobe_dyn_event_create+0x1c/0x30
|  create_dyn_event+0x74/0x114
|  trace_parse_run_command+0x104/0x7d0
|  dyn_event_write+0x1c/0x30
|  vfs_write+0xfc/0x290
|  ksys_write+0x74/0x100
|  __arm64_sys_write+0x28/0x3c
|  invoke_syscall+0x50/0x120
|  el0_svc_common.constprop.0+0x104/0x124
|  do_el0_svc+0x34/0x9c
|  el0_svc+0x2c/0x90
|  el0t_64_sync_handler+0xa8/0x12c
|  el0t_64_sync+0x1a0/0x1a4
| ---[ end trace 24f9a9e45b173e0b ]---
|         [FAIL]

That seems to be the following in __trace_eprobe_create():

| mutex_lock(&event_mutex);
| event_call = find_and_get_event(sys_name, sys_event);
| ep = alloc_event_probe(group, event, event_call, argc - 2);
| mutex_unlock(&event_mutex);
| 
| if (IS_ERR(ep)) {
| 	ret = PTR_ERR(ep);
| 	/* This must return -ENOMEM, else there is a bug */
| 	WARN_ON_ONCE(ret != -ENOMEM);
| 	ep = NULL;
| 	goto error;
| }

... but I haven't yet dug into exactly why that's returning an
unexpected error code.
 
Later the tests hang in:

| [19] event tracing - enable/disable with subsystem level files

... and from trying:

| # ./ftracetest -vvv test.d/event/subsystem-enable.tc 

... it seems to hang after the usual reset, on the first part of the
test, with the last output being:

| + . /root/ftrace/test.d/event/subsystem-enable.tc
| ++ echo 'sched:*'
| ++ yield
| ++ ping 127.0.0.1 -c 1
| PING 127.0.0.1 (127.0.0.1): 56 data bytes
| 64 bytes from 127.0.0.1: seq=0 ttl=64 time=2.442 ms
| 
| --- 127.0.0.1 ping statistics ---
| 1 packets transmitted, 1 packets received, 0% packet loss
| round-trip min/avg/max = 2.442/2.442/2.442 ms
| +++ cat trace
| +++ grep -v '^#'
| +++ awk '{ print $5 }'
| +++ sort -u
| +++ wc -l            

I'll see if I can figure out what's actually blocking, and see if the
remaining tests work.

> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> > Cc: Mark Brown <broonie@kernel.org>
> > Cc: Will Deacon <will@kernel.org>
> 
> (Adding rostedt FYI)

I'll add Steve to the Cc list for v2.

> > ---
> >  arch/arm64/include/asm/ftrace.h     | 11 +++++++++++
> >  arch/arm64/include/asm/stacktrace.h |  6 ------
> >  arch/arm64/kernel/ftrace.c          |  6 +++---
> >  arch/arm64/kernel/stacktrace.c      | 12 +++++-------
> >  4 files changed, 19 insertions(+), 16 deletions(-)
> > 
> > I spotted this latent issue when reviewing Madhavan's stack tracing rework:
> > 
> >   https://lore.kernel.org/r/20211015025847.17694-1-madvenka@linux.microsoft.com
> > 
> > This supercedes the initial version I posted as an inline reply in:
> > 
> >   https://lore.kernel.org/r/20211025164925.GB2001@C02TD0UTHF1T.local
> 
> Please add a Link: tag referring to this.


Just to check, does that need to be a Link tag, or is it fine to do that
inline as:

| This supersedes an initial version posted as an inline reply in:
|
|   https://lore.kernel.org/r/20211025164925.GB2001@C02TD0UTHF1T.local  

... which is a bit clearer than saying which link to look at when there
are multiple.

... and should I drop the v1 link in there too?

> 
> > I believe we will need this as a prerequisite for other cleanups in that
> > series. As it fixes an existing bug I'm posting it on its own.
> > 
> > Mark.
> > 
> > diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> > index 91fa4baa1a93..c96d47cb8f46 100644
> > --- a/arch/arm64/include/asm/ftrace.h
> > +++ b/arch/arm64/include/asm/ftrace.h
> > @@ -12,6 +12,17 @@
> >  
> >  #define HAVE_FUNCTION_GRAPH_FP_TEST
> >  
> > +/*
> > + * HAVE_FUNCTION_GRAPH_RET_ADDR_PTR means that the architecture can provide a
> > + * "return address pointer" which can be used to uniquely identify a return
> > + * address which has been overwritten.
> > + *
> > + * On arm64 we use the address of the caller's frame record, which remains the
> > + * same for the lifetime of the instrumented function, unlike the return
> > + * address in the LR.
> > + */
> > +#define HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
> > +
> >  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> >  #define ARCH_SUPPORTS_FTRACE_OPS 1
> >  #else
> > diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> > index 8aebc00c1718..9a319eca5cab 100644
> > --- a/arch/arm64/include/asm/stacktrace.h
> > +++ b/arch/arm64/include/asm/stacktrace.h
> > @@ -46,9 +46,6 @@ struct stack_info {
> >   * @prev_type:   The type of stack this frame record was on, or a synthetic
> >   *               value of STACK_TYPE_UNKNOWN. This is used to detect a
> >   *               transition from one stack to another.
> > - *
> > - * @graph:       When FUNCTION_GRAPH_TRACER is selected, holds the index of a
> > - *               replacement lr value in the ftrace graph stack.
> >   */
> >  struct stackframe {
> >  	unsigned long fp;
> > @@ -56,9 +53,6 @@ struct stackframe {
> >  	DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
> >  	unsigned long prev_fp;
> >  	enum stack_type prev_type;
> > -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > -	int graph;
> > -#endif
> >  };
> >  
> >  extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
> > diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> > index 7f467bd9db7a..3e5d0ed63eb7 100644
> > --- a/arch/arm64/kernel/ftrace.c
> > +++ b/arch/arm64/kernel/ftrace.c
> > @@ -249,8 +249,6 @@ int __init ftrace_dyn_arch_init(void)
> >   * on the way back to parent. For this purpose, this function is called
> >   * in _mcount() or ftrace_caller() to replace return address (*parent) on
> >   * the call stack to return_to_handler.
> > - *
> > - * Note that @frame_pointer is used only for sanity check later.
> >   */
> >  void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
> >  			   unsigned long frame_pointer)
> > @@ -268,8 +266,10 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
> >  	 */
> >  	old = *parent;
> >  
> > -	if (!function_graph_enter(old, self_addr, frame_pointer, NULL))
> > +	if (!function_graph_enter(old, self_addr, frame_pointer,
> > +	    (void *)frame_pointer)) {
> >  		*parent = return_hooker;
> > +	}
> >  }
> >  
> >  #ifdef CONFIG_DYNAMIC_FTRACE
> > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> > index 8982a2b78acf..749b680b4999 100644
> > --- a/arch/arm64/kernel/stacktrace.c
> > +++ b/arch/arm64/kernel/stacktrace.c
> > @@ -38,9 +38,6 @@ void start_backtrace(struct stackframe *frame, unsigned long fp,
> >  {
> >  	frame->fp = fp;
> >  	frame->pc = pc;
> > -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > -	frame->graph = 0;
> > -#endif
> >  
> >  	/*
> >  	 * Prime the first unwind.
> > @@ -116,17 +113,18 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> >  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> >  	if (tsk->ret_stack &&
> >  		(ptrauth_strip_insn_pac(frame->pc) == (unsigned long)return_to_handler)) {
> > -		struct ftrace_ret_stack *ret_stack;
> > +		unsigned long orig_pc;
> >  		/*
> >  		 * This is a case where function graph tracer has
> >  		 * modified a return address (LR) in a stack frame
> >  		 * to hook a function return.
> >  		 * So replace it to an original value.
> >  		 */
> > -		ret_stack = ftrace_graph_get_ret_stack(tsk, frame->graph++);
> > -		if (WARN_ON_ONCE(!ret_stack))
> > +		orig_pc = ftrace_graph_ret_addr(tsk, NULL, frame->pc,
> > +						(void *)frame->fp);
> 
> Should we be using ptrauth_strip_insn_pac(frame->pc) again here?
> ftrace_graph_ret_addr() appears to repeat the comparison against
> return_to_handler.

Yes, that'll need the stripped pointer, and I had not tested with PAC.

Given we unconditionally strip the PAC immediately after the
CONFIG_FUNCTION_GRAPH_TRACER, it'd be simpler to pull that up, i.e. as a fixup:

| diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
| index 749b680b4999..13ea4e4a4d27 100644
| --- a/arch/arm64/kernel/stacktrace.c
| +++ b/arch/arm64/kernel/stacktrace.c
| @@ -110,9 +110,11 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
|         frame->prev_fp = fp;
|         frame->prev_type = info.type;
|  
| +       frame->pc = ptrauth_strip_insn_pac(frame->pc);
| +
|  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
|         if (tsk->ret_stack &&
| -               (ptrauth_strip_insn_pac(frame->pc) == (unsigned long)return_to_handler)) {
| +               (frame->pc == (unsigned long)return_to_handler)) {
|                 unsigned long orig_pc;
|                 /*
|                  * This is a case where function graph tracer has
| @@ -128,8 +130,6 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
|         }
|  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
|  
| -       frame->pc = ptrauth_strip_insn_pac(frame->pc);
| -
|         return 0;
|  }
|  NOKPROBE_SYMBOL(unwind_frame);

I'll spin a v2 with that and the commit message cleanups.

> > +		if (WARN_ON_ONCE(frame->pc == orig_pc))
> >  			return -EINVAL;
> > -		frame->pc = ret_stack->ret;
> > +		frame->pc = orig_pc;
> >  	}
> >  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
> 
> This will conflict (trivially) with the kretprobe stuff in -next but might
> be worth routing via the tracing tree anyway.
> 
> Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
  2021-10-28 13:45   ` Mark Rutland
@ 2021-10-28 14:04     ` Mark Rutland
  2021-10-29 11:20     ` Mark Rutland
  2021-10-29 15:06     ` Mark Rutland
  2 siblings, 0 replies; 8+ messages in thread
From: Mark Rutland @ 2021-10-28 14:04 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-arm-kernel, broonie, catalin.marinas, madvenka, rostedt

On Thu, Oct 28, 2021 at 02:45:42PM +0100, Mark Rutland wrote:
> Hi Will,
> 
> Looks like there are a bunch of things that warrant a v2, so I'll go
> prepare that. It also seems there are some latent issue hit by the
> ftrace selftests, which I'm taking a look at now.

> I had a go with the selftests and there are some latent issues which show up on
> a pristine v5.15-rc4 (or v5.15-rc7), using defconfig + the ftrace selftest
> config fragment, including an OOM that could be a memory leak.

> Later the tests hang in:
> 
> | [19] event tracing - enable/disable with subsystem level files
> 
> ... and from trying:
> 
> | # ./ftracetest -vvv test.d/event/subsystem-enable.tc 
> 
> ... it seems to hang after the usual reset, on the first part of the
> test, with the last output being:
> 
> | + . /root/ftrace/test.d/event/subsystem-enable.tc
> | ++ echo 'sched:*'
> | ++ yield
> | ++ ping 127.0.0.1 -c 1
> | PING 127.0.0.1 (127.0.0.1): 56 data bytes
> | 64 bytes from 127.0.0.1: seq=0 ttl=64 time=2.442 ms
> | 
> | --- 127.0.0.1 ping statistics ---
> | 1 packets transmitted, 1 packets received, 0% packet loss
> | round-trip min/avg/max = 2.442/2.442/2.442 ms
> | +++ cat trace
> | +++ grep -v '^#'
> | +++ awk '{ print $5 }'
> | +++ sort -u
> | +++ wc -l            
> 
> I'll see if I can figure out what's actually blocking, and see if the
> remaining tests work.

Whoops; I forgot the OOM bits I mentioned above.

After ~30 mins of sitting here, I get the following:

| cat invoked oom-killer: gfp_mask=0x500cc2(GFP_HIGHUSER|__GFP_ACCOUNT), order=0, oom_score_adj=0
| CPU: 12 PID: 1999 Comm: cat Tainted: G        W         5.15.0-rc7 #1
| Hardware name: linux,dummy-virt (DT)
| Call trace:
|  dump_backtrace+0x0/0x1a4
|  show_stack+0x24/0x30
|  dump_stack_lvl+0x8c/0xb8
|  dump_stack+0x1c/0x38
|  dump_header+0x50/0x200
|  oom_kill_process+0x208/0x210
|  out_of_memory+0x21c/0x59c
|  __alloc_pages+0xccc/0xdb4
|  alloc_pages+0xb8/0x1a0
|  pipe_write+0x298/0x700
|  new_sync_write+0x17c/0x190
|  vfs_write+0x224/0x290
|  ksys_write+0xe8/0x100
|  __arm64_sys_write+0x28/0x3c
|  invoke_syscall+0x50/0x120
|  el0_svc_common.constprop.0+0x104/0x124
|  do_el0_svc+0x34/0x9c
|  el0_svc+0x2c/0x90
|  el0t_64_sync_handler+0xa8/0x12c
|  el0t_64_sync+0x1a0/0x1a4
| Mem-Info:
| active_anon:18 inactive_anon:172307 isolated_anon:0
|  active_file:0 inactive_file:0 isolated_file:0
|  unevictable:0 dirty:0 writeback:1
|  slab_reclaimable:6665 slab_unreclaimable:11051
|  mapped:0 shmem:101 pagetables:645 bounce:0
|  kernel_misc_reclaimable:0
|  free:10800 free_pcp:0 free_cma:5198
| Node 0 active_anon:72kB inactive_anon:689228kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:4kB shmem:404kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 176128kB writeback_tmp:0kB kernel_stack:5108kB pagetables:2580kB all_unreclaimable? yes
| Node 0 DMA free:42512kB min:22528kB low:28160kB high:33792kB reserved_highatomic:0KB active_anon:72kB inactive_anon:689236kB active_file:0kB inactive_file:260kB unevictable:0kB writepending:0kB present:1048576kB managed:992188kB mlocked:0kB bounce:0kB free_pcp:132kB local_pcp:0kB free_cma:20792kB
| lowmem_reserve[]: 0 0 0 0
| Node 0 DMA: 419*4kB (UEC) 94*8kB (UMEC) 54*16kB (UMEC) 32*32kB (UMEC) 19*64kB (UEC) 12*128kB (UMEC) 6*256kB (UMC) 3*512kB (MC) 1*1024kB (C) 4*2048kB (UMEC) 6*4096kB (MC) = 43932kB
| Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
| Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=32768kB
| Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
| Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=64kB
| 148 total pagecache pages
| 0 pages in swap cache
| Swap cache stats: add 0, delete 0, find 0/0
| Free swap  = 0kB
| Total swap = 0kB
| 262144 pages RAM
| 0 pages HighMem/MovableOnly
| 14097 pages reserved
| 8192 pages cma reserved
| 0 pages hwpoisoned
| Tasks state (memory values in pages):
| [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
| [    337]     0   337    17700      290    86016        0          -250 systemd-journal
| [    364]     0   364     1787      139    53248        0         -1000 systemd-udevd
| [    372]  1002   372     3803      101    45056        0             0 systemd-network
| [    376]  1004   376    20106       91    57344        0             0 systemd-timesyn
| [    378]  1003   378     1615       88    49152        0             0 systemd-resolve
| [    394]  1000   394     1320       85    45056        0          -900 dbus-daemon
| [    397]     0   397      984      132    45056        0             0 bash
| [    406]     0   406     1690      113    53248        0         -1000 sshd
| [   1899]     0  1899      988      161    49152        0             0 ftracetest
| [   1981]     0  1981      988      160    45056        0             0 ftracetest
| [   1982]     0  1982      692       11    40960        0             0 tee
| [   1983]     0  1983     1047      248    49152        0             0 ftracetest
| [   1998]     0  1998     1047      247    45056        0             0 ftracetest
| [   1999]     0  1999      659       22    32768        0             0 cat
| [   2000]     0  2000      692       56    40960        0             0 grep
| [   2001]     0  2001      692       29    40960        0             0 awk
| [   2002]     0  2002   171186   170542  1413120        0             0 sort
| [   2003]     0  2003      692       12    40960        0             0 wc
| oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/system-serial\x2dgetty.slice/serial-getty@ttyAMA0.service,task=sort,pid=2002,uid=0
| Out of memory: Killed process 2002 (sort) total-vm:684744kB, anon-rss:682164kB, file-rss:4kB, shmem-rss:0kB, UID:0 pgtables:1380kB oom_score_adj:0
| oom_reaper: reaped process 2002 (sort), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
  2021-10-28 13:45   ` Mark Rutland
  2021-10-28 14:04     ` Mark Rutland
@ 2021-10-29 11:20     ` Mark Rutland
  2021-10-29 13:15       ` Steven Rostedt
  2021-10-29 15:06     ` Mark Rutland
  2 siblings, 1 reply; 8+ messages in thread
From: Mark Rutland @ 2021-10-29 11:20 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-arm-kernel, broonie, catalin.marinas, madvenka, rostedt

On Thu, Oct 28, 2021 at 02:45:42PM +0100, Mark Rutland wrote:
> I had a go with the selftests and there are some latent issues which show up on
> a pristine v5.15-rc4 (or v5.15-rc7), using defconfig + the ftrace selftest
> config fragment, including an OOM that could be a memory leak.
> 
> I see the following warning:
> 
> | [15] Generic dynamic event - check if duplicate events are caught
> | ------------[ cut here ]------------
> | WARNING: CPU: 24 PID: 912 at kernel/trace/trace_eprobe.c:853 __trace_eprobe_create+0x3ec/0x71c
> | Modules linked in:
> | CPU: 24 PID: 912 Comm: ftracetest Not tainted 5.15.0-rc4 #1
> | Hardware name: linux,dummy-virt (DT)
> | pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> | pc : __trace_eprobe_create+0x3ec/0x71c
> | lr : __trace_eprobe_create+0x3ec/0x71c
> | sp : ffff80001370bb60
> | x29: ffff80001370bb60 x28: ffff5ee1944cc299 x27: ffff80001370bc28
> | x26: 00000000000001c8 x25: ffffa8db50eece78 x24: 0000000000000010
> | x23: ffffa8db50eece48 x22: 00000000ffffffed x21: 0000000000000003
> | x20: ffff5ee1944cc008 x19: ffffa8db4fd2e398 x18: 0000000000000000
> | x17: 7473753a29656d61 x16: 6e656c6966242830 x15: 2b3d656c69662074
> | x14: 616e65706f5f7265 x13: 676e69727473753a x12: 29656d616e656c69
> | x11: 0000000000000003 x10: 0101010101010101 x9 : ffffa8db4edda690
> | x8 : 7f7f7f7f7f7f7f7f x7 : 726b6b6062727872 x6 : 1a171a17020d001f
> | x5 : 1f000d02171a171a x4 : ffff5ee18d2b3900 x3 : ffffa8db50eece48
> | x2 : 0000000000000000 x1 : ffff5ee18d2b3900 x0 : ffff5ee18d2b3900
> | Call trace:
> |  __trace_eprobe_create+0x3ec/0x71c
> |  trace_probe_create+0x90/0xb0
> |  eprobe_dyn_event_create+0x1c/0x30
> |  create_dyn_event+0x74/0x114
> |  trace_parse_run_command+0x104/0x7d0
> |  dyn_event_write+0x1c/0x30
> |  vfs_write+0xfc/0x290
> |  ksys_write+0x74/0x100
> |  __arm64_sys_write+0x28/0x3c
> |  invoke_syscall+0x50/0x120
> |  el0_svc_common.constprop.0+0x104/0x124
> |  do_el0_svc+0x34/0x9c
> |  el0_svc+0x2c/0x90
> |  el0t_64_sync_handler+0xa8/0x12c
> |  el0t_64_sync+0x1a0/0x1a4
> | ---[ end trace 24f9a9e45b173e0b ]---
> |         [FAIL]
> 
> That seems to be the following in __trace_eprobe_create():
> 
> | mutex_lock(&event_mutex);
> | event_call = find_and_get_event(sys_name, sys_event);
> | ep = alloc_event_probe(group, event, event_call, argc - 2);
> | mutex_unlock(&event_mutex);
> | 
> | if (IS_ERR(ep)) {
> | 	ret = PTR_ERR(ep);
> | 	/* This must return -ENOMEM, else there is a bug */
> | 	WARN_ON_ONCE(ret != -ENOMEM);
> | 	ep = NULL;
> | 	goto error;
> | }
> 
> ... but I haven't yet dug into exactly why that's returning an
> unexpected error code.

It turns out this is a latent bug when the user requests a bogus event,
find_and_get_event() fails and returns NULL, causing alloc_event_probe()
to return -ENODEV.

The right thing to do there is to check the result of
find_and_get_event(); I have a local patch for that.

The reason that was bogus at all was that I didn't have
CONFIG_FTRACE_SYSCALLS selected and so `syscalls:sys_enter_openat`
didn't exist. I didn't have that selected because it wasn't in the
Kconfig fragment for the ftrace selftests; I have a local patch for
that.

I'm still working through the other bits. If I sovle that shortly I'll
send that as a series with the two patches above, otherwise I'll send
those out on their own.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
  2021-10-29 11:20     ` Mark Rutland
@ 2021-10-29 13:15       ` Steven Rostedt
  2021-10-29 13:33         ` Mark Rutland
  0 siblings, 1 reply; 8+ messages in thread
From: Steven Rostedt @ 2021-10-29 13:15 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Will Deacon, linux-arm-kernel, broonie, catalin.marinas, madvenka

On Fri, 29 Oct 2021 12:20:34 +0100
Mark Rutland <mark.rutland@arm.com> wrote:

> > That seems to be the following in __trace_eprobe_create():
> > 
> > | mutex_lock(&event_mutex);
> > | event_call = find_and_get_event(sys_name, sys_event);
> > | ep = alloc_event_probe(group, event, event_call, argc - 2);
> > | mutex_unlock(&event_mutex);
> > | 
> > | if (IS_ERR(ep)) {
> > | 	ret = PTR_ERR(ep);
> > | 	/* This must return -ENOMEM, else there is a bug */
> > | 	WARN_ON_ONCE(ret != -ENOMEM);
> > | 	ep = NULL;
> > | 	goto error;
> > | }
> > 
> > ... but I haven't yet dug into exactly why that's returning an
> > unexpected error code.  
> 
> It turns out this is a latent bug when the user requests a bogus event,
> find_and_get_event() fails and returns NULL, causing alloc_event_probe()
> to return -ENODEV.
> 
> The right thing to do there is to check the result of
> find_and_get_event(); I have a local patch for that.

It's been fixed in mainline:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7fa598f9706d40bd16f2ab286bdf5808e1393d35

-- Steve

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
  2021-10-29 13:15       ` Steven Rostedt
@ 2021-10-29 13:33         ` Mark Rutland
  0 siblings, 0 replies; 8+ messages in thread
From: Mark Rutland @ 2021-10-29 13:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Will Deacon, linux-arm-kernel, broonie, catalin.marinas, madvenka

On Fri, Oct 29, 2021 at 09:15:45AM -0400, Steven Rostedt wrote:
> On Fri, 29 Oct 2021 12:20:34 +0100
> Mark Rutland <mark.rutland@arm.com> wrote:
> 
> > > That seems to be the following in __trace_eprobe_create():
> > > 
> > > | mutex_lock(&event_mutex);
> > > | event_call = find_and_get_event(sys_name, sys_event);
> > > | ep = alloc_event_probe(group, event, event_call, argc - 2);
> > > | mutex_unlock(&event_mutex);
> > > | 
> > > | if (IS_ERR(ep)) {
> > > | 	ret = PTR_ERR(ep);
> > > | 	/* This must return -ENOMEM, else there is a bug */
> > > | 	WARN_ON_ONCE(ret != -ENOMEM);
> > > | 	ep = NULL;
> > > | 	goto error;
> > > | }
> > > 
> > > ... but I haven't yet dug into exactly why that's returning an
> > > unexpected error code.  
> > 
> > It turns out this is a latent bug when the user requests a bogus event,
> > find_and_get_event() fails and returns NULL, causing alloc_event_probe()
> > to return -ENODEV.
> > 
> > The right thing to do there is to check the result of
> > find_and_get_event(); I have a local patch for that.
> 
> It's been fixed in mainline:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7fa598f9706d40bd16f2ab286bdf5808e1393d35

Ah, even better; thanks for the pointer!

Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
  2021-10-28 13:45   ` Mark Rutland
  2021-10-28 14:04     ` Mark Rutland
  2021-10-29 11:20     ` Mark Rutland
@ 2021-10-29 15:06     ` Mark Rutland
  2 siblings, 0 replies; 8+ messages in thread
From: Mark Rutland @ 2021-10-29 15:06 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-arm-kernel, broonie, catalin.marinas, madvenka, rostedt

On Thu, Oct 28, 2021 at 02:45:42PM +0100, Mark Rutland wrote:
> I had a go with the selftests and there are some latent issues which show up on
> a pristine v5.15-rc4 (or v5.15-rc7), using defconfig + the ftrace selftest
> config fragment, including an OOM that could be a memory leak.

> Later the tests hang in:
> 
> | [19] event tracing - enable/disable with subsystem level files
> 
> ... and from trying:
> 
> | # ./ftracetest -vvv test.d/event/subsystem-enable.tc 
> 
> ... it seems to hang after the usual reset, on the first part of the
> test, with the last output being:
> 
> | + . /root/ftrace/test.d/event/subsystem-enable.tc
> | ++ echo 'sched:*'
> | ++ yield
> | ++ ping 127.0.0.1 -c 1
> | PING 127.0.0.1 (127.0.0.1): 56 data bytes
> | 64 bytes from 127.0.0.1: seq=0 ttl=64 time=2.442 ms
> | 
> | --- 127.0.0.1 ping statistics ---
> | 1 packets transmitted, 1 packets received, 0% packet loss
> | round-trip min/avg/max = 2.442/2.442/2.442 ms
> | +++ cat trace
> | +++ grep -v '^#'
> | +++ awk '{ print $5 }'
> | +++ sort -u
> | +++ wc -l            

In the end, this is because my test VM has 32 vCPUs and can create a
tremendous amount of trace information. So much so that userspace spends
an incredibly long time parsing it and trying to sort it.

If I boot a 2 vCPU VM, this passes after a couple of seconds.

I have a local patch to make the test more robust to this, which I'll
send out in a bit. There are some other tests that want to scan the
entire trace stream though, so practically it's necessary to run those
with fewer CPUs.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-10-29 15:08 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-27 13:25 [PATCH] arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR Mark Rutland
2021-10-28  8:52 ` Will Deacon
2021-10-28 13:45   ` Mark Rutland
2021-10-28 14:04     ` Mark Rutland
2021-10-29 11:20     ` Mark Rutland
2021-10-29 13:15       ` Steven Rostedt
2021-10-29 13:33         ` Mark Rutland
2021-10-29 15:06     ` Mark Rutland

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.