All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: Chengming Zhou <zhouchengming@bytedance.com>
Cc: rostedt@goodmis.org, mingo@redhat.com, catalin.marinas@arm.com,
	will@kernel.org, tglx@linutronix.de, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	broonie@kernel.org, ardb@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, duanxiongchun@bytedance.com,
	songmuchun@bytedance.com
Subject: Re: [PATCH v4 2/2] arm64/ftrace: Make function graph use ftrace directly
Date: Tue, 19 Apr 2022 13:55:19 +0100	[thread overview]
Message-ID: <Yl6xN2qf7k5YeEdl@FVFF77S0Q05N> (raw)
In-Reply-To: <20220409153554.14470-2-zhouchengming@bytedance.com>

On Sat, Apr 09, 2022 at 11:35:54PM +0800, Chengming Zhou wrote:
> As we do in commit 0c0593b45c9b ("x86/ftrace: Make function graph
> use ftrace directly"), we don't need special hook for graph tracer,
> but instead we use graph_ops:func function to install return_hooker.
> 
> Since commit 3b23e4991fb6 ("arm64: implement ftrace with regs") add
> implementation for FTRACE_WITH_REGS on arm64, we can easily adopt
> the same cleanup on arm64.
> 
> And this cleanup only changes the FTRACE_WITH_REGS implementation,
> so the mcount-based implementation is unaffected.

Could you please say *why* we only do this for FTRACE_WITH_REGS? IIUC that's
possible, but would require more invasive refactoring of the core code; have I
understood correctly?

If so, could we please make this:

| While in theory it would be possible to make a similar cleanup for
| !FTRACE_WITH_REGS, this will require rework of the core code, and so for now
| we only change the FTRACE_WITH_REGS implementation.

It'd be quite nice if we could clean up the !FTRACE_WITH_REGS case similarly,
but as it appeass that would require far more invasive changes, I'm happy to
leave that for future work.

>
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
> ---
> v3:
>  - Add comments in ftrace_graph_func() as suggested by Steve. Thanks.
> 
> v2:
>  - Remove FTRACE_WITH_REGS ftrace_graph_caller asm, thanks Mark.
> ---
>  arch/arm64/include/asm/ftrace.h  |  7 +++++++
>  arch/arm64/kernel/entry-ftrace.S | 17 -----------------
>  arch/arm64/kernel/ftrace.c       | 17 +++++++++++++++++
>  3 files changed, 24 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index 1494cfa8639b..dbc45a4157fa 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -80,8 +80,15 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
>  struct dyn_ftrace;
> +struct ftrace_ops;
> +struct ftrace_regs;
> +
>  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
>  #define ftrace_init_nop ftrace_init_nop
> +
> +void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
> +		       struct ftrace_ops *op, struct ftrace_regs *fregs);
> +#define ftrace_graph_func ftrace_graph_func
>  #endif
>  
>  #define ftrace_return_address(n) return_address(n)
> diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
> index e535480a4069..d42a205ef625 100644
> --- a/arch/arm64/kernel/entry-ftrace.S
> +++ b/arch/arm64/kernel/entry-ftrace.S
> @@ -97,12 +97,6 @@ SYM_CODE_START(ftrace_common)
>  SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
>  	bl	ftrace_stub
>  
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL) // ftrace_graph_caller();
> -	nop				// If enabled, this will be replaced
> -					// "b ftrace_graph_caller"
> -#endif
> -
>  /*
>   * At the callsite x0-x8 and x19-x30 were live. Any C code will have preserved
>   * x19-x29 per the AAPCS, and we created frame records upon entry, so we need
> @@ -127,17 +121,6 @@ ftrace_common_return:
>  	ret	x9
>  SYM_CODE_END(ftrace_common)
>  
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -SYM_CODE_START(ftrace_graph_caller)
> -	ldr	x0, [sp, #S_PC]
> -	sub	x0, x0, #AARCH64_INSN_SIZE	// ip (callsite's BL insn)
> -	add	x1, sp, #S_LR			// parent_ip (callsite's LR)
> -	ldr	x2, [sp, #PT_REGS_SIZE]	   	// parent fp (callsite's FP)
> -	bl	prepare_ftrace_return
> -	b	ftrace_common_return
> -SYM_CODE_END(ftrace_graph_caller)
> -#endif
> -
>  #else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
>  
>  /*
> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> index 4506c4a90ac1..35eb7c9b5e53 100644
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -268,6 +268,22 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
>  }
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE
> +
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> +void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
> +		       struct ftrace_ops *op, struct ftrace_regs *fregs)
> +{
> +	/*
> +	 * Athough graph_ops doesn't have FTRACE_OPS_FL_SAVE_REGS set in flags,
> +	 * regs can't be NULL in DYNAMIC_FTRACE_WITH_REGS. By design, it should
> +	 * be fixed when DYNAMIC_FTRACE_WITH_ARGS is implemented.
> +	 */

This is a bit confusing, since it makes it sound like there's an bug in the
current implementation, rather than something that would need to change if
support for DYNAMIC_FTRACE_WITH_ARGS is added.

Could we please make this:

	/*
	 * When DYNAMIC_FTRACE_WITH_REGS is selected, `fregs` can never be NULL
	 * and arch_ftrace_get_regs(fregs) will always give a non-NULL pt_regs
	 * in which we can safely modify the LR.
	 */

Other than that, this looks good to me. I gave it a spin under QEMU atop
v5.18-rc3. The CONFIG_FTRACE_STARTUP_TEST tests all pass, and I played with the
graph tracer with:

| # echo do_el0_svc > /sys/kernel/tracing/set_graph_function 
| # echo function_graph > /sys/kernel/tracing/current_tracer

... for which the resutls looks sane.

To make sure this didn't adversely affect the return address rewriting, I also
concurrently ran perf with:

| # perf record -g -e raw_syscalls:sys_enter:k /bin/true
| # perf report

... for which the results also looked fine.

I also tested the !DYNAMIC_FTRACE_WITH_REGS modes by building with an older
compiler and also building with !DYNAMIC_FTRACE, which all looked good.

So FWIW:

Tested-by: Mark Rutland <mark.rutland@arm.com>

... and if you make the changes I requested above:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

If you could spin a v5 with that folded in, that would be great.

Thanks,
Mark.

WARNING: multiple messages have this Message-ID
From: Mark Rutland <mark.rutland@arm.com>
To: Chengming Zhou <zhouchengming@bytedance.com>
Cc: rostedt@goodmis.org, mingo@redhat.com, catalin.marinas@arm.com,
	will@kernel.org, tglx@linutronix.de, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	broonie@kernel.org, ardb@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, duanxiongchun@bytedance.com,
	songmuchun@bytedance.com
Subject: Re: [PATCH v4 2/2] arm64/ftrace: Make function graph use ftrace directly
Date: Tue, 19 Apr 2022 13:55:19 +0100	[thread overview]
Message-ID: <Yl6xN2qf7k5YeEdl@FVFF77S0Q05N> (raw)
In-Reply-To: <20220409153554.14470-2-zhouchengming@bytedance.com>

On Sat, Apr 09, 2022 at 11:35:54PM +0800, Chengming Zhou wrote:
> As we do in commit 0c0593b45c9b ("x86/ftrace: Make function graph
> use ftrace directly"), we don't need special hook for graph tracer,
> but instead we use graph_ops:func function to install return_hooker.
> 
> Since commit 3b23e4991fb6 ("arm64: implement ftrace with regs") add
> implementation for FTRACE_WITH_REGS on arm64, we can easily adopt
> the same cleanup on arm64.
> 
> And this cleanup only changes the FTRACE_WITH_REGS implementation,
> so the mcount-based implementation is unaffected.

Could you please say *why* we only do this for FTRACE_WITH_REGS? IIUC that's
possible, but would require more invasive refactoring of the core code; have I
understood correctly?

If so, could we please make this:

| While in theory it would be possible to make a similar cleanup for
| !FTRACE_WITH_REGS, this will require rework of the core code, and so for now
| we only change the FTRACE_WITH_REGS implementation.

It'd be quite nice if we could clean up the !FTRACE_WITH_REGS case similarly,
but as it appeass that would require far more invasive changes, I'm happy to
leave that for future work.

>
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
> ---
> v3:
>  - Add comments in ftrace_graph_func() as suggested by Steve. Thanks.
> 
> v2:
>  - Remove FTRACE_WITH_REGS ftrace_graph_caller asm, thanks Mark.
> ---
>  arch/arm64/include/asm/ftrace.h  |  7 +++++++
>  arch/arm64/kernel/entry-ftrace.S | 17 -----------------
>  arch/arm64/kernel/ftrace.c       | 17 +++++++++++++++++
>  3 files changed, 24 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index 1494cfa8639b..dbc45a4157fa 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -80,8 +80,15 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
>  struct dyn_ftrace;
> +struct ftrace_ops;
> +struct ftrace_regs;
> +
>  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
>  #define ftrace_init_nop ftrace_init_nop
> +
> +void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
> +		       struct ftrace_ops *op, struct ftrace_regs *fregs);
> +#define ftrace_graph_func ftrace_graph_func
>  #endif
>  
>  #define ftrace_return_address(n) return_address(n)
> diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
> index e535480a4069..d42a205ef625 100644
> --- a/arch/arm64/kernel/entry-ftrace.S
> +++ b/arch/arm64/kernel/entry-ftrace.S
> @@ -97,12 +97,6 @@ SYM_CODE_START(ftrace_common)
>  SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
>  	bl	ftrace_stub
>  
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL) // ftrace_graph_caller();
> -	nop				// If enabled, this will be replaced
> -					// "b ftrace_graph_caller"
> -#endif
> -
>  /*
>   * At the callsite x0-x8 and x19-x30 were live. Any C code will have preserved
>   * x19-x29 per the AAPCS, and we created frame records upon entry, so we need
> @@ -127,17 +121,6 @@ ftrace_common_return:
>  	ret	x9
>  SYM_CODE_END(ftrace_common)
>  
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -SYM_CODE_START(ftrace_graph_caller)
> -	ldr	x0, [sp, #S_PC]
> -	sub	x0, x0, #AARCH64_INSN_SIZE	// ip (callsite's BL insn)
> -	add	x1, sp, #S_LR			// parent_ip (callsite's LR)
> -	ldr	x2, [sp, #PT_REGS_SIZE]	   	// parent fp (callsite's FP)
> -	bl	prepare_ftrace_return
> -	b	ftrace_common_return
> -SYM_CODE_END(ftrace_graph_caller)
> -#endif
> -
>  #else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
>  
>  /*
> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> index 4506c4a90ac1..35eb7c9b5e53 100644
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -268,6 +268,22 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
>  }
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE
> +
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> +void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
> +		       struct ftrace_ops *op, struct ftrace_regs *fregs)
> +{
> +	/*
> +	 * Athough graph_ops doesn't have FTRACE_OPS_FL_SAVE_REGS set in flags,
> +	 * regs can't be NULL in DYNAMIC_FTRACE_WITH_REGS. By design, it should
> +	 * be fixed when DYNAMIC_FTRACE_WITH_ARGS is implemented.
> +	 */

This is a bit confusing, since it makes it sound like there's an bug in the
current implementation, rather than something that would need to change if
support for DYNAMIC_FTRACE_WITH_ARGS is added.

Could we please make this:

	/*
	 * When DYNAMIC_FTRACE_WITH_REGS is selected, `fregs` can never be NULL
	 * and arch_ftrace_get_regs(fregs) will always give a non-NULL pt_regs
	 * in which we can safely modify the LR.
	 */

Other than that, this looks good to me. I gave it a spin under QEMU atop
v5.18-rc3. The CONFIG_FTRACE_STARTUP_TEST tests all pass, and I played with the
graph tracer with:

| # echo do_el0_svc > /sys/kernel/tracing/set_graph_function 
| # echo function_graph > /sys/kernel/tracing/current_tracer

... for which the resutls looks sane.

To make sure this didn't adversely affect the return address rewriting, I also
concurrently ran perf with:

| # perf record -g -e raw_syscalls:sys_enter:k /bin/true
| # perf report

... for which the results also looked fine.

I also tested the !DYNAMIC_FTRACE_WITH_REGS modes by building with an older
compiler and also building with !DYNAMIC_FTRACE, which all looked good.

So FWIW:

Tested-by: Mark Rutland <mark.rutland@arm.com>

... and if you make the changes I requested above:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

If you could spin a v5 with that folded in, that would be great.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-04-19 12:55 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-09 15:35 [PATCH v4 1/2] ftrace: cleanup ftrace_graph_caller enable and disable Chengming Zhou
2022-04-09 15:35 ` Chengming Zhou
2022-04-09 15:35 ` [PATCH v4 2/2] arm64/ftrace: Make function graph use ftrace directly Chengming Zhou
2022-04-09 15:35   ` Chengming Zhou
2022-04-19 12:55   ` Mark Rutland [this message]
2022-04-19 12:55     ` Mark Rutland
2022-04-19 13:27     ` [External] " Chengming Zhou
2022-04-19 13:27       ` Chengming Zhou
2022-04-18 13:24 ` [PATCH v4 1/2] ftrace: cleanup ftrace_graph_caller enable and disable Chengming Zhou
2022-04-18 13:24   ` Chengming Zhou
2022-04-19 11:41 ` Mark Rutland
2022-04-19 11:41   ` Mark Rutland
2022-04-20 15:30   ` [External] " Chengming Zhou
2022-04-20 15:30     ` Chengming Zhou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yl6xN2qf7k5YeEdl@FVFF77S0Q05N \
    --to=mark.rutland@arm.com \
    --cc=ardb@kernel.org \
    --cc=bp@alien8.de \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=duanxiongchun@bytedance.com \
    --cc=hpa@zytor.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=songmuchun@bytedance.com \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=zhouchengming@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.