* [PATCH V11 0/5] riscv: Optimize function trace
@ 2023-06-27 11:16 ` Song Shuai
  0 siblings, 0 replies; 28+ messages in thread
From: Song Shuai @ 2023-06-27 11:16 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mhiramat, mark.rutland,
	guoren, suagrfillet, bjorn, jszhang, conor.dooley
  Cc: linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai

Changes in V11:

- append a patch that makes the DIRECT_CALL samples support RV32I in
  this series fixing the rv32 build failure reported by Palmer

- validated with ftrace boottime selftest and manual sample modules test
  in qemu-system for RV32I and RV64I

This series optimizes function trace. The first 3 independent
patches were picked up at the V7 version of this series; subsequent
versions carry the following 4 patches:

select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY [1] (patch 1)
==========================================================

On RISC-V, the -fpatchable-function-entry option has been used to
support dynamic ftrace since commit afc76b8b8011 ("riscv: Using
PATCHABLE_FUNCTION_ENTRY instead of MCOUNT"), so recordmcount
doesn't have to be called to create the __mcount_loc section before
vmlinux is linked.

Select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY to tell the
Makefile not to run recordmcount.
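
For readers unfamiliar with the compiler side of this, here is a minimal
user-space sketch, assuming GCC 8+ or a recent Clang. The per-function
attribute is the counterpart of the -fpatchable-function-entry option;
the names traced_add and call_traced_add and the NOP count of 4 are
illustrative assumptions, not values taken from the RISC-V kernel build.

```c
#include <assert.h>

/*
 * Ask the compiler to reserve 4 NOP-sized slots at the entry of this
 * function and to record the entry address in the
 * __patchable_function_entries section. The count 4 is illustrative.
 */
__attribute__((patchable_function_entry(4)))
int traced_add(int a, int b)
{
	return a + b;
}

/* The NOPs are invisible to callers until something patches them. */
int call_traced_add(void)
{
	int sum = traced_add(2, 3);

	assert(sum == 5);
	return sum;
}
```

Because the compiler itself emits the list of patchable entries, a
post-processing tool such as recordmcount has nothing left to collect.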

Make function graph use ftrace directly [2] (patch 2)
======================================================== 

On RISC-V, when the function graph tracer is enabled on some
functions, function tracing on other functions incurs extra graph
tracing work. In essence, graph_ops isn't limited by its func_hash
because of the global ftrace_graph_[regs]_call label. That should be
corrected.

The inspiration is commit 0c0593b45c9b ("x86/ftrace: Make function
graph use ftrace directly"), which uses the graph_ops::func function to
install return_hooker and makes the function be called according to its
func_hash.
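
The effect of honoring func_hash can be pictured with a small
user-space model. This is only a sketch under stated assumptions:
toy_ops, toy_dispatch and the addresses are hypothetical stand-ins, not
the kernel's struct ftrace_ops or its dispatch code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FILTER 4

/* Toy stand-in for struct ftrace_ops: a callback plus an address filter. */
struct toy_ops {
	void (*func)(unsigned long ip, struct toy_ops *op);
	unsigned long filter[TOY_MAX_FILTER];	/* models func_hash */
	size_t nr_filter;			/* 0 means "match every function" */
	unsigned long hits;			/* how often func actually ran */
};

/* Return 1 if this ops wants to see the function at ip. */
static int toy_ops_test(const struct toy_ops *op, unsigned long ip)
{
	size_t i;

	if (op->nr_filter == 0)
		return 1;
	for (i = 0; i < op->nr_filter; i++)
		if (op->filter[i] == ip)
			return 1;
	return 0;
}

static void toy_count(unsigned long ip, struct toy_ops *op)
{
	(void)ip;
	op->hits++;
}

/* Models the shared trampoline: an ops runs only when its filter matches. */
static void toy_dispatch(struct toy_ops **list, size_t n, unsigned long ip)
{
	size_t i;

	assert(list != NULL);
	for (i = 0; i < n; i++)
		if (toy_ops_test(list[i], ip))
			list[i]->func(ip, list[i]);
}

/* Trace three functions; only 0x1000 is in the graph filter. */
unsigned long toy_graph_hits_demo(void)
{
	struct toy_ops function_ops = { .func = toy_count };
	struct toy_ops graph_ops = {
		.func = toy_count, .filter = { 0x1000 }, .nr_filter = 1,
	};
	struct toy_ops *list[] = { &function_ops, &graph_ops };

	toy_dispatch(list, 2, 0x1000);
	toy_dispatch(list, 2, 0x2000);
	toy_dispatch(list, 2, 0x3000);
	return graph_ops.hits;
}
```

In this model the graph callback runs once for three traced calls. A
global call site that bypasses the filter check would instead run it
for all three, which is the extra work described above.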

Add WITH_DIRECT_CALLS support [3] (patch 3, 4)
==============================================

This series adds DYNAMIC_FTRACE_WITH_DIRECT_CALLS support for RISC-V.
SAMPLE_FTRACE_DIRECT and SAMPLE_FTRACE_DIRECT_MULTI are also included
here as samples for testing the DIRECT_CALLS-related interfaces.

First, select DYNAMIC_FTRACE_WITH_DIRECT_CALLS to provide the
register_ftrace_direct[_multi] interfaces, which allow users to
register a custom trampoline (direct_caller) as the mcount for one or
more target functions. modify_ftrace_direct[_multi] are also
provided for modifying direct_caller.

At the same time, the samples in ./samples/ftrace/ can be built
as kernel modules for testing these interfaces when SAMPLE_FTRACE_DIRECT
and SAMPLE_FTRACE_DIRECT_MULTI are selected.

Second, to let direct_caller and the other ftrace hooks
(e.g. the function/fgraph tracers, k[ret]probes) co-exist, a temporary
register is nominated to store the address of direct_caller in
ftrace_regs_caller. After direct_ops->func sets the direct_caller
address and ftrace_regs_caller runs RESTORE_REGS, the `jr` instruction
jumps to direct_caller.
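
The hand-off through the temporary register can be modelled in user
space. This is a sketch, not the trampoline itself: toy_regs, the field
name tmp, the address 0xd1ec7 and the convention that a zero tmp means
"no direct call registered" are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy register file: only the two slots this model needs. */
struct toy_regs {
	unsigned long epc;	/* where the traced function resumes */
	unsigned long tmp;	/* stands in for the nominated temporary register */
};

typedef void (*toy_hook_t)(struct toy_regs *regs);

/* Hypothetical address of a registered direct_caller trampoline. */
static const unsigned long toy_direct_addr = 0xd1ec7;

/* Models direct_ops->func: publish the direct_caller address. */
static void toy_direct_ops_func(struct toy_regs *regs)
{
	regs->tmp = toy_direct_addr;
}

/* Models any other hook: it leaves the temporary register alone. */
static void toy_function_tracer(struct toy_regs *regs)
{
	(void)regs;
}

/*
 * Models ftrace_regs_caller: run every hook, then return the address
 * the final `jr` would jump to.
 */
static unsigned long toy_regs_caller(unsigned long epc,
				     const toy_hook_t *hooks, size_t n)
{
	struct toy_regs regs = { .epc = epc, .tmp = 0 };
	size_t i;

	assert(hooks != NULL || n == 0);
	for (i = 0; i < n; i++)
		hooks[i](&regs);
	return regs.tmp ? regs.tmp : regs.epc;
}

unsigned long toy_demo_with_direct(void)
{
	const toy_hook_t hooks[] = { toy_function_tracer, toy_direct_ops_func };

	return toy_regs_caller(0x1000, hooks, 2);
}

unsigned long toy_demo_without_direct(void)
{
	const toy_hook_t hooks[] = { toy_function_tracer };

	return toy_regs_caller(0x1000, hooks, 1);
}
```

With a direct call registered, the model jumps to the direct_caller
address; without one, it resumes the traced function. In both cases the
other hooks still run, which is the co-existence property the patch is
after.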

Earlier changes related to these patches
========================================

Changes in v10:
https://lore.kernel.org/all/20230511093234.3123181-1-suagrfillet@gmail.com/

- add Acked-by from Björn Töpel in patch 2 and patch 4 
- replace `move` with `mv` in patch 3
- prettify patch 2/4 with proper tabs

Changes in v9:
https://lore.kernel.org/linux-riscv/20230510101857.2953955-1-suagrfillet@gmail.com/

1. add Acked-by from Björn Töpel in patch 1

2. rebase patch 2/patch 3 on Linux v6.4-rc1

  - patch 2: to make `SAVE_ABI_REGS` configurable, revert the
    modification of mcount-dyn.S from commit 45b32b946a97 ("riscv:
    entry: Consolidate general regs saving/restoring")

  - patch 3: to pass the trace_selftest, add the implementation of
    `ftrace_stub_direct_tramp` from commit fee86a4ed536 ("ftrace:
    selftest: remove broken trace_direct_tramp"), and fix up the
    context conflict in Kconfig

Changes in v8:
https://lore.kernel.org/linux-riscv/20230324033342.3177979-1-suagrfillet@gmail.com/
 - Fix incorrect address values in the 4th patch
 - Rebased on v6.3-rc2

Changes in v7:
https://lore.kernel.org/linux-riscv/20230112090603.1295340-1-guoren@kernel.org/
 - Fix up RESTORE_ABI_REGS by removing the PT_T0(sp) overwrite.
 - Add FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY [1]
 - Fixup kconfig with HAVE_SAMPLE_FTRACE_DIRECT &
   HAVE_SAMPLE_FTRACE_DIRECT_MULTI

Changes in v6:
https://lore.kernel.org/linux-riscv/20230107133549.4192639-1-guoren@kernel.org/
 - Replace 8 with MCOUNT_INSN_SIZE
 - Replace "REG_L a1, PT_RA(sp)" with "mv a1, ra"
 - Add Evgenii Shatokhin comment

Changes in v5:
https://lore.kernel.org/linux-riscv/20221208091244.203407-1-guoren@kernel.org/
 - Sort Kconfig entries in alphabetical order.

Changes in v4:
https://lore.kernel.org/linux-riscv/20221129033230.255947-1-guoren@kernel.org/
 - Include [3] for maintenance. [Song Shuai]

Changes in V3:
https://lore.kernel.org/linux-riscv/20221123153950.2911981-1-guoren@kernel.org/
 - Include [2] for maintenance. [Song Shuai]

[1]: https://lore.kernel.org/linux-riscv/CAAYs2=j3Eak9vU6xbAw0zPuoh00rh8v5C2U3fePkokZFibWs2g@mail.gmail.com/T/#t
[2]: https://lore.kernel.org/lkml/20221120084230.910152-1-suagrfillet@gmail.com/
[3]: https://lore.kernel.org/linux-riscv/20221123142025.1504030-1-suagrfillet@gmail.com/ 

Song Shuai (5):
  riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
  riscv: ftrace: Add ftrace_graph_func
  riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
  samples: ftrace: Add riscv support for SAMPLE_FTRACE_DIRECT[_MULTI]
  samples: ftrace: Make the riscv samples support RV32I

 arch/riscv/Kconfig                          |   4 +
 arch/riscv/include/asm/ftrace.h             |  19 +-
 arch/riscv/kernel/ftrace.c                  |  30 ++-
 arch/riscv/kernel/mcount-dyn.S              | 200 ++++++++++++++++----
 samples/ftrace/ftrace-direct-modify.c       |  35 ++++
 samples/ftrace/ftrace-direct-multi-modify.c |  41 ++++
 samples/ftrace/ftrace-direct-multi.c        |  25 +++
 samples/ftrace/ftrace-direct-too.c          |  28 +++
 samples/ftrace/ftrace-direct.c              |  24 +++
 9 files changed, 350 insertions(+), 56 deletions(-)

-- 
2.20.1



* [PATCH V11 1/5] riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
  2023-06-27 11:16 ` Song Shuai
@ 2023-06-27 11:16   ` Song Shuai
  -1 siblings, 0 replies; 28+ messages in thread
From: Song Shuai @ 2023-06-27 11:16 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mhiramat, mark.rutland,
	guoren, suagrfillet, bjorn, jszhang, conor.dooley
  Cc: linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai

On RISC-V, the -fpatchable-function-entry option has been used to
support dynamic ftrace since commit afc76b8b8011 ("riscv: Using
PATCHABLE_FUNCTION_ENTRY instead of MCOUNT"), so recordmcount
doesn't have to be called to create the __mcount_loc section before
vmlinux is linked.

Select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY to tell the
Makefile not to run recordmcount.

Link: https://lore.kernel.org/linux-riscv/CAAYs2=j3Eak9vU6xbAw0zPuoh00rh8v5C2U3fePkokZFibWs2g@mail.gmail.com/T/#t
Link: https://lore.kernel.org/linux-riscv/Y4jtfrJt+%2FQ5nMOz@spud/
Signed-off-by: Song Shuai <suagrfillet@gmail.com>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Acked-by: Björn Töpel <bjorn@rivosinc.com>
---
 arch/riscv/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 5966ad97c30c..756d854e6cdd 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -59,6 +59,7 @@ config RISCV
 	select COMMON_CLK
 	select CPU_PM if CPU_IDLE || HIBERNATION
 	select EDAC_SUPPORT
+	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY if DYNAMIC_FTRACE
 	select GENERIC_ARCH_TOPOLOGY
 	select GENERIC_ATOMIC64 if !64BIT
 	select GENERIC_CLOCKEVENTS_BROADCAST if SMP
-- 
2.20.1



* [PATCH V11 2/5] riscv: ftrace: Add ftrace_graph_func
  2023-06-27 11:16 ` Song Shuai
@ 2023-06-27 11:16   ` Song Shuai
  -1 siblings, 0 replies; 28+ messages in thread
From: Song Shuai @ 2023-06-27 11:16 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mhiramat, mark.rutland,
	guoren, suagrfillet, bjorn, jszhang, conor.dooley
  Cc: linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai

Implement ftrace_graph_func as the function graph tracing function
when FTRACE_WITH_REGS is defined.

ftrace_graph_func gets the pointer to the parent IP and the frame pointer
from fregs and calls prepare_ftrace_return for function graph tracing.

If FTRACE_WITH_REGS isn't defined, the enable/disable helpers of
ftrace_graph_[regs]_call are revised to serve only ftrace_graph_call
in the !FTRACE_WITH_REGS version of ftrace_caller.
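
Why the callback passes a pointer to the saved ra slot, rather than the
value, can be seen in a user-space model. This is a sketch only:
toy_pt_regs, TOY_RETURN_HOOK and the one-entry saved-parent slot are
hypothetical simplifications of pt_regs, the return hooker and the
return stack kept by the function graph tracer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy register frame as saved on the stack by the trampoline. */
struct toy_pt_regs {
	unsigned long epc;	/* PC of the traced function */
	unsigned long ra;	/* parent IP: where the traced function returns */
};

/* Hypothetical address of the return hooker. */
#define TOY_RETURN_HOOK 0xfeedUL

/* One-entry stand-in for the tracer's saved return addresses. */
static unsigned long toy_saved_parent;

/* Models prepare_ftrace_return: remember the parent, hook the return. */
static void toy_prepare_return(unsigned long *parent, unsigned long self_addr)
{
	assert(parent != NULL);
	(void)self_addr;
	toy_saved_parent = *parent;
	*parent = TOY_RETURN_HOOK;
}

/* Models ftrace_graph_func: operate on the saved frame, not on a copy. */
static void toy_graph_func(unsigned long ip, struct toy_pt_regs *regs)
{
	toy_prepare_return(&regs->ra, ip);
}

/* Returns the ra value the restore step would reload after the hook ran. */
unsigned long toy_graph_demo(void)
{
	struct toy_pt_regs regs = { .epc = 0x1000, .ra = 0x2000 };

	toy_graph_func(regs.epc, &regs);
	return regs.ra;
}

/* Returns the original parent IP remembered by the last hook call. */
unsigned long toy_graph_saved_parent(void)
{
	return toy_saved_parent;
}
```

Since the restore step reloads ra from the saved frame, writing the
hook address into that frame is what redirects the traced function's
return. The parent_ip argument alone is a copy and could not do that.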

Signed-off-by: Song Shuai <suagrfillet@gmail.com>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Acked-by: Björn Töpel <bjorn@rivosinc.com>
---
 arch/riscv/include/asm/ftrace.h |  11 +-
 arch/riscv/kernel/ftrace.c      |  30 +++--
 arch/riscv/kernel/mcount-dyn.S  | 190 +++++++++++++++++++++++++-------
 3 files changed, 175 insertions(+), 56 deletions(-)

diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index d47d87c2d7e3..84f856a3286e 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -107,7 +107,16 @@ do {									\
 struct dyn_ftrace;
 int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
 #define ftrace_init_nop ftrace_init_nop
-#endif
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+struct ftrace_ops;
+struct ftrace_regs;
+void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
+		       struct ftrace_ops *op, struct ftrace_regs *fregs);
+#define ftrace_graph_func ftrace_graph_func
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
+
+#endif /* __ASSEMBLY__ */
 
 #endif /* CONFIG_DYNAMIC_FTRACE */
 
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index 03a6434a8cdd..f5aa24d9e1c1 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -178,32 +178,28 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
 }
 
 #ifdef CONFIG_DYNAMIC_FTRACE
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
+		       struct ftrace_ops *op, struct ftrace_regs *fregs)
+{
+	struct pt_regs *regs = arch_ftrace_get_regs(fregs);
+	unsigned long *parent = (unsigned long *)&regs->ra;
+
+	prepare_ftrace_return(parent, ip, frame_pointer(regs));
+}
+#else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
 extern void ftrace_graph_call(void);
-extern void ftrace_graph_regs_call(void);
 int ftrace_enable_ftrace_graph_caller(void)
 {
-	int ret;
-
-	ret = __ftrace_modify_call((unsigned long)&ftrace_graph_call,
-				    (unsigned long)&prepare_ftrace_return, true, true);
-	if (ret)
-		return ret;
-
-	return __ftrace_modify_call((unsigned long)&ftrace_graph_regs_call,
+	return __ftrace_modify_call((unsigned long)&ftrace_graph_call,
 				    (unsigned long)&prepare_ftrace_return, true, true);
 }
 
 int ftrace_disable_ftrace_graph_caller(void)
 {
-	int ret;
-
-	ret = __ftrace_modify_call((unsigned long)&ftrace_graph_call,
-				    (unsigned long)&prepare_ftrace_return, false, true);
-	if (ret)
-		return ret;
-
-	return __ftrace_modify_call((unsigned long)&ftrace_graph_regs_call,
+	return __ftrace_modify_call((unsigned long)&ftrace_graph_call,
 				    (unsigned long)&prepare_ftrace_return, false, true);
 }
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
 #endif /* CONFIG_DYNAMIC_FTRACE */
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S
index 669b8697aa38..fb8286b80cfc 100644
--- a/arch/riscv/kernel/mcount-dyn.S
+++ b/arch/riscv/kernel/mcount-dyn.S
@@ -57,31 +57,150 @@
 	.endm
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
-	.macro SAVE_ALL
+
+/**
+* SAVE_ABI_REGS - save regs against the pt_regs struct
+*
+* @all: tell if saving all the regs
+*
+* If all is set, all the regs will be saved, otherwise only ABI
+* related regs (a0-a7,epc,ra and optional s0) will be saved.
+*
+* After the stack is established,
+*
+* 0(sp) stores the PC of the traced function which can be accessed
+* by &(fregs)->regs->epc in tracing function. Note that the real
+* function entry address should be computed with -FENTRY_RA_OFFSET.
+*
+* 8(sp) stores the function return address (i.e. parent IP) that
+* can be accessed by &(fregs)->regs->ra in tracing function.
+*
+* The other regs are saved at the respective location and accessed
+* by the respective pt_regs member.
+*
+* Here is the layout of stack for your reference.
+*
+* PT_SIZE_ON_STACK  ->  +++++++++
+*                       + ..... +
+*                       + t3-t6 +
+*                       + s2-s11+
+*                       + a0-a7 + --++++-> ftrace_caller saved
+*                       + s1    +   +
+*                       + s0    + --+
+*                       + t0-t2 +   +
+*                       + tp    +   +
+*                       + gp    +   +
+*                       + sp    +   +
+*                       + ra    + --+ // parent IP
+*               sp  ->  + epc   + --+ // PC
+*                       +++++++++
+**/
+	.macro SAVE_ABI_REGS, all=0
 	addi	sp, sp, -PT_SIZE_ON_STACK
 
-	REG_S t0,  PT_EPC(sp)
-	REG_S x1,  PT_RA(sp)
-	REG_S x2,  PT_SP(sp)
-	REG_S x3,  PT_GP(sp)
-	REG_S x4,  PT_TP(sp)
-	REG_S x5,  PT_T0(sp)
-	save_from_x6_to_x31
+	REG_S	t0,  PT_EPC(sp)
+	REG_S	x1,  PT_RA(sp)
+
+	// save the ABI regs
+
+	REG_S	x10, PT_A0(sp)
+	REG_S	x11, PT_A1(sp)
+	REG_S	x12, PT_A2(sp)
+	REG_S	x13, PT_A3(sp)
+	REG_S	x14, PT_A4(sp)
+	REG_S	x15, PT_A5(sp)
+	REG_S	x16, PT_A6(sp)
+	REG_S	x17, PT_A7(sp)
+
+	// save the leftover regs
+
+	.if \all == 1
+	REG_S	x2, PT_SP(sp)
+	REG_S	x3, PT_GP(sp)
+	REG_S	x4, PT_TP(sp)
+	REG_S	x5, PT_T0(sp)
+	REG_S	x6, PT_T1(sp)
+	REG_S	x7, PT_T2(sp)
+	REG_S	x8, PT_S0(sp)
+	REG_S	x9, PT_S1(sp)
+	REG_S	x18, PT_S2(sp)
+	REG_S	x19, PT_S3(sp)
+	REG_S	x20, PT_S4(sp)
+	REG_S	x21, PT_S5(sp)
+	REG_S	x22, PT_S6(sp)
+	REG_S	x23, PT_S7(sp)
+	REG_S	x24, PT_S8(sp)
+	REG_S	x25, PT_S9(sp)
+	REG_S	x26, PT_S10(sp)
+	REG_S	x27, PT_S11(sp)
+	REG_S	x28, PT_T3(sp)
+	REG_S	x29, PT_T4(sp)
+	REG_S	x30, PT_T5(sp)
+	REG_S	x31, PT_T6(sp)
+
+	// save s0 if FP_TEST defined
+
+	.else
+#ifdef HAVE_FUNCTION_GRAPH_FP_TEST
+	REG_S	x8, PT_S0(sp)
+#endif
+	.endif
 	.endm
 
-	.macro RESTORE_ALL
-	REG_L x1,  PT_RA(sp)
-	REG_L x2,  PT_SP(sp)
-	REG_L x3,  PT_GP(sp)
-	REG_L x4,  PT_TP(sp)
-	/* Restore t0 with PT_EPC */
-	REG_L x5,  PT_EPC(sp)
-	restore_from_x6_to_x31
+	.macro RESTORE_ABI_REGS, all=0
+	REG_L	t0, PT_EPC(sp)
+	REG_L	x1, PT_RA(sp)
+	REG_L	x10, PT_A0(sp)
+	REG_L	x11, PT_A1(sp)
+	REG_L	x12, PT_A2(sp)
+	REG_L	x13, PT_A3(sp)
+	REG_L	x14, PT_A4(sp)
+	REG_L	x15, PT_A5(sp)
+	REG_L	x16, PT_A6(sp)
+	REG_L	x17, PT_A7(sp)
 
+	.if \all == 1
+	REG_L	x2, PT_SP(sp)
+	REG_L	x3, PT_GP(sp)
+	REG_L	x4, PT_TP(sp)
+	REG_L	x6, PT_T1(sp)
+	REG_L	x7, PT_T2(sp)
+	REG_L	x8, PT_S0(sp)
+	REG_L	x9, PT_S1(sp)
+	REG_L	x18, PT_S2(sp)
+	REG_L	x19, PT_S3(sp)
+	REG_L	x20, PT_S4(sp)
+	REG_L	x21, PT_S5(sp)
+	REG_L	x22, PT_S6(sp)
+	REG_L	x23, PT_S7(sp)
+	REG_L	x24, PT_S8(sp)
+	REG_L	x25, PT_S9(sp)
+	REG_L	x26, PT_S10(sp)
+	REG_L	x27, PT_S11(sp)
+	REG_L	x28, PT_T3(sp)
+	REG_L	x29, PT_T4(sp)
+	REG_L	x30, PT_T5(sp)
+	REG_L	x31, PT_T6(sp)
+
+	.else
+#ifdef HAVE_FUNCTION_GRAPH_FP_TEST
+	REG_L	x8, PT_S0(sp)
+#endif
+	.endif
 	addi	sp, sp, PT_SIZE_ON_STACK
 	.endm
+
+	.macro PREPARE_ARGS
+	addi	a0, t0, -FENTRY_RA_OFFSET
+	la	a1, function_trace_op
+	REG_L	a2, 0(a1)
+	mv	a1, ra
+	mv	a3, sp
+	.endm
+
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
 
+#ifndef CONFIG_DYNAMIC_FTRACE_WITH_REGS
 ENTRY(ftrace_caller)
 	SAVE_ABI
 
@@ -107,36 +226,31 @@ ftrace_graph_call:
 	call	ftrace_stub
 #endif
 	RESTORE_ABI
-	jr t0
+	jr	t0
 ENDPROC(ftrace_caller)
 
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+#else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
 ENTRY(ftrace_regs_caller)
-	SAVE_ALL
-
-	addi	a0, t0, -FENTRY_RA_OFFSET
-	la	a1, function_trace_op
-	REG_L	a2, 0(a1)
-	mv	a1, ra
-	mv	a3, sp
+	SAVE_ABI_REGS 1
+	PREPARE_ARGS
 
 ftrace_regs_call:
 	.global ftrace_regs_call
 	call	ftrace_stub
 
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	addi	a0, sp, PT_RA
-	REG_L	a1, PT_EPC(sp)
-	addi	a1, a1, -FENTRY_RA_OFFSET
-#ifdef HAVE_FUNCTION_GRAPH_FP_TEST
-	mv	a2, s0
-#endif
-ftrace_graph_regs_call:
-	.global ftrace_graph_regs_call
+	RESTORE_ABI_REGS 1
+	jr	t0
+ENDPROC(ftrace_regs_caller)
+
+ENTRY(ftrace_caller)
+	SAVE_ABI_REGS 0
+	PREPARE_ARGS
+
+ftrace_call:
+	.global ftrace_call
 	call	ftrace_stub
-#endif
 
-	RESTORE_ALL
-	jr t0
-ENDPROC(ftrace_regs_caller)
+	RESTORE_ABI_REGS 0
+	jr	t0
+ENDPROC(ftrace_caller)
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
-- 
2.20.1


+	.global ftrace_call
 	call	ftrace_stub
-#endif
 
-	RESTORE_ALL
-	jr t0
-ENDPROC(ftrace_regs_caller)
+	RESTORE_ABI_REGS 0
+	jr	t0
+ENDPROC(ftrace_caller)
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
-- 
2.20.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V11 3/5] riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
  2023-06-27 11:16 ` Song Shuai
@ 2023-06-27 11:16   ` Song Shuai
  -1 siblings, 0 replies; 28+ messages in thread
From: Song Shuai @ 2023-06-27 11:16 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mhiramat, mark.rutland,
	guoren, suagrfillet, bjorn, jszhang, conor.dooley
  Cc: linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai

This patch adds DYNAMIC_FTRACE_WITH_DIRECT_CALLS support for RISC-V.

Select DYNAMIC_FTRACE_WITH_DIRECT_CALLS to provide the
register_ftrace_direct[_multi] interfaces, which allow users to register
a custom trampoline (direct_caller) as the mcount for one or more
target functions. modify_ftrace_direct[_multi] are also provided for
modifying direct_caller.

To let direct_caller co-exist with the other ftrace hooks (e.g. the
function/fgraph tracers and k[ret]probes), a temporary register (t1) is
nominated to store the address of direct_caller in ftrace_regs_caller.
After direct_ops->func sets the direct_caller address and
ftrace_regs_caller restores the registers (RESTORE_ABI_REGS), the `jr`
instruction jumps to direct_caller.
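
The dispatch described above can be modelled in plain user-space C. This is
only a rough sketch under stated assumptions, not the kernel implementation:
struct model_regs, model_regs_caller(), model_direct_ops() and the addresses
are hypothetical names invented for illustration, and only the t0/t1 handling
of ftrace_regs_caller is mirrored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two pt_regs slots that matter here. */
struct model_regs {
	unsigned long t0;	/* return address into the traced function */
	unsigned long t1;	/* direct_caller address, 0 if none was set */
};

/* Mirrors the idea of arch_ftrace_set_direct_caller(): record addr in t1. */
static void model_set_direct_caller(struct model_regs *regs, unsigned long addr)
{
	regs->t1 = addr;
}

typedef void (*model_ops_func)(struct model_regs *regs);

/*
 * Models the control flow of ftrace_regs_caller: t1 is cleared on entry,
 * the ops callback may overwrite the saved t1 slot, and after the restore
 * the jump target is t1 when it is non-zero, t0 otherwise.
 */
static unsigned long model_regs_caller(unsigned long t0, model_ops_func func)
{
	struct model_regs regs = { .t0 = t0, .t1 = 0 };	/* mv t1, zero */

	if (func)
		func(&regs);	/* stands in for the patched call site */

	/* bnez t1, .Ldirect ; jr t0 ; .Ldirect: jr t1 */
	return regs.t1 ? regs.t1 : regs.t0;
}

/* Hypothetical ops callback that requests a direct call to 0x2000. */
static void model_direct_ops(struct model_regs *regs)
{
	model_set_direct_caller(regs, 0x2000UL);
}
```

With no callback, or a callback that leaves t1 untouched, the model returns
the original t0, which corresponds to the plain `jr t0` path.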

Signed-off-by: Song Shuai <suagrfillet@gmail.com>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Acked-by: Björn Töpel <bjorn@rivosinc.com>
---
 arch/riscv/Kconfig              |  1 +
 arch/riscv/include/asm/ftrace.h |  8 ++++++++
 arch/riscv/kernel/mcount-dyn.S  | 10 ++++++++++
 3 files changed, 19 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 756d854e6cdd..c3e678450acf 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -145,6 +145,7 @@ config RISCV
 	select UACCESS_MEMCPY if !MMU
 	select ZONE_DMA32 if 64BIT
 	select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && MMU && (CLANG_SUPPORTS_DYNAMIC_FTRACE || GCC_SUPPORTS_DYNAMIC_FTRACE)
+	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
 	select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
 	select HAVE_FUNCTION_GRAPH_TRACER
diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index 84f856a3286e..84904c1e4369 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -114,6 +114,14 @@ struct ftrace_regs;
 void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
 		       struct ftrace_ops *op, struct ftrace_regs *fregs);
 #define ftrace_graph_func ftrace_graph_func
+
+static inline void
+__arch_ftrace_set_direct_caller(struct pt_regs *regs, unsigned long addr)
+{
+	regs->t1 = addr;
+}
+#define arch_ftrace_set_direct_caller(fregs, addr) \
+	__arch_ftrace_set_direct_caller(&(fregs)->regs, addr)
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S
index fb8286b80cfc..b6f4e1847d61 100644
--- a/arch/riscv/kernel/mcount-dyn.S
+++ b/arch/riscv/kernel/mcount-dyn.S
@@ -231,6 +231,7 @@ ENDPROC(ftrace_caller)
 
 #else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
 ENTRY(ftrace_regs_caller)
+	mv	t1, zero
 	SAVE_ABI_REGS 1
 	PREPARE_ARGS
 
@@ -239,7 +240,10 @@ ftrace_regs_call:
 	call	ftrace_stub
 
 	RESTORE_ABI_REGS 1
+	bnez	t1,.Ldirect
 	jr	t0
+.Ldirect:
+	jr	t1
 ENDPROC(ftrace_regs_caller)
 
 ENTRY(ftrace_caller)
@@ -254,3 +258,9 @@ ftrace_call:
 	jr	t0
 ENDPROC(ftrace_caller)
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+SYM_CODE_START(ftrace_stub_direct_tramp)
+	jr	t0
+SYM_CODE_END(ftrace_stub_direct_tramp)
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V11 4/5] samples: ftrace: Add riscv support for SAMPLE_FTRACE_DIRECT[_MULTI]
  2023-06-27 11:16 ` Song Shuai
@ 2023-06-27 11:16   ` Song Shuai
  -1 siblings, 0 replies; 28+ messages in thread
From: Song Shuai @ 2023-06-27 11:16 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mhiramat, mark.rutland,
	guoren, suagrfillet, bjorn, jszhang, conor.dooley
  Cc: linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai,
	Evgenii Shatokhin

Select HAVE_SAMPLE_FTRACE_DIRECT and HAVE_SAMPLE_FTRACE_DIRECT_MULTI
for ARCH_RV64I in arch/riscv/Kconfig, and add RISC-V asm code for
the ftrace-direct*.c files in samples/ftrace/.

Link: https://lore.kernel.org/linux-riscv/c68bac83-5c88-80b1-bac9-e1fd4ea8f07e@yadro.com/T/#ma13012560331c66b051b580b3ab4a04ba44455ec
Tested-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Signed-off-by: Song Shuai <suagrfillet@gmail.com>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Acked-by: Björn Töpel <bjorn@rivosinc.com>
---
 arch/riscv/Kconfig                          |  2 ++
 samples/ftrace/ftrace-direct-modify.c       | 34 ++++++++++++++++++
 samples/ftrace/ftrace-direct-multi-modify.c | 40 +++++++++++++++++++++
 samples/ftrace/ftrace-direct-multi.c        | 24 +++++++++++++
 samples/ftrace/ftrace-direct-too.c          | 27 ++++++++++++++
 samples/ftrace/ftrace-direct.c              | 23 ++++++++++++
 6 files changed, 150 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index c3e678450acf..35d8255a12c6 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -122,6 +122,8 @@ config RISCV
 	select HAVE_POSIX_CPU_TIMERS_TASK_WORK
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RSEQ
+	select HAVE_SAMPLE_FTRACE_DIRECT
+	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select IRQ_DOMAIN
diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
index 06d889149012..e90ca7b68314 100644
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -22,6 +22,40 @@ extern void my_tramp2(void *);
 
 static unsigned long my_ip = (unsigned long)schedule;
 
+#ifdef CONFIG_RISCV
+
+asm (
+"	.pushsection    .text, \"ax\", @progbits\n"
+"	.type		my_tramp1, @function\n"
+"	.globl		my_tramp1\n"
+"   my_tramp1:\n"
+"	addi	sp,sp,-16\n"
+"	sd	t0,0(sp)\n"
+"	sd	ra,8(sp)\n"
+"	call	my_direct_func1\n"
+"	ld	t0,0(sp)\n"
+"	ld	ra,8(sp)\n"
+"	addi	sp,sp,16\n"
+"	jr	t0\n"
+"	.size		my_tramp1, .-my_tramp1\n"
+
+"	.type		my_tramp2, @function\n"
+"	.globl		my_tramp2\n"
+"   my_tramp2:\n"
+"	addi	sp,sp,-16\n"
+"	sd	t0,0(sp)\n"
+"	sd	ra,8(sp)\n"
+"	call	my_direct_func2\n"
+"	ld	t0,0(sp)\n"
+"	ld	ra,8(sp)\n"
+"	addi	sp,sp,16\n"
+"	jr	t0\n"
+"	.size		my_tramp2, .-my_tramp2\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_RISCV */
+
 #ifdef CONFIG_X86_64
 
 #include <asm/ibt.h>
diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c
index 62f6b681999e..5a81af7b3af3 100644
--- a/samples/ftrace/ftrace-direct-multi-modify.c
+++ b/samples/ftrace/ftrace-direct-multi-modify.c
@@ -20,6 +20,46 @@ void my_direct_func2(unsigned long ip)
 extern void my_tramp1(void *);
 extern void my_tramp2(void *);
 
+#ifdef CONFIG_RISCV
+
+asm (
+"	.pushsection    .text, \"ax\", @progbits\n"
+"	.type		my_tramp1, @function\n"
+"	.globl		my_tramp1\n"
+"   my_tramp1:\n"
+"       addi	sp,sp,-24\n"
+"       sd	a0,0(sp)\n"
+"       sd	t0,8(sp)\n"
+"       sd	ra,16(sp)\n"
+"       mv	a0,t0\n"
+"       call	my_direct_func1\n"
+"       ld	a0,0(sp)\n"
+"       ld	t0,8(sp)\n"
+"       ld	ra,16(sp)\n"
+"       addi	sp,sp,24\n"
+"	jr	t0\n"
+"	.size		my_tramp1, .-my_tramp1\n"
+
+"	.type		my_tramp2, @function\n"
+"	.globl		my_tramp2\n"
+"   my_tramp2:\n"
+"       addi	sp,sp,-24\n"
+"       sd	a0,0(sp)\n"
+"       sd	t0,8(sp)\n"
+"       sd	ra,16(sp)\n"
+"       mv	a0,t0\n"
+"       call	my_direct_func2\n"
+"       ld	a0,0(sp)\n"
+"       ld	t0,8(sp)\n"
+"       ld	ra,16(sp)\n"
+"       addi	sp,sp,24\n"
+"	jr	t0\n"
+"	.size		my_tramp2, .-my_tramp2\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_RISCV */
+
 #ifdef CONFIG_X86_64
 
 #include <asm/ibt.h>
diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
index 5482cf616b43..0e9bb94edade 100644
--- a/samples/ftrace/ftrace-direct-multi.c
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -15,6 +15,30 @@ void my_direct_func(unsigned long ip)
 
 extern void my_tramp(void *);
 
+#ifdef CONFIG_RISCV
+
+asm (
+"       .pushsection    .text, \"ax\", @progbits\n"
+"       .type           my_tramp, @function\n"
+"       .globl          my_tramp\n"
+"   my_tramp:\n"
+"       addi	sp,sp,-24\n"
+"       sd	a0,0(sp)\n"
+"       sd	t0,8(sp)\n"
+"       sd	ra,16(sp)\n"
+"       mv	a0,t0\n"
+"       call	my_direct_func\n"
+"       ld	a0,0(sp)\n"
+"       ld	t0,8(sp)\n"
+"       ld	ra,16(sp)\n"
+"       addi	sp,sp,24\n"
+"       jr	t0\n"
+"       .size           my_tramp, .-my_tramp\n"
+"       .popsection\n"
+);
+
+#endif /* CONFIG_RISCV */
+
 #ifdef CONFIG_X86_64
 
 #include <asm/ibt.h>
diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index a05bc2cc2261..5c319db48af2 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -17,6 +17,33 @@ void my_direct_func(struct vm_area_struct *vma,
 
 extern void my_tramp(void *);
 
+#ifdef CONFIG_RISCV
+
+asm (
+"       .pushsection    .text, \"ax\", @progbits\n"
+"       .type           my_tramp, @function\n"
+"       .globl          my_tramp\n"
+"   my_tramp:\n"
+"       addi	sp,sp,-40\n"
+"       sd	a0,0(sp)\n"
+"       sd	a1,8(sp)\n"
+"       sd	a2,16(sp)\n"
+"       sd	t0,24(sp)\n"
+"       sd	ra,32(sp)\n"
+"       call	my_direct_func\n"
+"       ld	a0,0(sp)\n"
+"       ld	a1,8(sp)\n"
+"       ld	a2,16(sp)\n"
+"       ld	t0,24(sp)\n"
+"       ld	ra,32(sp)\n"
+"       addi	sp,sp,40\n"
+"       jr	t0\n"
+"       .size           my_tramp, .-my_tramp\n"
+"       .popsection\n"
+);
+
+#endif /* CONFIG_RISCV */
+
 #ifdef CONFIG_X86_64
 
 #include <asm/ibt.h>
diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
index 06879bbd3399..ca95506b0350 100644
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -14,6 +14,29 @@ void my_direct_func(struct task_struct *p)
 
 extern void my_tramp(void *);
 
+#ifdef CONFIG_RISCV
+
+asm (
+"       .pushsection    .text, \"ax\", @progbits\n"
+"       .type           my_tramp, @function\n"
+"       .globl          my_tramp\n"
+"   my_tramp:\n"
+"       addi	sp,sp,-24\n"
+"       sd	a0,0(sp)\n"
+"       sd	t0,8(sp)\n"
+"       sd	ra,16(sp)\n"
+"       call	my_direct_func\n"
+"       ld	a0,0(sp)\n"
+"       ld	t0,8(sp)\n"
+"       ld	ra,16(sp)\n"
+"       addi	sp,sp,24\n"
+"       jr	t0\n"
+"       .size           my_tramp, .-my_tramp\n"
+"       .popsection\n"
+);
+
+#endif /* CONFIG_RISCV */
+
 #ifdef CONFIG_X86_64
 
 #include <asm/ibt.h>
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V11 5/5] samples: ftrace: Make the riscv samples support RV32I
  2023-06-27 11:16 ` Song Shuai
@ 2023-06-27 11:16   ` Song Shuai
  -1 siblings, 0 replies; 28+ messages in thread
From: Song Shuai @ 2023-06-27 11:16 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mhiramat, mark.rutland,
	guoren, suagrfillet, bjorn, jszhang, conor.dooley
  Cc: linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai

Since commit f32b4b467ebd ("RISC-V: enable dynamic ftrace for
RV32I") enables dynamic ftrace for RV32I, make these RISC-V samples
support RV32I as well.
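
As an illustration of why the conversion works, the sketch below shows how
adjacent C string literals turn the REG_S/REG_L/SZREG spellings into
width-specific assembly text. It is a user-space approximation under stated
assumptions: the macro definitions are simplified stand-ins for the string
forms the samples take from <asm/asm.h>, and MODEL_XLEN and my_tramp_prologue
are hypothetical names used only here.

```c
#include <assert.h>
#include <string.h>

/*
 * Simplified stand-ins (assumption): the kernel header selects the
 * store/load mnemonics and the register size from the target XLEN.
 * MODEL_XLEN is a hypothetical knob so the sketch builds on any host.
 */
#ifndef MODEL_XLEN
#define MODEL_XLEN 32
#endif

#if MODEL_XLEN == 64
#define REG_S "sd"
#define REG_L "ld"
#define SZREG "8"
#else
#define REG_S "sw"
#define REG_L "lw"
#define SZREG "4"
#endif

/*
 * Adjacent string literals are concatenated by the compiler, so the
 * assembler later sees e.g. "addi sp,sp,-3*4" and evaluates the
 * multiplication itself.
 */
static const char my_tramp_prologue[] =
	"\taddi\tsp,sp,-3*" SZREG "\n"
	"\t" REG_S "\ta0,0*" SZREG "(sp)\n"
	"\t" REG_S "\tt0,1*" SZREG "(sp)\n"
	"\t" REG_S "\tra,2*" SZREG "(sp)\n";
```

Building the same sketch with -DMODEL_XLEN=64 yields the sd/8 spellings that
correspond to the fixed offsets (0, 8, 16) removed by this patch.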

Link: https://lore.kernel.org/all/mhng-29a592bf-1b25-4c6c-8f37-0d05d39bc093@palmer-ri-x1c9a/
Signed-off-by: Song Shuai <suagrfillet@gmail.com>
---
 samples/ftrace/ftrace-direct-modify.c       | 27 +++++++++--------
 samples/ftrace/ftrace-direct-multi-modify.c | 33 +++++++++++----------
 samples/ftrace/ftrace-direct-multi.c        | 17 ++++++-----
 samples/ftrace/ftrace-direct-too.c          | 25 ++++++++--------
 samples/ftrace/ftrace-direct.c              | 17 ++++++-----
 5 files changed, 62 insertions(+), 57 deletions(-)

diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
index e90ca7b68314..071cf4093a24 100644
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -23,32 +23,33 @@ extern void my_tramp2(void *);
 static unsigned long my_ip = (unsigned long)schedule;
 
 #ifdef CONFIG_RISCV
+#include <asm/asm.h>
 
 asm (
 "	.pushsection    .text, \"ax\", @progbits\n"
 "	.type		my_tramp1, @function\n"
 "	.globl		my_tramp1\n"
 "   my_tramp1:\n"
-"	addi	sp,sp,-16\n"
-"	sd	t0,0(sp)\n"
-"	sd	ra,8(sp)\n"
+"	addi	sp,sp,-2*"SZREG"\n"
+"	"REG_S"	t0,0*"SZREG"(sp)\n"
+"	"REG_S"	ra,1*"SZREG"(sp)\n"
 "	call	my_direct_func1\n"
-"	ld	t0,0(sp)\n"
-"	ld	ra,8(sp)\n"
-"	addi	sp,sp,16\n"
+"	"REG_L"	t0,0*"SZREG"(sp)\n"
+"	"REG_L"	ra,1*"SZREG"(sp)\n"
+"	addi	sp,sp,2*"SZREG"\n"
 "	jr	t0\n"
 "	.size		my_tramp1, .-my_tramp1\n"
-
 "	.type		my_tramp2, @function\n"
 "	.globl		my_tramp2\n"
+
 "   my_tramp2:\n"
-"	addi	sp,sp,-16\n"
-"	sd	t0,0(sp)\n"
-"	sd	ra,8(sp)\n"
+"	addi	sp,sp,-2*"SZREG"\n"
+"	"REG_S"	t0,0*"SZREG"(sp)\n"
+"	"REG_S"	ra,1*"SZREG"(sp)\n"
 "	call	my_direct_func2\n"
-"	ld	t0,0(sp)\n"
-"	ld	ra,8(sp)\n"
-"	addi	sp,sp,16\n"
+"	"REG_L"	t0,0*"SZREG"(sp)\n"
+"	"REG_L"	ra,1*"SZREG"(sp)\n"
+"	addi	sp,sp,2*"SZREG"\n"
 "	jr	t0\n"
 "	.size		my_tramp2, .-my_tramp2\n"
 "	.popsection\n"
diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c
index 5a81af7b3af3..b754803d0a50 100644
--- a/samples/ftrace/ftrace-direct-multi-modify.c
+++ b/samples/ftrace/ftrace-direct-multi-modify.c
@@ -21,38 +21,39 @@ extern void my_tramp1(void *);
 extern void my_tramp2(void *);
 
 #ifdef CONFIG_RISCV
+#include <asm/asm.h>
 
 asm (
 "	.pushsection    .text, \"ax\", @progbits\n"
 "	.type		my_tramp1, @function\n"
 "	.globl		my_tramp1\n"
 "   my_tramp1:\n"
-"       addi	sp,sp,-24\n"
-"       sd	a0,0(sp)\n"
-"       sd	t0,8(sp)\n"
-"       sd	ra,16(sp)\n"
+"       addi	sp,sp,-3*"SZREG"\n"
+"       "REG_S"	a0,0*"SZREG"(sp)\n"
+"       "REG_S"	t0,1*"SZREG"(sp)\n"
+"       "REG_S"	ra,2*"SZREG"(sp)\n"
 "       mv	a0,t0\n"
 "       call	my_direct_func1\n"
-"       ld	a0,0(sp)\n"
-"       ld	t0,8(sp)\n"
-"       ld	ra,16(sp)\n"
-"       addi	sp,sp,24\n"
+"       "REG_L"	a0,0*"SZREG"(sp)\n"
+"       "REG_L"	t0,1*"SZREG"(sp)\n"
+"       "REG_L"	ra,2*"SZREG"(sp)\n"
+"       addi	sp,sp,3*"SZREG"\n"
 "	jr	t0\n"
 "	.size		my_tramp1, .-my_tramp1\n"
 
 "	.type		my_tramp2, @function\n"
 "	.globl		my_tramp2\n"
 "   my_tramp2:\n"
-"       addi	sp,sp,-24\n"
-"       sd	a0,0(sp)\n"
-"       sd	t0,8(sp)\n"
-"       sd	ra,16(sp)\n"
+"       addi	sp,sp,-3*"SZREG"\n"
+"       "REG_S"	a0,0*"SZREG"(sp)\n"
+"       "REG_S"	t0,1*"SZREG"(sp)\n"
+"       "REG_S"	ra,2*"SZREG"(sp)\n"
 "       mv	a0,t0\n"
 "       call	my_direct_func2\n"
-"       ld	a0,0(sp)\n"
-"       ld	t0,8(sp)\n"
-"       ld	ra,16(sp)\n"
-"       addi	sp,sp,24\n"
+"       "REG_L"	a0,0*"SZREG"(sp)\n"
+"       "REG_L"	t0,1*"SZREG"(sp)\n"
+"       "REG_L"	ra,2*"SZREG"(sp)\n"
+"       addi	sp,sp,3*"SZREG"\n"
 "	jr	t0\n"
 "	.size		my_tramp2, .-my_tramp2\n"
 "	.popsection\n"
diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
index 0e9bb94edade..a31f43ace85c 100644
--- a/samples/ftrace/ftrace-direct-multi.c
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -16,22 +16,23 @@ void my_direct_func(unsigned long ip)
 extern void my_tramp(void *);
 
 #ifdef CONFIG_RISCV
+#include <asm/asm.h>
 
 asm (
 "       .pushsection    .text, \"ax\", @progbits\n"
 "       .type           my_tramp, @function\n"
 "       .globl          my_tramp\n"
 "   my_tramp:\n"
-"       addi	sp,sp,-24\n"
-"       sd	a0,0(sp)\n"
-"       sd	t0,8(sp)\n"
-"       sd	ra,16(sp)\n"
+"       addi	sp,sp,-3*"SZREG"\n"
+"       "REG_S"	a0,0*"SZREG"(sp)\n"
+"       "REG_S"	t0,1*"SZREG"(sp)\n"
+"       "REG_S"	ra,2*"SZREG"(sp)\n"
 "       mv	a0,t0\n"
 "       call	my_direct_func\n"
-"       ld	a0,0(sp)\n"
-"       ld	t0,8(sp)\n"
-"       ld	ra,16(sp)\n"
-"       addi	sp,sp,24\n"
+"       "REG_L"	a0,0*"SZREG"(sp)\n"
+"       "REG_L"	t0,1*"SZREG"(sp)\n"
+"       "REG_L"	ra,2*"SZREG"(sp)\n"
+"       addi	sp,sp,3*"SZREG"\n"
 "       jr	t0\n"
 "       .size           my_tramp, .-my_tramp\n"
 "       .popsection\n"
diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index 5c319db48af2..a1f86dd48847 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -18,25 +18,26 @@ void my_direct_func(struct vm_area_struct *vma,
 extern void my_tramp(void *);
 
 #ifdef CONFIG_RISCV
+#include <asm/asm.h>
 
 asm (
 "       .pushsection    .text, \"ax\", @progbits\n"
 "       .type           my_tramp, @function\n"
 "       .globl          my_tramp\n"
 "   my_tramp:\n"
-"       addi	sp,sp,-40\n"
-"       sd	a0,0(sp)\n"
-"       sd	a1,8(sp)\n"
-"       sd	a2,16(sp)\n"
-"       sd	t0,24(sp)\n"
-"       sd	ra,32(sp)\n"
+"       addi	sp,sp,-5*"SZREG"\n"
+"       "REG_S"	a0,0*"SZREG"(sp)\n"
+"       "REG_S"	a1,1*"SZREG"(sp)\n"
+"       "REG_S"	a2,2*"SZREG"(sp)\n"
+"       "REG_S"	t0,3*"SZREG"(sp)\n"
+"       "REG_S"	ra,4*"SZREG"(sp)\n"
 "       call	my_direct_func\n"
-"       ld	a0,0(sp)\n"
-"       ld	a1,8(sp)\n"
-"       ld	a2,16(sp)\n"
-"       ld	t0,24(sp)\n"
-"       ld	ra,32(sp)\n"
-"       addi	sp,sp,40\n"
+"       "REG_L"	a0,0*"SZREG"(sp)\n"
+"       "REG_L"	a1,1*"SZREG"(sp)\n"
+"       "REG_L"	a2,2*"SZREG"(sp)\n"
+"       "REG_L"	t0,3*"SZREG"(sp)\n"
+"       "REG_L"	ra,4*"SZREG"(sp)\n"
+"       addi	sp,sp,5*"SZREG"\n"
 "       jr	t0\n"
 "       .size           my_tramp, .-my_tramp\n"
 "       .popsection\n"
diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
index ca95506b0350..fe6b7ef0a2d5 100644
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -15,21 +15,22 @@ void my_direct_func(struct task_struct *p)
 extern void my_tramp(void *);
 
 #ifdef CONFIG_RISCV
+#include <asm/asm.h>
 
 asm (
 "       .pushsection    .text, \"ax\", @progbits\n"
 "       .type           my_tramp, @function\n"
 "       .globl          my_tramp\n"
 "   my_tramp:\n"
-"       addi	sp,sp,-24\n"
-"       sd	a0,0(sp)\n"
-"       sd	t0,8(sp)\n"
-"       sd	ra,16(sp)\n"
+"       addi	sp,sp,-3*"SZREG"\n"
+"       "REG_S"	a0,0*"SZREG"(sp)\n"
+"       "REG_S"	t0,1*"SZREG"(sp)\n"
+"       "REG_S"	ra,2*"SZREG"(sp)\n"
 "       call	my_direct_func\n"
-"       ld	a0,0(sp)\n"
-"       ld	t0,8(sp)\n"
-"       ld	ra,16(sp)\n"
-"       addi	sp,sp,24\n"
+"       "REG_L"	a0,0*"SZREG"(sp)\n"
+"       "REG_L"	t0,1*"SZREG"(sp)\n"
+"       "REG_L"	ra,2*"SZREG"(sp)\n"
+"       addi	sp,sp,3*"SZREG"\n"
 "       jr	t0\n"
 "       .size           my_tramp, .-my_tramp\n"
 "       .popsection\n"
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH V11 0/5] riscv: Optimize function trace
  2023-06-27 11:16 ` Song Shuai
@ 2023-07-06  9:35   ` Song Shuai
  -1 siblings, 0 replies; 28+ messages in thread
From: Song Shuai @ 2023-07-06  9:35 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mhiramat, mark.rutland,
	guoren, bjorn, jszhang, conor.dooley
  Cc: linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai

Ping...

On 2023/6/27 19:16, Song Shuai wrote:
> Changes in V11:
> 
> - append a patch that makes the DIRECT_CALL samples support RV32I in
>    this series fixing the rv32 build failure reported by Palmer
> 
> - validated with ftrace boottime selftest and manual sample modules test
>    in qemu-system for RV32I and RV64I
> 
> This series optimizes function trace. The first 3 independent
> patches were picked from the V7 version of this series; the
> subsequent versions continue with the following 4 patches:
> 
> select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY [1] (patch 1)
> ==========================================================
> 
> In RISC-V, the -fpatchable-function-entry option has been used to
> support dynamic ftrace since commit afc76b8b8011 ("riscv: Using
> PATCHABLE_FUNCTION_ENTRY instead of MCOUNT"). So recordmcount
> doesn't have to be called to create the __mcount_loc section before
> the vmlinux linking.
> 
> This patch selects FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY to tell
> the Makefile not to run recordmcount.
> 
> Make function graph use ftrace directly [2] (patch 2)
> ========================================================
> 
> In RISC-V architecture, when we enable the ftrace_graph tracer on some
> functions, the function tracings on other functions will suffer extra
> graph tracing work. In essence, graph_ops isn't limited by its func_hash
> due to the global ftrace_graph_[regs]_call label. That should be
> corrected.
> 
> What inspires me is the commit 0c0593b45c9b ("x86/ftrace: Make function
> graph use ftrace directly") that uses graph_ops::func function to
> install return_hooker and makes the function called against its
> func_hash.
> 
> Add WITH_DIRECT_CALLS support [3] (patch 3, 4)
> ==============================================
> 
> This series adds DYNAMIC_FTRACE_WITH_DIRECT_CALLS support for RISC-V.
> SAMPLE_FTRACE_DIRECT and SAMPLE_FTRACE_DIRECT_MULTI are also included
> here as the samples for testing DIRECT_CALLS related interface.
> 
> First, select the DYNAMIC_FTRACE_WITH_DIRECT_CALLS to provide
> register_ftrace_direct[_multi] interfaces allowing users to register
> a custom trampoline (direct_caller) as the mcount for one or
> more target functions. modify_ftrace_direct[_multi] are also
> provided for modifying the direct_caller.
> 
> At the same time, the samples in ./samples/ftrace/ can be built
> as kernel modules for testing these interfaces with SAMPLE_FTRACE_DIRECT
> and SAMPLE_FTRACE_DIRECT_MULTI selected.
> 
> Second, to make the direct_caller and the other ftrace hooks
> (e.g. function/fgraph tracer, k[ret]probes) co-exist, a temporary
> register is nominated to store the address of direct_caller in
> ftrace_regs_caller. After direct_ops->func sets the direct_caller
> address and ftrace_regs_caller runs RESTORE_REGS, the `jr` instruction
> jumps to direct_caller.
> 
> The series's old changes related to these patches
> =================================================
> 
> Changes in v10:
> https://lore.kernel.org/all/20230511093234.3123181-1-suagrfillet@gmail.com/
> 
> - add Acked-by from Björn Töpel in patch 2 and patch 4
> - replace `move` with `mv` in patch3
> - prettify patch 2/4 with proper tabs
> 
> Changes in v9:
> https://lore.kernel.org/linux-riscv/20230510101857.2953955-1-suagrfillet@gmail.com/
> 
> 1. add Acked-by from Björn Töpel in patch 1
> 
> 2. rebase patch2/patch3 on Linux v6.4-rc1
> 
>    - patch 2: to make the `SAVE_ABI_REGS` configurable, revert the
>      modification of mcount-dyn.S from commit 45b32b946a97 ("riscv:
>      entry: Consolidate general regs saving/restoring")
> 
>    - patch 3: to pass the trace_selftest, add the implementation of
>      `ftrace_stub_direct_tramp` from commit fee86a4ed536 ("ftrace:
>      selftest: remove broken trace_direct_tramp"), and fix up the
>      context conflict in Kconfig
> 
> Changes in v8:
> https://lore.kernel.org/linux-riscv/20230324033342.3177979-1-suagrfillet@gmail.com/
>   - Fix incorrect address values in the 4th patch
>   - Rebased on v6.3-rc2
> 
> Changes in v7:
> https://lore.kernel.org/linux-riscv/20230112090603.1295340-1-guoren@kernel.org/
>   - Fixup RESTORE_ABI_REGS by removing the PT_T0(sp) overwrite.
>   - Add FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY [1]
>   - Fixup kconfig with HAVE_SAMPLE_FTRACE_DIRECT &
>     HAVE_SAMPLE_FTRACE_DIRECT_MULTI
> 
> Changes in v6:
> https://lore.kernel.org/linux-riscv/20230107133549.4192639-1-guoren@kernel.org/
>   - Replace 8 with MCOUNT_INSN_SIZE
>   - Replace "REG_L a1, PT_RA(sp)" with "mv a1, ra"
>   - Add Evgenii Shatokhin comment
> 
> Changes in v5:
> https://lore.kernel.org/linux-riscv/20221208091244.203407-1-guoren@kernel.org/
>   - Sort Kconfig entries in alphabetical order.
> 
> Changes in v4:
> https://lore.kernel.org/linux-riscv/20221129033230.255947-1-guoren@kernel.org/
>   - Include [3] for maintenance. [Song Shuai]
> 
> Changes in V3:
> https://lore.kernel.org/linux-riscv/20221123153950.2911981-1-guoren@kernel.org/
>   - Include [2] for maintenance. [Song Shuai]
> 
> [1]: https://lore.kernel.org/linux-riscv/CAAYs2=j3Eak9vU6xbAw0zPuoh00rh8v5C2U3fePkokZFibWs2g@mail.gmail.com/T/#t
> [2]: https://lore.kernel.org/lkml/20221120084230.910152-1-suagrfillet@gmail.com/
> [3]: https://lore.kernel.org/linux-riscv/20221123142025.1504030-1-suagrfillet@gmail.com/
> 
> Song Shuai (5):
>    riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
>    riscv: ftrace: Add ftrace_graph_func
>    riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
>    samples: ftrace: Add riscv support for SAMPLE_FTRACE_DIRECT[_MULTI]
>    samples: ftrace: Make the riscv samples support RV32I
> 
>   arch/riscv/Kconfig                          |   4 +
>   arch/riscv/include/asm/ftrace.h             |  19 +-
>   arch/riscv/kernel/ftrace.c                  |  30 ++-
>   arch/riscv/kernel/mcount-dyn.S              | 200 ++++++++++++++++----
>   samples/ftrace/ftrace-direct-modify.c       |  35 ++++
>   samples/ftrace/ftrace-direct-multi-modify.c |  41 ++++
>   samples/ftrace/ftrace-direct-multi.c        |  25 +++
>   samples/ftrace/ftrace-direct-too.c          |  28 +++
>   samples/ftrace/ftrace-direct.c              |  24 +++
>   9 files changed, 350 insertions(+), 56 deletions(-)
> 

-- 
Thanks
Song Shuai

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V11 0/5] riscv: Optimize function trace
  2023-07-06  9:35   ` Song Shuai
@ 2023-07-06  9:53     ` Conor Dooley
  -1 siblings, 0 replies; 28+ messages in thread
From: Conor Dooley @ 2023-07-06  9:53 UTC (permalink / raw)
  To: Song Shuai
  Cc: paul.walmsley, palmer, aou, rostedt, mhiramat, mark.rutland,
	guoren, bjorn, jszhang, linux-riscv, linux-kernel,
	linux-trace-kernel, songshuaishuai

On Thu, Jul 06, 2023 at 05:35:49PM +0800, Song Shuai wrote:
> Ping...

A context-less ping is not very helpful - what are you looking for here?
More reviews? For example, someone to look at 5/5?

> On 2023/6/27 19:16, Song Shuai wrote:

If it's application you want, you sent the patch only last week - which
was during the merge window, making it unlikely to be applied.

Either way, please try to explain what it is that you are looking for
when you do a ping!

Cheers,
Conor.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V11 0/5] riscv: Optimize function trace
  2023-07-06  9:53     ` Conor Dooley
@ 2023-07-06 10:10       ` Song Shuai
  -1 siblings, 0 replies; 28+ messages in thread
From: Song Shuai @ 2023-07-06 10:10 UTC (permalink / raw)
  To: Conor Dooley, Song Shuai
  Cc: paul.walmsley, palmer, aou, rostedt, mhiramat, mark.rutland,
	guoren, bjorn, jszhang, linux-riscv, linux-kernel,
	linux-trace-kernel



On 2023/7/6 17:53, Conor Dooley wrote:
> On Thu, Jul 06, 2023 at 05:35:49PM +0800, Song Shuai wrote:
>> Ping...
> 
> A context-less ping is not very helpful - what are you looking for here?
> More reviews? For example, someone to look at 5/5?
>

Sorry for the context-less ping. I hoped someone could look at the 5th 
patch.

>> On 2023/6/27 19:16, Song Shuai wrote:
> 
> If it's application you want, you sent the patch only last week - which
> was during the merge window, making it unlikely to be applied.
>
> Either way, please try to explain what it is that you are looking for
> when you do a ping!

Thanks for the correction. I'll follow up on this thread after the merge
window closes.

> 
> Cheers,
> Conor.
> 

-- 
Thanks
Song Shuai

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V11 0/5] riscv: Optimize function trace
  2023-06-27 11:16 ` Song Shuai
@ 2023-07-12 18:11   ` Björn Töpel
  -1 siblings, 0 replies; 28+ messages in thread
From: Björn Töpel @ 2023-07-12 18:11 UTC (permalink / raw)
  To: Song Shuai, paul.walmsley, palmer, aou, rostedt, mhiramat,
	mark.rutland, guoren, suagrfillet, bjorn, jszhang, conor.dooley,
	Pu Lehui, palmer
  Cc: linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai, bpf

Song Shuai <suagrfillet@gmail.com> writes:

[...]

> Add WITH_DIRECT_CALLS support [3] (patch 3, 4)
> ==============================================

We've had some offlist discussions, so here's some input for a wider
audience! Most importantly, this is for Palmer, so that this series is
not merged until a proper BPF trampoline fix is in place.

Note that what's currently usable from BPF trampoline *works*. It's
when this series is added that it breaks.

TL;DR: This series adds DYNAMIC_FTRACE_WITH_DIRECT_CALLS, which enables
fentry/fexit BPF trampoline support. Unfortunately, the
fexit/BPF_TRAMP_F_SKIP_FRAME parts of the RV BPF trampoline break
with this addition, and need to be addressed *prior* to merging this
series. An easy way to reproduce this is to run any of the kselftest
tests that use fexit patching.

The issue is around the nop sled, and how a call is done. The nop sled
(patchable-function-entry) size changed from 16B to 8B in commit
6724a76cff85 ("riscv: ftrace: Reduce the detour code size to half"), but
the BPF code still uses the old 16B. So it'll work for BPF programs, but
not for regular kernel functions.

An example:

  | ffffffff80fa4150 <bpf_fentry_test1>:
  | ffffffff80fa4150:       0001                    nop
  | ffffffff80fa4152:       0001                    nop
  | ffffffff80fa4154:       0001                    nop
  | ffffffff80fa4156:       0001                    nop
  | ffffffff80fa4158:       1141                    add     sp,sp,-16
  | ffffffff80fa415a:       e422                    sd      s0,8(sp)
  | ffffffff80fa415c:       0800                    add     s0,sp,16
  | ffffffff80fa415e:       6422                    ld      s0,8(sp)
  | ffffffff80fa4160:       2505                    addw    a0,a0,1
  | ffffffff80fa4162:       0141                    add     sp,sp,16
  | ffffffff80fa4164:       8082                    ret

is patched to:

  | ffffffff80fa4150:  f70c0297                     auipc   t0,-150208512
  | ffffffff80fa4154:  eb0282e7                     jalr    t0,t0,-336

The return address to bpf_fentry_test1 is stored in t0 at BPF
trampoline entry. The return address to the *parent* is in ra. The
trampoline has to deal with this.

For BPF_TRAMP_F_SKIP_FRAME/CALL_ORIG, the BPF trampoline will skip too
many bytes, and not correctly handle parent calls.

Furthermore, the BPF trampoline currently patches the nops for BPF
programs differently from ftrace. That should be changed to match what
ftrace does (auipc/jalr t0).

To summarize:
 * Align BPF nop sled with patchable-function-entry: 8B.
 * Adapt BPF trampoline for 8B nop sleds.
 * Adapt BPF trampoline t0 return, ra parent scheme.
 

Cheers,
Björn



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V11 0/5] riscv: Optimize function trace
@ 2023-07-12 18:11   ` Björn Töpel
  0 siblings, 0 replies; 28+ messages in thread
From: Björn Töpel @ 2023-07-12 18:11 UTC (permalink / raw)
  To: Song Shuai, paul.walmsley, palmer, aou, rostedt, mhiramat,
	mark.rutland, guoren, suagrfillet, bjorn, jszhang, conor.dooley,
	Pu Lehui, palmer
  Cc: linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai, bpf

Song Shuai <suagrfillet@gmail.com> writes:

[...]

> Add WITH_DIRECT_CALLS support [3] (patch 3, 4)
> ==============================================

We've had some offlist discussions, so here's some input for a wider
audience! Most importantly, this is for Palmer, so that this series is
not merged until a proper BPF trampoline fix is in place.

Note that what's currently usable from BPF trampoline *works*. It's
when this series is added that it breaks.

TL;DR This series adds DYNAMIC_FTRACE_WITH_DIRECT_CALLS, which enables
fentry/fexit BPF trampoline support. Unfortunately the
fexit/BPF_TRAMP_F_SKIP_FRAME parts of the RV BPF trampoline breaks
with this addition, and need to be addressed *prior* merging this
series. An easy way to reproduce, is just calling any of the kselftest
tests that uses fexit patching.

The issue is around the nop seld, and how a call is done; The nop sled
(patchable-function-entry) size changed from 16B to 8B in commit
6724a76cff85 ("riscv: ftrace: Reduce the detour code size to half"), but
BPF code still uses the old 16B. So it'll work for BPF programs, but not
for regular kernel functions.

An example:

  | ffffffff80fa4150 <bpf_fentry_test1>:
  | ffffffff80fa4150:       0001                    nop
  | ffffffff80fa4152:       0001                    nop
  | ffffffff80fa4154:       0001                    nop
  | ffffffff80fa4156:       0001                    nop
  | ffffffff80fa4158:       1141                    add     sp,sp,-16
  | ffffffff80fa415a:       e422                    sd      s0,8(sp)
  | ffffffff80fa415c:       0800                    add     s0,sp,16
  | ffffffff80fa415e:       6422                    ld      s0,8(sp)
  | ffffffff80fa4160:       2505                    addw    a0,a0,1
  | ffffffff80fa4162:       0141                    add     sp,sp,16
  | ffffffff80fa4164:       8082                    ret

is patched to:

  | ffffffff80fa4150:  f70c0297                     auipc   t0,-150208512
  | ffffffff80fa4154:  eb0282e7                     jalr    t0,t0,-336

The return address to bpf_fentry_test1 is stored in t0 at BPF
trampoline entry. The return address to the *parent* is in ra. The
trampoline has to deal with this.
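
As a sanity check on the patched pair above, the two words can be decoded
by hand. The following Python sketch is not kernel code: the helper names
(sign_extend, decode_auipc_jalr) are made up for illustration, and it
relies only on the standard RISC-V auipc/jalr encodings.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign) - sign


def decode_auipc_jalr(pc, auipc_word, jalr_word):
    """Return (target, link_reg, link_value) for an auipc/jalr pair at pc."""
    assert auipc_word & 0x7f == 0x17, "first word is not an auipc"
    assert jalr_word & 0x7f == 0x67, "second word is not a jalr"
    auipc_rd = (auipc_word >> 7) & 0x1f
    jalr_rd = (jalr_word >> 7) & 0x1f
    jalr_rs1 = (jalr_word >> 15) & 0x1f
    assert auipc_rd == jalr_rs1, "jalr must jump through the auipc result"

    upper = sign_extend(auipc_word & 0xfffff000, 32)  # auipc imm[31:12]
    lower = sign_extend(jalr_word >> 20, 12)          # jalr imm[11:0]
    mask64 = (1 << 64) - 1
    target = (pc + upper + lower) & ~1 & mask64       # jalr clears bit 0
    link_value = (pc + 8) & mask64                    # address after the jalr
    return target, jalr_rd, link_value


# Values taken from the bpf_fentry_test1 example in this message.
ENTRY = 0xffffffff80fa4150
target, link_reg, link_value = decode_auipc_jalr(ENTRY, 0xf70c0297, 0xeb0282e7)
print(f"target = {target:#x}")
print(f"x{link_reg}     = {link_value:#x}")
```

With the two words from the example, this gives a jump target of
0xffffffff78064000 and leaves 0xffffffff80fa4158 in t0 (x5), i.e. the
first instruction after the 8B sled. ra is not written by the pair.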

For BPF_TRAMP_F_SKIP_FRAME/CALL_ORIG, the BPF trampoline will skip too
many bytes, and not correctly handle parent calls.
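
To make "skip too many bytes" concrete, here is a simplified model. It
assumes the trampoline resumes the traced function at entry + 16 (the old
sled size) while the sled is really 8B; that resume rule is an assumption
for illustration, not a quote of the trampoline code. The constant and
function names are hypothetical, and the instruction table is copied from
the bpf_fentry_test1 listing above.

```python
ENTRY = 0xffffffff80fa4150

# (offset from ENTRY, size in bytes, mnemonic), from the listing above.
BODY = [
    (0x08, 2, "add sp,sp,-16"),
    (0x0a, 2, "sd s0,8(sp)"),
    (0x0c, 2, "add s0,sp,16"),
    (0x0e, 2, "ld s0,8(sp)"),
    (0x10, 2, "addw a0,a0,1"),
    (0x12, 2, "add sp,sp,16"),
    (0x14, 2, "ret"),
]

ACTUAL_SLED = 8    # four 2-byte nops, as in the listing
ASSUMED_SLED = 16  # hypothetical stale value on the BPF side


def skipped_instructions(actual_sled, assumed_sled):
    """Instructions between the real start of the body and the resume point."""
    return [name for off, _size, name in BODY
            if actual_sled <= off < assumed_sled]


resume = ENTRY + ASSUMED_SLED
skipped = skipped_instructions(ACTUAL_SLED, ASSUMED_SLED)
print(f"resume at {resume:#x}, skipping {len(skipped)} instructions:")
for name in skipped:
    print("  ", name)
```

Under that assumption the resume point is 0xffffffff80fa4160 (the addw),
and the four instructions at offsets 0x8 to 0xe never run. With matching
sled sizes the skipped list is empty.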

Further, the BPF trampoline currently has a different way of patching
the nops for BPF programs than what ftrace does. That should be changed
to match what ftrace does (auipc/jalr t0).

To summarize:
 * Align BPF nop sled with patchable-function-entry: 8B.
 * Adapt BPF trampoline for 8B nop sleds.
 * Adapt BPF trampoline t0 return, ra parent scheme.
 

Cheers,
Björn



_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V11 0/5] riscv: Optimize function trace
  2023-07-12 18:11   ` Björn Töpel
@ 2023-07-12 18:26     ` Palmer Dabbelt
  -1 siblings, 0 replies; 28+ messages in thread
From: Palmer Dabbelt @ 2023-07-12 18:26 UTC (permalink / raw)
  To: bjorn
  Cc: suagrfillet, Paul Walmsley, aou, rostedt, mhiramat, Mark Rutland,
	guoren, suagrfillet, Bjorn Topel, jszhang, Conor Dooley, pulehui,
	linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai,
	bpf

On Wed, 12 Jul 2023 11:11:08 PDT (-0700), bjorn@kernel.org wrote:
> Song Shuai <suagrfillet@gmail.com> writes:
>
> [...]
>
>> Add WITH_DIRECT_CALLS support [3] (patch 3, 4)
>> ==============================================
>
> We've had some offlist discussions, so here's some input for a wider
> audience! Most importantly, this is for Palmer, so that this series is
> not merged until a proper BPF trampoline fix is in place.
>
> Note that what's currently usable from BPF trampoline *works*. It's
> when this series is added that it breaks.
>
> TL;DR This series adds DYNAMIC_FTRACE_WITH_DIRECT_CALLS, which enables
> fentry/fexit BPF trampoline support. Unfortunately the
> fexit/BPF_TRAMP_F_SKIP_FRAME parts of the RV BPF trampoline break
> with this addition, and need to be addressed *prior* to merging this
> series. An easy way to reproduce is to run any of the kselftest
> tests that use fexit patching.
>
> The issue is around the nop sled, and how a call is done. The nop sled
> (patchable-function-entry) size changed from 16B to 8B in commit
> 6724a76cff85 ("riscv: ftrace: Reduce the detour code size to half"), but
> BPF code still uses the old 16B. So it'll work for BPF programs, but not
> for regular kernel functions.
>
> An example:
>
>   | ffffffff80fa4150 <bpf_fentry_test1>:
>   | ffffffff80fa4150:       0001                    nop
>   | ffffffff80fa4152:       0001                    nop
>   | ffffffff80fa4154:       0001                    nop
>   | ffffffff80fa4156:       0001                    nop
>   | ffffffff80fa4158:       1141                    add     sp,sp,-16
>   | ffffffff80fa415a:       e422                    sd      s0,8(sp)
>   | ffffffff80fa415c:       0800                    add     s0,sp,16
>   | ffffffff80fa415e:       6422                    ld      s0,8(sp)
>   | ffffffff80fa4160:       2505                    addw    a0,a0,1
>   | ffffffff80fa4162:       0141                    add     sp,sp,16
>   | ffffffff80fa4164:       8082                    ret
>
> is patched to:
>
>   | ffffffff80fa4150:  f70c0297                     auipc   t0,-150208512
>   | ffffffff80fa4154:  eb0282e7                     jalr    t0,t0,-336
>
> The return address to bpf_fentry_test1 is stored in t0 at BPF
> trampoline entry. The return address to the *parent* is in ra. The
> trampoline has to deal with this.
>
> For BPF_TRAMP_F_SKIP_FRAME/CALL_ORIG, the BPF trampoline will skip too
> many bytes, and not correctly handle parent calls.
>
> Further, the BPF trampoline currently has a different way of patching
> the nops for BPF programs than what ftrace does. That should be changed
> to match what ftrace does (auipc/jalr t0).
>
> To summarize:
>  * Align BPF nop sled with patchable-function-entry: 8B.
>  * Adapt BPF trampoline for 8B nop sleds.
>  * Adapt BPF trampoline t0 return, ra parent scheme.

Thanks for digging into this one, I agree we need to sort out the BPF 
breakages before we merge this.  Sounds like there's a rabbit hole here, 
but hopefully we can get it sorted out.

I've dropped this from patchwork and such, as we'll need at least 
another spin.

> Cheers,
> Björn

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V11 0/5] riscv: Optimize function trace
  2023-07-12 18:11   ` Björn Töpel
@ 2023-07-15  9:10     ` Pu Lehui
  -1 siblings, 0 replies; 28+ messages in thread
From: Pu Lehui @ 2023-07-15  9:10 UTC (permalink / raw)
  To: Björn Töpel, Song Shuai, paul.walmsley, palmer, aou,
	rostedt, mhiramat, mark.rutland, guoren, bjorn, jszhang,
	conor.dooley, palmer
  Cc: linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai, bpf



On 2023/7/13 2:11, Björn Töpel wrote:
> Song Shuai <suagrfillet@gmail.com> writes:
> 
> [...]
> 
>> Add WITH_DIRECT_CALLS support [3] (patch 3, 4)
>> ==============================================
> 
> We've had some offlist discussions, so here's some input for a wider
> audience! Most importantly, this is for Palmer, so that this series is
> not merged until a proper BPF trampoline fix is in place.
> 
> Note that what's currently usable from BPF trampoline *works*. It's
> when this series is added that it breaks.
> 
> TL;DR This series adds DYNAMIC_FTRACE_WITH_DIRECT_CALLS, which enables
> fentry/fexit BPF trampoline support. Unfortunately the
> fexit/BPF_TRAMP_F_SKIP_FRAME parts of the RV BPF trampoline break
> with this addition, and need to be addressed *prior* to merging this
> series. An easy way to reproduce is to run any of the kselftest
> tests that use fexit patching.
> 
> The issue is around the nop sled, and how a call is done. The nop sled
> (patchable-function-entry) size changed from 16B to 8B in commit
> 6724a76cff85 ("riscv: ftrace: Reduce the detour code size to half"), but
> BPF code still uses the old 16B. So it'll work for BPF programs, but not
> for regular kernel functions.
> 
> An example:
> 
>    | ffffffff80fa4150 <bpf_fentry_test1>:
>    | ffffffff80fa4150:       0001                    nop
>    | ffffffff80fa4152:       0001                    nop
>    | ffffffff80fa4154:       0001                    nop
>    | ffffffff80fa4156:       0001                    nop
>    | ffffffff80fa4158:       1141                    add     sp,sp,-16
>    | ffffffff80fa415a:       e422                    sd      s0,8(sp)
>    | ffffffff80fa415c:       0800                    add     s0,sp,16
>    | ffffffff80fa415e:       6422                    ld      s0,8(sp)
>    | ffffffff80fa4160:       2505                    addw    a0,a0,1
>    | ffffffff80fa4162:       0141                    add     sp,sp,16
>    | ffffffff80fa4164:       8082                    ret
> 
> is patched to:
> 
>    | ffffffff80fa4150:  f70c0297                     auipc   t0,-150208512
>    | ffffffff80fa4154:  eb0282e7                     jalr    t0,t0,-336
> 
> The return address to bpf_fentry_test1 is stored in t0 at BPF
> trampoline entry. The return address to the *parent* is in ra. The
> trampoline has to deal with this.
> 
> For BPF_TRAMP_F_SKIP_FRAME/CALL_ORIG, the BPF trampoline will skip too
> many bytes, and not correctly handle parent calls.
> 
> Further, the BPF trampoline currently has a different way of patching
> the nops for BPF programs than what ftrace does. That should be changed
> to match what ftrace does (auipc/jalr t0).
> 
> To summarize:
>   * Align BPF nop sled with patchable-function-entry: 8B.
>   * Adapt BPF trampoline for 8B nop sleds.
>   * Adapt BPF trampoline t0 return, ra parent scheme.
> 

Thanks Björn, I've made an adaptation as follows, and look forward to
your review.

https://lore.kernel.org/bpf/20230715090137.2141358-1-pulehui@huaweicloud.com/

> 
> Cheers,
> Björn
> 
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V11 0/5] riscv: Optimize function trace
  2023-07-12 18:26     ` Palmer Dabbelt
@ 2023-08-23 20:20       ` Björn Töpel
  -1 siblings, 0 replies; 28+ messages in thread
From: Björn Töpel @ 2023-08-23 20:20 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: suagrfillet, Paul Walmsley, aou, rostedt, mhiramat, Mark Rutland,
	guoren, suagrfillet, Bjorn Topel, jszhang, Conor Dooley, pulehui,
	linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai,
	bpf

Palmer Dabbelt <palmer@rivosinc.com> writes:

> On Wed, 12 Jul 2023 11:11:08 PDT (-0700), bjorn@kernel.org wrote:
>> Song Shuai <suagrfillet@gmail.com> writes:
>>
>> [...]
>>
>>> Add WITH_DIRECT_CALLS support [3] (patch 3, 4)
>>> ==============================================
>>
>> We've had some offlist discussions, so here's some input for a wider
>> audience! Most importantly, this is for Palmer, so that this series is
>> not merged until a proper BPF trampoline fix is in place.
>>
>> Note that what's currently usable from BPF trampoline *works*. It's
>> when this series is added that it breaks.
>>
>> TL;DR This series adds DYNAMIC_FTRACE_WITH_DIRECT_CALLS, which enables
>> fentry/fexit BPF trampoline support. Unfortunately the
>> fexit/BPF_TRAMP_F_SKIP_FRAME parts of the RV BPF trampoline break
>> with this addition, and need to be addressed *prior* to merging this
>> series. An easy way to reproduce is to run any of the kselftest
>> tests that use fexit patching.
>>
>> The issue is around the nop sled, and how a call is done. The nop sled
>> (patchable-function-entry) size changed from 16B to 8B in commit
>> 6724a76cff85 ("riscv: ftrace: Reduce the detour code size to half"), but
>> BPF code still uses the old 16B. So it'll work for BPF programs, but not
>> for regular kernel functions.
>>
>> An example:
>>
>>   | ffffffff80fa4150 <bpf_fentry_test1>:
>>   | ffffffff80fa4150:       0001                    nop
>>   | ffffffff80fa4152:       0001                    nop
>>   | ffffffff80fa4154:       0001                    nop
>>   | ffffffff80fa4156:       0001                    nop
>>   | ffffffff80fa4158:       1141                    add     sp,sp,-16
>>   | ffffffff80fa415a:       e422                    sd      s0,8(sp)
>>   | ffffffff80fa415c:       0800                    add     s0,sp,16
>>   | ffffffff80fa415e:       6422                    ld      s0,8(sp)
>>   | ffffffff80fa4160:       2505                    addw    a0,a0,1
>>   | ffffffff80fa4162:       0141                    add     sp,sp,16
>>   | ffffffff80fa4164:       8082                    ret
>>
>> is patched to:
>>
>>   | ffffffff80fa4150:  f70c0297                     auipc   t0,-150208512
>>   | ffffffff80fa4154:  eb0282e7                     jalr    t0,t0,-336
>>
>> The return address to bpf_fentry_test1 is stored in t0 at BPF
>> trampoline entry. The return address to the *parent* is in ra. The
>> trampoline has to deal with this.
>>
>> For BPF_TRAMP_F_SKIP_FRAME/CALL_ORIG, the BPF trampoline will skip too
>> many bytes, and not correctly handle parent calls.
>>
>> Further, the BPF trampoline currently has a different way of patching
>> the nops for BPF programs than what ftrace does. That should be changed
>> to match what ftrace does (auipc/jalr t0).
>>
>> To summarize:
>>  * Align BPF nop sled with patchable-function-entry: 8B.
>>  * Adapt BPF trampoline for 8B nop sleds.
>>  * Adapt BPF trampoline t0 return, ra parent scheme.
>
> Thanks for digging into this one, I agree we need to sort out the BPF 
> breakages before we merge this.  Sounds like there's a rabbit hole here, 
> but hopefully we can get it sorted out.
>
> I've dropped this from patchwork and such, as we'll need at least 
> another spin.

Palmer,

The needed BPF patch is upstream in the bpf-next tree, and has been for
a couple of weeks.

I think this series is a candidate for RISC-V -next! It would help
RISC-V BPF a lot in terms of completeness.


Björn

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V11 0/5] riscv: Optimize function trace
  2023-08-23 20:20       ` Björn Töpel
@ 2023-08-30 15:28         ` Björn Töpel
  -1 siblings, 0 replies; 28+ messages in thread
From: Björn Töpel @ 2023-08-30 15:28 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: suagrfillet, Paul Walmsley, aou, rostedt, mhiramat, Mark Rutland,
	guoren, suagrfillet, Bjorn Topel, jszhang, Conor Dooley, pulehui,
	linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai,
	bpf

Björn Töpel <bjorn@kernel.org> writes:

> Palmer Dabbelt <palmer@rivosinc.com> writes:
>
>> On Wed, 12 Jul 2023 11:11:08 PDT (-0700), bjorn@kernel.org wrote:
>>> Song Shuai <suagrfillet@gmail.com> writes:
>>>
>>> [...]
>>>
>>>> Add WITH_DIRECT_CALLS support [3] (patch 3, 4)
>>>> ==============================================
>>>
>>> We've had some offlist discussions, so here's some input for a wider
>>> audience! Most importantly, this is for Palmer, so that this series is
>>> not merged until a proper BPF trampoline fix is in place.
>>>
>>> Note that what's currently usable from BPF trampoline *works*. It's
>>> when this series is added that it breaks.
>>>
>>> TL;DR This series adds DYNAMIC_FTRACE_WITH_DIRECT_CALLS, which enables
>>> fentry/fexit BPF trampoline support. Unfortunately the
>>> fexit/BPF_TRAMP_F_SKIP_FRAME parts of the RV BPF trampoline break
>>> with this addition, and need to be addressed *prior* to merging this
>>> series. An easy way to reproduce is to run any of the kselftest
>>> tests that use fexit patching.
>>>
>>> The issue is around the nop sled, and how a call is done. The nop sled
>>> (patchable-function-entry) size changed from 16B to 8B in commit
>>> 6724a76cff85 ("riscv: ftrace: Reduce the detour code size to half"), but
>>> BPF code still uses the old 16B. So it'll work for BPF programs, but not
>>> for regular kernel functions.
>>>
>>> An example:
>>>
>>>   | ffffffff80fa4150 <bpf_fentry_test1>:
>>>   | ffffffff80fa4150:       0001                    nop
>>>   | ffffffff80fa4152:       0001                    nop
>>>   | ffffffff80fa4154:       0001                    nop
>>>   | ffffffff80fa4156:       0001                    nop
>>>   | ffffffff80fa4158:       1141                    add     sp,sp,-16
>>>   | ffffffff80fa415a:       e422                    sd      s0,8(sp)
>>>   | ffffffff80fa415c:       0800                    add     s0,sp,16
>>>   | ffffffff80fa415e:       6422                    ld      s0,8(sp)
>>>   | ffffffff80fa4160:       2505                    addw    a0,a0,1
>>>   | ffffffff80fa4162:       0141                    add     sp,sp,16
>>>   | ffffffff80fa4164:       8082                    ret
>>>
>>> is patched to:
>>>
>>>   | ffffffff80fa4150:  f70c0297                     auipc   t0,-150208512
>>>   | ffffffff80fa4154:  eb0282e7                     jalr    t0,t0,-336
>>>
>>> The return address to bpf_fentry_test1 is stored in t0 at BPF
>>> trampoline entry. The return address to the *parent* is in ra. The
>>> trampoline has to deal with this.
>>>
>>> For BPF_TRAMP_F_SKIP_FRAME/CALL_ORIG, the BPF trampoline will skip too
>>> many bytes, and not correctly handle parent calls.
>>>
>>> Further, the BPF trampoline currently has a different way of patching
>>> the nops for BPF programs than what ftrace does. That should be changed
>>> to match what ftrace does (auipc/jalr t0).
>>>
>>> To summarize:
>>>  * Align BPF nop sled with patchable-function-entry: 8B.
>>>  * Adapt BPF trampoline for 8B nop sleds.
>>>  * Adapt BPF trampoline t0 return, ra parent scheme.
>>
>> Thanks for digging into this one, I agree we need to sort out the BPF 
>> breakages before we merge this.  Sounds like there's a rabbit hole here, 
>> but hopefully we can get it sorted out.
>>
>> I've dropped this from patchwork and such, as we'll need at least 
>> another spin.
>
> Palmer,
>
> The needed BPF patch is upstream in the bpf-next tree, and has been for
> a couple of weeks.
>
> I think this series is a candidate for RISC-V -next! It would help
> RISC-V BPF a lot in terms of completeness.

Palmer,

The needed fix for BPF is now in Linus' tree, commit 25ad10658dc1
("riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace
framework"). IOW, this ftrace series can be merged now.


Björn

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V11 0/5] riscv: Optimize function trace
@ 2023-08-30 15:28         ` Björn Töpel
  0 siblings, 0 replies; 28+ messages in thread
From: Björn Töpel @ 2023-08-30 15:28 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: suagrfillet, Paul Walmsley, aou, rostedt, mhiramat, Mark Rutland,
	guoren, suagrfillet, Bjorn Topel, jszhang, Conor Dooley, pulehui,
	linux-riscv, linux-kernel, linux-trace-kernel, songshuaishuai,
	bpf

Björn Töpel <bjorn@kernel.org> writes:

> Palmer Dabbelt <palmer@rivosinc.com> writes:
>
>> On Wed, 12 Jul 2023 11:11:08 PDT (-0700), bjorn@kernel.org wrote:
>>> Song Shuai <suagrfillet@gmail.com> writes:
>>>
>>> [...]
>>>
>>>> Add WITH_DIRECT_CALLS support [3] (patch 3, 4)
>>>> ==============================================
>>>
>>> We've had some offlist discussions, so here's some input for a wider
>>> audience! Most importantly, this is for Palmer, so that this series is
>>> not merged until a proper BPF trampoline fix is in place.
>>>
>>> Note that what's currently usable from BPF trampoline *works*. It's
>>> when this series is added that it breaks.
>>>
>>> TL;DR This series adds DYNAMIC_FTRACE_WITH_DIRECT_CALLS, which enables
>>> fentry/fexit BPF trampoline support. Unfortunately the
>>> fexit/BPF_TRAMP_F_SKIP_FRAME parts of the RV BPF trampoline breaks
>>> with this addition, and need to be addressed *prior* merging this
>>> series. An easy way to reproduce, is just calling any of the kselftest
>>> tests that uses fexit patching.
>>>
>>> The issue is around the nop seld, and how a call is done; The nop sled
>>> (patchable-function-entry) size changed from 16B to 8B in commit
>>> 6724a76cff85 ("riscv: ftrace: Reduce the detour code size to half"), but
>>> BPF code still uses the old 16B. So it'll work for BPF programs, but not
>>> for regular kernel functions.
>>>
>>> An example:
>>>
>>>   | ffffffff80fa4150 <bpf_fentry_test1>:
>>>   | ffffffff80fa4150:       0001                    nop
>>>   | ffffffff80fa4152:       0001                    nop
>>>   | ffffffff80fa4154:       0001                    nop
>>>   | ffffffff80fa4156:       0001                    nop
>>>   | ffffffff80fa4158:       1141                    add     sp,sp,-16
>>>   | ffffffff80fa415a:       e422                    sd      s0,8(sp)
>>>   | ffffffff80fa415c:       0800                    add     s0,sp,16
>>>   | ffffffff80fa415e:       6422                    ld      s0,8(sp)
>>>   | ffffffff80fa4160:       2505                    addw    a0,a0,1
>>>   | ffffffff80fa4162:       0141                    add     sp,sp,16
>>>   | ffffffff80fa4164:       8082                    ret
>>>
>>> is patched to:
>>>
>>>   | ffffffff80fa4150:  f70c0297                     auipc   t0,-150208512
>>>   | ffffffff80fa4154:  eb0282e7                     jalr    t0,t0,-336
>>>
>>> The return address to bpf_fentry_test1 is stored in t0 at BPF
>>> trampoline entry. Return to the *parent* is in ra. The trampline has
>>> to deal with this.
>>>
>>> For BPF_TRAMP_F_SKIP_FRAME/CALL_ORIG, the BPF trampoline will skip too
>>> many bytes, and not correctly handle parent calls.
>>>
>>> Further; The BPF trampoline currently has a different way of patching
>>> the nops for BPF programs, than what ftrace does. That should be changed
>>> to match what ftrace does (auipc/jalr t0).
>>>
>>> To summarize:
>>>  * Align BPF nop sled with patchable-function-entry: 8B.
>>>  * Adapt BPF trampoline for 8B nop sleds.
>>>  * Adapt BPF trampoline t0 return, ra parent scheme.
>>
>> Thanks for digging into this one, I agree we need to sort out the BPF 
>> breakages before we merge this.  Sounds like there's a rabbit hole here, 
>> but hopefully we can get it sorted out.
>>
>> I've dropped this from patchwork and such, as we'll need at least 
>> another spin.
>
> Palmer,
>
> The needed BPF patch is upstream in the bpf-next tree, and has been for
> a couple of weeks.
>
> I think this series is a candidate for RISC-V -next! It would help
> RISC-V BPF a lot in terms of completeness.

Palmer,

The needed fix for BPF is now in Linus' tree, commit 25ad10658dc1
("riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace
framework"). IOW, this ftrace series can be merged now.


Björn
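
As an illustration of the auipc/jalr t0 patching quoted above, here is a
minimal sketch in C that decodes the two quoted instruction words
(f70c0297, eb0282e7) and re-encodes them from the resulting pc-relative
offset. The function names are hypothetical and are not taken from the
kernel sources. The target address it computes is plain arithmetic on the
quoted words, not something stated in the thread. The two 4-byte
instructions together fill the 8B nop sled (four 2-byte nops) shown in
the disassembly.

```c
#include <assert.h>
#include <stdint.h>

#define REG_T0 5u /* x5, named t0 in the RISC-V ABI */

/* Sign-extended U-type immediate of auipc (already shifted left by 12). */
int64_t auipc_imm(uint32_t insn)
{
	int64_t v = insn & 0xfffff000u;
	return (v & 0x80000000LL) ? v - 0x100000000LL : v;
}

/* Sign-extended 12-bit I-type immediate of jalr. */
int64_t jalr_imm(uint32_t insn)
{
	int64_t v = insn >> 20;
	return (v & 0x800) ? v - 0x1000 : v;
}

/* Address reached by "auipc rd,hi; jalr rd,lo(rd)" when auipc sits at pc. */
uint64_t pair_target(uint64_t pc, uint32_t auipc, uint32_t jalr)
{
	return pc + (uint64_t)(auipc_imm(auipc) + jalr_imm(jalr));
}

/*
 * Encode "auipc rd, hi". Adding 0x800 before masking compensates for
 * jalr sign-extending the low 12 bits of the offset.
 */
uint32_t encode_auipc(uint32_t rd, int32_t offset)
{
	uint32_t hi = ((uint32_t)offset + 0x800u) & 0xfffff000u;
	return hi | (rd << 7) | 0x17u;
}

/* Encode "jalr rd, lo(rs1)" using the low 12 bits of the offset. */
uint32_t encode_jalr(uint32_t rd, uint32_t rs1, int32_t offset)
{
	uint32_t lo = (uint32_t)offset & 0xfffu;
	return (lo << 20) | (rs1 << 15) | (rd << 7) | 0x67u;
}

/* Round-trip the two instruction words quoted in the thread. */
int check_quoted_example(void)
{
	const uint32_t auipc = 0xf70c0297u, jalr = 0xeb0282e7u;
	int32_t offset = (int32_t)(auipc_imm(auipc) + jalr_imm(jalr));

	assert(encode_auipc(REG_T0, offset) == auipc);
	assert(encode_jalr(REG_T0, REG_T0, offset) == jalr);
	return 1;
}
```

Calling check_quoted_example() from a small main() exercises the round
trip. Because both instructions use t0 as the link register, ra is left
untouched and still holds the return address to the parent, which is the
t0/ra split the trampoline has to handle.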



end of thread, other threads:[~2023-08-30 18:47 UTC | newest]

Thread overview: 28+ messages
2023-06-27 11:16 [PATCH V11 0/5] riscv: Optimize function trace Song Shuai
2023-06-27 11:16 ` Song Shuai
2023-06-27 11:16 ` [PATCH V11 1/5] riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY Song Shuai
2023-06-27 11:16   ` Song Shuai
2023-06-27 11:16 ` [PATCH V11 2/5] riscv: ftrace: Add ftrace_graph_func Song Shuai
2023-06-27 11:16   ` Song Shuai
2023-06-27 11:16 ` [PATCH V11 3/5] riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support Song Shuai
2023-06-27 11:16   ` Song Shuai
2023-06-27 11:16 ` [PATCH V11 4/5] samples: ftrace: Add riscv support for SAMPLE_FTRACE_DIRECT[_MULTI] Song Shuai
2023-06-27 11:16   ` Song Shuai
2023-06-27 11:16 ` [PATCH V11 5/5] samples: ftrace: Make the riscv samples support RV32I Song Shuai
2023-06-27 11:16   ` Song Shuai
2023-07-06  9:35 ` [PATCH V11 0/5] riscv: Optimize function trace Song Shuai
2023-07-06  9:35   ` Song Shuai
2023-07-06  9:53   ` Conor Dooley
2023-07-06  9:53     ` Conor Dooley
2023-07-06 10:10     ` Song Shuai
2023-07-06 10:10       ` Song Shuai
2023-07-12 18:11 ` Björn Töpel
2023-07-12 18:11   ` Björn Töpel
2023-07-12 18:26   ` Palmer Dabbelt
2023-07-12 18:26     ` Palmer Dabbelt
2023-08-23 20:20     ` Björn Töpel
2023-08-23 20:20       ` Björn Töpel
2023-08-30 15:28       ` Björn Töpel
2023-08-30 15:28         ` Björn Töpel
2023-07-15  9:10   ` Pu Lehui
2023-07-15  9:10     ` Pu Lehui
