linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] Add ftrace direct call for arm64
@ 2023-02-01 16:34 Florent Revest
  2023-02-01 16:34 ` [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi Florent Revest
                   ` (9 more replies)
  0 siblings, 10 replies; 35+ messages in thread
From: Florent Revest @ 2023-02-01 16:34 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, Florent Revest

This series adds ftrace direct call support to arm64.
This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.

It is meant to apply on top of the arm64 tree which contains Mark Rutland's
series on CALL_OPS [1] under the for-next/ftrace tag.

The first three patches consolidate the two existing ftrace APIs for registering
direct calls. They are split to make the reviewers lives easier but if it'd be a
preferred style, I'd be happy to squash them in the next revision.
Currently, there is both a _ftrace_direct and _ftrace_direct_multi API. Apart
from samples and selftests, there are no users of the _ftrace_direct API left
in-tree so this deletes it and renames the _ftrace_direct_multi API to
_ftrace_direct for simplicity.

The main benefit of this refactoring is that, with the API that's left, an
ftrace_ops backing a direct call will only ever point to one direct call. We can
therefore store the direct called trampoline address in the ops (patch 4) and
look it up from the ftrace trampoline on arm64 (patch 7) in the case when the
destination would be out of reach of a BL instruction at the ftrace callsite.
(in this case, ftrace_caller acts as a lightweight intermediary trampoline)

This series has been tested on both arm64 and x86_64 with:
1- CONFIG_FTRACE_SELFTEST (cf: patch 6)
2- samples/ftrace/*.ko (cf: patch 8)
3- tools/testing/selftests/bpf/test_progs (both -t lsm and -t fentry_fexit)

This follows up on prior art by Xu Kuohai [2].
The implementation here is totally different but the fix for ftrace selftests
(patch 6) is a trivial rebase of a patch originally by Xu so I kept his
authorship and trailers untouched on that patch, I hope that's ok.

1: https://lore.kernel.org/all/20230123134603.1064407-1-mark.rutland@arm.com/
2: https://lore.kernel.org/bpf/20220913162732.163631-1-xukuohai@huaweicloud.com/

Florent Revest (7):
  ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  ftrace: Remove the legacy _ftrace_direct API
  ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
  ftrace: Store direct called addresses in their ops
  ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  arm64: ftrace: Add direct call support
  arm64: ftrace: Add direct called trampoline samples support

Xu Kuohai (1):
  ftrace: Fix dead loop caused by direct call in ftrace selftest

 arch/arm64/Kconfig                          |   4 +
 arch/arm64/include/asm/ftrace.h             |  24 ++
 arch/arm64/kernel/asm-offsets.c             |   6 +
 arch/arm64/kernel/entry-ftrace.S            |  70 +++-
 arch/arm64/kernel/ftrace.c                  |  36 +-
 include/linux/ftrace.h                      |  51 +--
 kernel/bpf/trampoline.c                     |  14 +-
 kernel/trace/Kconfig                        |   2 +-
 kernel/trace/ftrace.c                       | 433 +-------------------
 kernel/trace/trace_selftest.c               |  14 +-
 samples/Kconfig                             |   2 +-
 samples/ftrace/ftrace-direct-modify.c       |  41 +-
 samples/ftrace/ftrace-direct-multi-modify.c |  44 +-
 samples/ftrace/ftrace-direct-multi.c        |  28 +-
 samples/ftrace/ftrace-direct-too.c          |  35 +-
 samples/ftrace/ftrace-direct.c              |  33 +-
 16 files changed, 333 insertions(+), 504 deletions(-)

-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
@ 2023-02-01 16:34 ` Florent Revest
  2023-02-02 15:01   ` Mark Rutland
  2023-02-01 16:34 ` [PATCH 2/8] ftrace: Remove the legacy _ftrace_direct API Florent Revest
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Florent Revest @ 2023-02-01 16:34 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, Florent Revest

The _multi API requires that users keep their own ops and can enforce
that an op is only associated to one direct call.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 kernel/trace/trace_selftest.c         |  9 ++++++---
 samples/ftrace/ftrace-direct-modify.c | 11 +++++++----
 samples/ftrace/ftrace-direct-too.c    | 12 +++++++-----
 samples/ftrace/ftrace-direct.c        | 12 +++++++-----
 4 files changed, 27 insertions(+), 17 deletions(-)

diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index ff0536cea968..9b7f10cbc51d 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -806,6 +806,9 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 	int ret;
 	unsigned long count;
 	char *func_name __maybe_unused;
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	struct ftrace_ops direct = {};
+#endif
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 	if (ftrace_filter_param) {
@@ -870,8 +873,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 	 * Register direct function together with graph tracer
 	 * and make sure we get graph trace.
 	 */
-	ret = register_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
-				     (unsigned long) trace_direct_tramp);
+	ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
+	ret = register_ftrace_direct_multi(&direct, (unsigned long)trace_direct_tramp);
 	if (ret)
 		goto out;
 
@@ -891,7 +894,7 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 
 	unregister_ftrace_graph(&fgraph_ops);
 
-	ret = unregister_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
+	ret = unregister_ftrace_direct_multi(&direct,
 				       (unsigned long) trace_direct_tramp);
 	if (ret)
 		goto out;
diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
index de5a0f67f320..e1e6c286970c 100644
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -96,6 +96,8 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+static struct ftrace_ops direct;
+
 static unsigned long my_tramp = (unsigned long)my_tramp1;
 static unsigned long tramps[2] = {
 	(unsigned long)my_tramp1,
@@ -114,7 +116,7 @@ static int simple_thread(void *arg)
 		if (ret)
 			continue;
 		t ^= 1;
-		ret = modify_ftrace_direct(my_ip, my_tramp, tramps[t]);
+		ret = modify_ftrace_direct_multi(&direct, tramps[t]);
 		if (!ret)
 			my_tramp = tramps[t];
 		WARN_ON_ONCE(ret);
@@ -129,7 +131,8 @@ static int __init ftrace_direct_init(void)
 {
 	int ret;
 
-	ret = register_ftrace_direct(my_ip, my_tramp);
+	ftrace_set_filter_ip(&direct, (unsigned long) my_ip, 0, 0);
+	ret = register_ftrace_direct_multi(&direct, my_tramp);
 	if (!ret)
 		simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
 	return ret;
@@ -138,12 +141,12 @@ static int __init ftrace_direct_init(void)
 static void __exit ftrace_direct_exit(void)
 {
 	kthread_stop(simple_tsk);
-	unregister_ftrace_direct(my_ip, my_tramp);
+	unregister_ftrace_direct_multi(&direct, my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
+MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
 MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index e13fb59a2b47..0e907092e2c0 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -70,21 +70,23 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+static struct ftrace_ops direct;
+
 static int __init ftrace_direct_init(void)
 {
-	return register_ftrace_direct((unsigned long)handle_mm_fault,
-				     (unsigned long)my_tramp);
+	ftrace_set_filter_ip(&direct, (unsigned long) handle_mm_fault, 0, 0);
+
+	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
 }
 
 static void __exit ftrace_direct_exit(void)
 {
-	unregister_ftrace_direct((unsigned long)handle_mm_fault,
-				 (unsigned long)my_tramp);
+	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct()");
+MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct_multi()");
 MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
index 1f769d0db20f..e446c38f6b58 100644
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -63,21 +63,23 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+static struct ftrace_ops direct;
+
 static int __init ftrace_direct_init(void)
 {
-	return register_ftrace_direct((unsigned long)wake_up_process,
-				     (unsigned long)my_tramp);
+	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
+
+	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
 }
 
 static void __exit ftrace_direct_exit(void)
 {
-	unregister_ftrace_direct((unsigned long)wake_up_process,
-				 (unsigned long)my_tramp);
+	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Example use case of using register_ftrace_direct()");
+MODULE_DESCRIPTION("Example use case of using register_ftrace_direct_multi()");
 MODULE_LICENSE("GPL");
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 2/8] ftrace: Remove the legacy _ftrace_direct API
  2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
  2023-02-01 16:34 ` [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi Florent Revest
@ 2023-02-01 16:34 ` Florent Revest
  2023-02-02 15:11   ` Mark Rutland
  2023-02-01 16:34 ` [PATCH 3/8] ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs Florent Revest
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Florent Revest @ 2023-02-01 16:34 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, Florent Revest

This API relies on a single global ops, used for all direct calls
registered with it. However, to implement arm64 direct calls, we need
each ops to point to a single direct call trampoline.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 include/linux/ftrace.h |  32 ----
 kernel/trace/ftrace.c  | 393 -----------------------------------------
 2 files changed, 425 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 366c730beaa3..2d7c85e47c2d 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -397,14 +397,6 @@ struct ftrace_func_entry {
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 extern int ftrace_direct_func_count;
-int register_ftrace_direct(unsigned long ip, unsigned long addr);
-int unregister_ftrace_direct(unsigned long ip, unsigned long addr);
-int modify_ftrace_direct(unsigned long ip, unsigned long old_addr, unsigned long new_addr);
-struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr);
-int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
-				struct dyn_ftrace *rec,
-				unsigned long old_addr,
-				unsigned long new_addr);
 unsigned long ftrace_find_rec_direct(unsigned long ip);
 int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
 int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
@@ -414,30 +406,6 @@ int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr
 #else
 struct ftrace_ops;
 # define ftrace_direct_func_count 0
-static inline int register_ftrace_direct(unsigned long ip, unsigned long addr)
-{
-	return -ENOTSUPP;
-}
-static inline int unregister_ftrace_direct(unsigned long ip, unsigned long addr)
-{
-	return -ENOTSUPP;
-}
-static inline int modify_ftrace_direct(unsigned long ip,
-				       unsigned long old_addr, unsigned long new_addr)
-{
-	return -ENOTSUPP;
-}
-static inline struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr)
-{
-	return NULL;
-}
-static inline int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
-					      struct dyn_ftrace *rec,
-					      unsigned long old_addr,
-					      unsigned long new_addr)
-{
-	return -ENODEV;
-}
 static inline unsigned long ftrace_find_rec_direct(unsigned long ip)
 {
 	return 0;
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index e634b80f49d1..5efe201428fa 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2585,20 +2585,6 @@ static void call_direct_funcs(unsigned long ip, unsigned long pip,
 
 	arch_ftrace_set_direct_caller(fregs, addr);
 }
-
-struct ftrace_ops direct_ops = {
-	.func		= call_direct_funcs,
-	.flags		= FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS
-			  | FTRACE_OPS_FL_PERMANENT,
-	/*
-	 * By declaring the main trampoline as this trampoline
-	 * it will never have one allocated for it. Allocated
-	 * trampolines should not call direct functions.
-	 * The direct_ops should only be called by the builtin
-	 * ftrace_regs_caller trampoline.
-	 */
-	.trampoline	= FTRACE_REGS_ADDR,
-};
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 /**
@@ -5295,387 +5281,8 @@ struct ftrace_direct_func {
 
 static LIST_HEAD(ftrace_direct_funcs);
 
-/**
- * ftrace_find_direct_func - test an address if it is a registered direct caller
- * @addr: The address of a registered direct caller
- *
- * This searches to see if a ftrace direct caller has been registered
- * at a specific address, and if so, it returns a descriptor for it.
- *
- * This can be used by architecture code to see if an address is
- * a direct caller (trampoline) attached to a fentry/mcount location.
- * This is useful for the function_graph tracer, as it may need to
- * do adjustments if it traced a location that also has a direct
- * trampoline attached to it.
- */
-struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr)
-{
-	struct ftrace_direct_func *entry;
-	bool found = false;
-
-	/* May be called by fgraph trampoline (protected by rcu tasks) */
-	list_for_each_entry_rcu(entry, &ftrace_direct_funcs, next) {
-		if (entry->addr == addr) {
-			found = true;
-			break;
-		}
-	}
-	if (found)
-		return entry;
-
-	return NULL;
-}
-
-static struct ftrace_direct_func *ftrace_alloc_direct_func(unsigned long addr)
-{
-	struct ftrace_direct_func *direct;
-
-	direct = kmalloc(sizeof(*direct), GFP_KERNEL);
-	if (!direct)
-		return NULL;
-	direct->addr = addr;
-	direct->count = 0;
-	list_add_rcu(&direct->next, &ftrace_direct_funcs);
-	ftrace_direct_func_count++;
-	return direct;
-}
-
 static int register_ftrace_function_nolock(struct ftrace_ops *ops);
 
-/**
- * register_ftrace_direct - Call a custom trampoline directly
- * @ip: The address of the nop at the beginning of a function
- * @addr: The address of the trampoline to call at @ip
- *
- * This is used to connect a direct call from the nop location (@ip)
- * at the start of ftrace traced functions. The location that it calls
- * (@addr) must be able to handle a direct call, and save the parameters
- * of the function being traced, and restore them (or inject new ones
- * if needed), before returning.
- *
- * Returns:
- *  0 on success
- *  -EBUSY - Another direct function is already attached (there can be only one)
- *  -ENODEV - @ip does not point to a ftrace nop location (or not supported)
- *  -ENOMEM - There was an allocation failure.
- */
-int register_ftrace_direct(unsigned long ip, unsigned long addr)
-{
-	struct ftrace_direct_func *direct;
-	struct ftrace_func_entry *entry;
-	struct ftrace_hash *free_hash = NULL;
-	struct dyn_ftrace *rec;
-	int ret = -ENODEV;
-
-	mutex_lock(&direct_mutex);
-
-	ip = ftrace_location(ip);
-	if (!ip)
-		goto out_unlock;
-
-	/* See if there's a direct function at @ip already */
-	ret = -EBUSY;
-	if (ftrace_find_rec_direct(ip))
-		goto out_unlock;
-
-	ret = -ENODEV;
-	rec = lookup_rec(ip, ip);
-	if (!rec)
-		goto out_unlock;
-
-	/*
-	 * Check if the rec says it has a direct call but we didn't
-	 * find one earlier?
-	 */
-	if (WARN_ON(rec->flags & FTRACE_FL_DIRECT))
-		goto out_unlock;
-
-	/* Make sure the ip points to the exact record */
-	if (ip != rec->ip) {
-		ip = rec->ip;
-		/* Need to check this ip for a direct. */
-		if (ftrace_find_rec_direct(ip))
-			goto out_unlock;
-	}
-
-	ret = -ENOMEM;
-	direct = ftrace_find_direct_func(addr);
-	if (!direct) {
-		direct = ftrace_alloc_direct_func(addr);
-		if (!direct)
-			goto out_unlock;
-	}
-
-	entry = ftrace_add_rec_direct(ip, addr, &free_hash);
-	if (!entry)
-		goto out_unlock;
-
-	ret = ftrace_set_filter_ip(&direct_ops, ip, 0, 0);
-
-	if (!ret && !(direct_ops.flags & FTRACE_OPS_FL_ENABLED)) {
-		ret = register_ftrace_function_nolock(&direct_ops);
-		if (ret)
-			ftrace_set_filter_ip(&direct_ops, ip, 1, 0);
-	}
-
-	if (ret) {
-		remove_hash_entry(direct_functions, entry);
-		kfree(entry);
-		if (!direct->count) {
-			list_del_rcu(&direct->next);
-			synchronize_rcu_tasks();
-			kfree(direct);
-			if (free_hash)
-				free_ftrace_hash(free_hash);
-			free_hash = NULL;
-			ftrace_direct_func_count--;
-		}
-	} else {
-		direct->count++;
-	}
- out_unlock:
-	mutex_unlock(&direct_mutex);
-
-	if (free_hash) {
-		synchronize_rcu_tasks();
-		free_ftrace_hash(free_hash);
-	}
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(register_ftrace_direct);
-
-static struct ftrace_func_entry *find_direct_entry(unsigned long *ip,
-						   struct dyn_ftrace **recp)
-{
-	struct ftrace_func_entry *entry;
-	struct dyn_ftrace *rec;
-
-	rec = lookup_rec(*ip, *ip);
-	if (!rec)
-		return NULL;
-
-	entry = __ftrace_lookup_ip(direct_functions, rec->ip);
-	if (!entry) {
-		WARN_ON(rec->flags & FTRACE_FL_DIRECT);
-		return NULL;
-	}
-
-	WARN_ON(!(rec->flags & FTRACE_FL_DIRECT));
-
-	/* Passed in ip just needs to be on the call site */
-	*ip = rec->ip;
-
-	if (recp)
-		*recp = rec;
-
-	return entry;
-}
-
-int unregister_ftrace_direct(unsigned long ip, unsigned long addr)
-{
-	struct ftrace_direct_func *direct;
-	struct ftrace_func_entry *entry;
-	struct ftrace_hash *hash;
-	int ret = -ENODEV;
-
-	mutex_lock(&direct_mutex);
-
-	ip = ftrace_location(ip);
-	if (!ip)
-		goto out_unlock;
-
-	entry = find_direct_entry(&ip, NULL);
-	if (!entry)
-		goto out_unlock;
-
-	hash = direct_ops.func_hash->filter_hash;
-	if (hash->count == 1)
-		unregister_ftrace_function(&direct_ops);
-
-	ret = ftrace_set_filter_ip(&direct_ops, ip, 1, 0);
-
-	WARN_ON(ret);
-
-	remove_hash_entry(direct_functions, entry);
-
-	direct = ftrace_find_direct_func(addr);
-	if (!WARN_ON(!direct)) {
-		/* This is the good path (see the ! before WARN) */
-		direct->count--;
-		WARN_ON(direct->count < 0);
-		if (!direct->count) {
-			list_del_rcu(&direct->next);
-			synchronize_rcu_tasks();
-			kfree(direct);
-			kfree(entry);
-			ftrace_direct_func_count--;
-		}
-	}
- out_unlock:
-	mutex_unlock(&direct_mutex);
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(unregister_ftrace_direct);
-
-static struct ftrace_ops stub_ops = {
-	.func		= ftrace_stub,
-};
-
-/**
- * ftrace_modify_direct_caller - modify ftrace nop directly
- * @entry: The ftrace hash entry of the direct helper for @rec
- * @rec: The record representing the function site to patch
- * @old_addr: The location that the site at @rec->ip currently calls
- * @new_addr: The location that the site at @rec->ip should call
- *
- * An architecture may overwrite this function to optimize the
- * changing of the direct callback on an ftrace nop location.
- * This is called with the ftrace_lock mutex held, and no other
- * ftrace callbacks are on the associated record (@rec). Thus,
- * it is safe to modify the ftrace record, where it should be
- * currently calling @old_addr directly, to call @new_addr.
- *
- * This is called with direct_mutex locked.
- *
- * Safety checks should be made to make sure that the code at
- * @rec->ip is currently calling @old_addr. And this must
- * also update entry->direct to @new_addr.
- */
-int __weak ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
-				       struct dyn_ftrace *rec,
-				       unsigned long old_addr,
-				       unsigned long new_addr)
-{
-	unsigned long ip = rec->ip;
-	int ret;
-
-	lockdep_assert_held(&direct_mutex);
-
-	/*
-	 * The ftrace_lock was used to determine if the record
-	 * had more than one registered user to it. If it did,
-	 * we needed to prevent that from changing to do the quick
-	 * switch. But if it did not (only a direct caller was attached)
-	 * then this function is called. But this function can deal
-	 * with attached callers to the rec that we care about, and
-	 * since this function uses standard ftrace calls that take
-	 * the ftrace_lock mutex, we need to release it.
-	 */
-	mutex_unlock(&ftrace_lock);
-
-	/*
-	 * By setting a stub function at the same address, we force
-	 * the code to call the iterator and the direct_ops helper.
-	 * This means that @ip does not call the direct call, and
-	 * we can simply modify it.
-	 */
-	ret = ftrace_set_filter_ip(&stub_ops, ip, 0, 0);
-	if (ret)
-		goto out_lock;
-
-	ret = register_ftrace_function_nolock(&stub_ops);
-	if (ret) {
-		ftrace_set_filter_ip(&stub_ops, ip, 1, 0);
-		goto out_lock;
-	}
-
-	entry->direct = new_addr;
-
-	/*
-	 * By removing the stub, we put back the direct call, calling
-	 * the @new_addr.
-	 */
-	unregister_ftrace_function(&stub_ops);
-	ftrace_set_filter_ip(&stub_ops, ip, 1, 0);
-
- out_lock:
-	mutex_lock(&ftrace_lock);
-
-	return ret;
-}
-
-/**
- * modify_ftrace_direct - Modify an existing direct call to call something else
- * @ip: The instruction pointer to modify
- * @old_addr: The address that the current @ip calls directly
- * @new_addr: The address that the @ip should call
- *
- * This modifies a ftrace direct caller at an instruction pointer without
- * having to disable it first. The direct call will switch over to the
- * @new_addr without missing anything.
- *
- * Returns: zero on success. Non zero on error, which includes:
- *  -ENODEV : the @ip given has no direct caller attached
- *  -EINVAL : the @old_addr does not match the current direct caller
- */
-int modify_ftrace_direct(unsigned long ip,
-			 unsigned long old_addr, unsigned long new_addr)
-{
-	struct ftrace_direct_func *direct, *new_direct = NULL;
-	struct ftrace_func_entry *entry;
-	struct dyn_ftrace *rec;
-	int ret = -ENODEV;
-
-	mutex_lock(&direct_mutex);
-
-	mutex_lock(&ftrace_lock);
-
-	ip = ftrace_location(ip);
-	if (!ip)
-		goto out_unlock;
-
-	entry = find_direct_entry(&ip, &rec);
-	if (!entry)
-		goto out_unlock;
-
-	ret = -EINVAL;
-	if (entry->direct != old_addr)
-		goto out_unlock;
-
-	direct = ftrace_find_direct_func(old_addr);
-	if (WARN_ON(!direct))
-		goto out_unlock;
-	if (direct->count > 1) {
-		ret = -ENOMEM;
-		new_direct = ftrace_alloc_direct_func(new_addr);
-		if (!new_direct)
-			goto out_unlock;
-		direct->count--;
-		new_direct->count++;
-	} else {
-		direct->addr = new_addr;
-	}
-
-	/*
-	 * If there's no other ftrace callback on the rec->ip location,
-	 * then it can be changed directly by the architecture.
-	 * If there is another caller, then we just need to change the
-	 * direct caller helper to point to @new_addr.
-	 */
-	if (ftrace_rec_count(rec) == 1) {
-		ret = ftrace_modify_direct_caller(entry, rec, old_addr, new_addr);
-	} else {
-		entry->direct = new_addr;
-		ret = 0;
-	}
-
-	if (unlikely(ret && new_direct)) {
-		direct->count++;
-		list_del_rcu(&new_direct->next);
-		synchronize_rcu_tasks();
-		kfree(new_direct);
-		ftrace_direct_func_count--;
-	}
-
- out_unlock:
-	mutex_unlock(&ftrace_lock);
-	mutex_unlock(&direct_mutex);
-	return ret;
-}
-EXPORT_SYMBOL_GPL(modify_ftrace_direct);
-
 #define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS)
 
 static int check_direct_multi(struct ftrace_ops *ops)
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 3/8] ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
  2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
  2023-02-01 16:34 ` [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi Florent Revest
  2023-02-01 16:34 ` [PATCH 2/8] ftrace: Remove the legacy _ftrace_direct API Florent Revest
@ 2023-02-01 16:34 ` Florent Revest
  2023-02-02 15:17   ` Mark Rutland
  2023-02-01 16:34 ` [PATCH 4/8] ftrace: Store direct called addresses in their ops Florent Revest
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Florent Revest @ 2023-02-01 16:34 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, Florent Revest

Now that the original _ftrace_direct APIs are gone, the "_multi"
suffixes only add confusion.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 include/linux/ftrace.h                      | 16 +++++------
 kernel/bpf/trampoline.c                     | 14 ++++-----
 kernel/trace/ftrace.c                       | 32 ++++++++++-----------
 kernel/trace/trace_selftest.c               |  7 +++--
 samples/Kconfig                             |  2 +-
 samples/ftrace/ftrace-direct-modify.c       |  8 +++---
 samples/ftrace/ftrace-direct-multi-modify.c |  8 +++---
 samples/ftrace/ftrace-direct-multi.c        |  6 ++--
 samples/ftrace/ftrace-direct-too.c          |  6 ++--
 samples/ftrace/ftrace-direct.c              |  6 ++--
 10 files changed, 53 insertions(+), 52 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 2d7c85e47c2d..a7dbd307c3a4 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -398,10 +398,10 @@ struct ftrace_func_entry {
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 extern int ftrace_direct_func_count;
 unsigned long ftrace_find_rec_direct(unsigned long ip);
-int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
-int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
-int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
-int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr);
+int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
+int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
+int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
+int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr);
 
 #else
 struct ftrace_ops;
@@ -410,19 +410,19 @@ static inline unsigned long ftrace_find_rec_direct(unsigned long ip)
 {
 	return 0;
 }
-static inline int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+static inline int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	return -ENODEV;
 }
-static inline int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+static inline int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	return -ENODEV;
 }
-static inline int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+static inline int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	return -ENODEV;
 }
-static inline int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr)
+static inline int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr)
 {
 	return -ENODEV;
 }
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index d0ed7d6f5eec..150b53316df2 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -39,14 +39,14 @@ static int bpf_tramp_ftrace_ops_func(struct ftrace_ops *ops, enum ftrace_ops_cmd
 	int ret = 0;
 
 	if (cmd == FTRACE_OPS_CMD_ENABLE_SHARE_IPMODIFY_SELF) {
-		/* This is called inside register_ftrace_direct_multi(), so
+		/* This is called inside register_ftrace_direct(), so
 		 * tr->mutex is already locked.
 		 */
 		lockdep_assert_held_once(&tr->mutex);
 
 		/* Instead of updating the trampoline here, we propagate
-		 * -EAGAIN to register_ftrace_direct_multi(). Then we can
-		 * retry register_ftrace_direct_multi() after updating the
+		 * -EAGAIN to register_ftrace_direct(). Then we can
+		 * retry register_ftrace_direct() after updating the
 		 * trampoline.
 		 */
 		if ((tr->flags & BPF_TRAMP_F_CALL_ORIG) &&
@@ -198,7 +198,7 @@ static int unregister_fentry(struct bpf_trampoline *tr, void *old_addr)
 	int ret;
 
 	if (tr->func.ftrace_managed)
-		ret = unregister_ftrace_direct_multi(tr->fops, (long)old_addr);
+		ret = unregister_ftrace_direct(tr->fops, (long)old_addr);
 	else
 		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_addr, NULL);
 
@@ -215,9 +215,9 @@ static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_ad
 
 	if (tr->func.ftrace_managed) {
 		if (lock_direct_mutex)
-			ret = modify_ftrace_direct_multi(tr->fops, (long)new_addr);
+			ret = modify_ftrace_direct(tr->fops, (long)new_addr);
 		else
-			ret = modify_ftrace_direct_multi_nolock(tr->fops, (long)new_addr);
+			ret = modify_ftrace_direct_nolock(tr->fops, (long)new_addr);
 	} else {
 		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_addr, new_addr);
 	}
@@ -243,7 +243,7 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
 
 	if (tr->func.ftrace_managed) {
 		ftrace_set_filter_ip(tr->fops, (unsigned long)ip, 0, 1);
-		ret = register_ftrace_direct_multi(tr->fops, (long)new_addr);
+		ret = register_ftrace_direct(tr->fops, (long)new_addr);
 	} else {
 		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr);
 	}
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 5efe201428fa..cb77a0a208c7 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5312,7 +5312,7 @@ static void remove_direct_functions_hash(struct ftrace_hash *hash, unsigned long
 }
 
 /**
- * register_ftrace_direct_multi - Call a custom trampoline directly
+ * register_ftrace_direct - Call a custom trampoline directly
  * for multiple functions registered in @ops
  * @ops: The address of the struct ftrace_ops object
  * @addr: The address of the trampoline to call at @ops functions
@@ -5333,7 +5333,7 @@ static void remove_direct_functions_hash(struct ftrace_hash *hash, unsigned long
  *  -ENODEV  - @ip does not point to a ftrace nop location (or not supported)
  *  -ENOMEM  - There was an allocation failure.
  */
-int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	struct ftrace_hash *hash, *free_hash = NULL;
 	struct ftrace_func_entry *entry, *new;
@@ -5391,11 +5391,11 @@ int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
 	}
 	return err;
 }
-EXPORT_SYMBOL_GPL(register_ftrace_direct_multi);
+EXPORT_SYMBOL_GPL(register_ftrace_direct);
 
 /**
- * unregister_ftrace_direct_multi - Remove calls to custom trampoline
- * previously registered by register_ftrace_direct_multi for @ops object.
+ * unregister_ftrace_direct - Remove calls to custom trampoline
+ * previously registered by register_ftrace_direct for @ops object.
  * @ops: The address of the struct ftrace_ops object
  *
  * This is used to remove a direct calls to @addr from the nop locations
@@ -5406,7 +5406,7 @@ EXPORT_SYMBOL_GPL(register_ftrace_direct_multi);
  *  0 on success
  *  -EINVAL - The @ops object was not properly registered.
  */
-int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	struct ftrace_hash *hash = ops->func_hash->filter_hash;
 	int err;
@@ -5426,10 +5426,10 @@ int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
 	ops->trampoline = 0;
 	return err;
 }
-EXPORT_SYMBOL_GPL(unregister_ftrace_direct_multi);
+EXPORT_SYMBOL_GPL(unregister_ftrace_direct);
 
 static int
-__modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+__modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	struct ftrace_hash *hash;
 	struct ftrace_func_entry *entry, *iter;
@@ -5476,7 +5476,7 @@ __modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
 }
 
 /**
- * modify_ftrace_direct_multi_nolock - Modify an existing direct 'multi' call
+ * modify_ftrace_direct_nolock - Modify an existing direct 'multi' call
  * to call something else
  * @ops: The address of the struct ftrace_ops object
  * @addr: The address of the new trampoline to call at @ops functions
@@ -5493,19 +5493,19 @@ __modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
  * Returns: zero on success. Non zero on error, which includes:
  *  -EINVAL - The @ops object was not properly registered.
  */
-int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr)
+int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr)
 {
 	if (check_direct_multi(ops))
 		return -EINVAL;
 	if (!(ops->flags & FTRACE_OPS_FL_ENABLED))
 		return -EINVAL;
 
-	return __modify_ftrace_direct_multi(ops, addr);
+	return __modify_ftrace_direct(ops, addr);
 }
-EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi_nolock);
+EXPORT_SYMBOL_GPL(modify_ftrace_direct_nolock);
 
 /**
- * modify_ftrace_direct_multi - Modify an existing direct 'multi' call
+ * modify_ftrace_direct - Modify an existing direct 'multi' call
  * to call something else
  * @ops: The address of the struct ftrace_ops object
  * @addr: The address of the new trampoline to call at @ops functions
@@ -5519,7 +5519,7 @@ EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi_nolock);
  * Returns: zero on success. Non zero on error, which includes:
  *  -EINVAL - The @ops object was not properly registered.
  */
-int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	int err;
 
@@ -5529,11 +5529,11 @@ int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
 		return -EINVAL;
 
 	mutex_lock(&direct_mutex);
-	err = __modify_ftrace_direct_multi(ops, addr);
+	err = __modify_ftrace_direct(ops, addr);
 	mutex_unlock(&direct_mutex);
 	return err;
 }
-EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi);
+EXPORT_SYMBOL_GPL(modify_ftrace_direct);
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 /**
diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 9b7f10cbc51d..06218fc9374b 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -874,7 +874,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 	 * and make sure we get graph trace.
 	 */
 	ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
-	ret = register_ftrace_direct_multi(&direct, (unsigned long)trace_direct_tramp);
+	ret = register_ftrace_direct(&direct,
+				     (unsigned long)trace_direct_tramp);
 	if (ret)
 		goto out;
 
@@ -894,8 +895,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 
 	unregister_ftrace_graph(&fgraph_ops);
 
-	ret = unregister_ftrace_direct_multi(&direct,
-				       (unsigned long) trace_direct_tramp);
+	ret = unregister_ftrace_direct(&direct,
+				       (unsigned long)trace_direct_tramp);
 	if (ret)
 		goto out;
 
diff --git a/samples/Kconfig b/samples/Kconfig
index 0d81c00289ee..e85998ca354d 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -38,7 +38,7 @@ config SAMPLE_FTRACE_DIRECT
 	  that hooks to wake_up_process and prints the parameters.
 
 config SAMPLE_FTRACE_DIRECT_MULTI
-	tristate "Build register_ftrace_direct_multi() example"
+	tristate "Build register_ftrace_direct() on multiple ips example"
 	depends on DYNAMIC_FTRACE_WITH_DIRECT_CALLS && m
 	depends on HAVE_SAMPLE_FTRACE_DIRECT_MULTI
 	help
diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
index e1e6c286970c..0f1b8859e1e1 100644
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -116,7 +116,7 @@ static int simple_thread(void *arg)
 		if (ret)
 			continue;
 		t ^= 1;
-		ret = modify_ftrace_direct_multi(&direct, tramps[t]);
+		ret = modify_ftrace_direct(&direct, tramps[t]);
 		if (!ret)
 			my_tramp = tramps[t];
 		WARN_ON_ONCE(ret);
@@ -132,7 +132,7 @@ static int __init ftrace_direct_init(void)
 	int ret;
 
 	ftrace_set_filter_ip(&direct, (unsigned long) my_ip, 0, 0);
-	ret = register_ftrace_direct_multi(&direct, my_tramp);
+	ret = register_ftrace_direct(&direct, my_tramp);
 	if (!ret)
 		simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
 	return ret;
@@ -141,12 +141,12 @@ static int __init ftrace_direct_init(void)
 static void __exit ftrace_direct_exit(void)
 {
 	kthread_stop(simple_tsk);
-	unregister_ftrace_direct_multi(&direct, my_tramp);
+	unregister_ftrace_direct(&direct, my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
+MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
 MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c
index d52370cad0b6..407c56325e65 100644
--- a/samples/ftrace/ftrace-direct-multi-modify.c
+++ b/samples/ftrace/ftrace-direct-multi-modify.c
@@ -123,7 +123,7 @@ static int simple_thread(void *arg)
 		if (ret)
 			continue;
 		t ^= 1;
-		ret = modify_ftrace_direct_multi(&direct, tramps[t]);
+		ret = modify_ftrace_direct(&direct, tramps[t]);
 		if (!ret)
 			my_tramp = tramps[t];
 		WARN_ON_ONCE(ret);
@@ -141,7 +141,7 @@ static int __init ftrace_direct_multi_init(void)
 	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
 	ftrace_set_filter_ip(&direct, (unsigned long) schedule, 0, 0);
 
-	ret = register_ftrace_direct_multi(&direct, my_tramp);
+	ret = register_ftrace_direct(&direct, my_tramp);
 
 	if (!ret)
 		simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
@@ -151,12 +151,12 @@ static int __init ftrace_direct_multi_init(void)
 static void __exit ftrace_direct_multi_exit(void)
 {
 	kthread_stop(simple_tsk);
-	unregister_ftrace_direct_multi(&direct, my_tramp);
+	unregister_ftrace_direct(&direct, my_tramp);
 }
 
 module_init(ftrace_direct_multi_init);
 module_exit(ftrace_direct_multi_exit);
 
 MODULE_AUTHOR("Jiri Olsa");
-MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
+MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
 MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
index ec1088922517..42a94806a446 100644
--- a/samples/ftrace/ftrace-direct-multi.c
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -73,17 +73,17 @@ static int __init ftrace_direct_multi_init(void)
 	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
 	ftrace_set_filter_ip(&direct, (unsigned long) schedule, 0, 0);
 
-	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
+	return register_ftrace_direct(&direct, (unsigned long) my_tramp);
 }
 
 static void __exit ftrace_direct_multi_exit(void)
 {
-	unregister_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
+	unregister_ftrace_direct(&direct, (unsigned long) my_tramp);
 }
 
 module_init(ftrace_direct_multi_init);
 module_exit(ftrace_direct_multi_exit);
 
 MODULE_AUTHOR("Jiri Olsa");
-MODULE_DESCRIPTION("Example use case of using register_ftrace_direct_multi()");
+MODULE_DESCRIPTION("Example use case of using register_ftrace_direct()");
 MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index 0e907092e2c0..7ee5dd3cc61d 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -76,17 +76,17 @@ static int __init ftrace_direct_init(void)
 {
 	ftrace_set_filter_ip(&direct, (unsigned long) handle_mm_fault, 0, 0);
 
-	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
+	return register_ftrace_direct(&direct, (unsigned long) my_tramp);
 }
 
 static void __exit ftrace_direct_exit(void)
 {
-	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
+	unregister_ftrace_direct(&direct, (unsigned long)my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct_multi()");
+MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct()");
 MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
index e446c38f6b58..5ffce87fa83e 100644
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -69,17 +69,17 @@ static int __init ftrace_direct_init(void)
 {
 	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
 
-	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
+	return register_ftrace_direct(&direct, (unsigned long) my_tramp);
 }
 
 static void __exit ftrace_direct_exit(void)
 {
-	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
+	unregister_ftrace_direct(&direct, (unsigned long)my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Example use case of using register_ftrace_direct_multi()");
+MODULE_DESCRIPTION("Example use case of using register_ftrace_direct()");
 MODULE_LICENSE("GPL");
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 4/8] ftrace: Store direct called addresses in their ops
  2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
                   ` (2 preceding siblings ...)
  2023-02-01 16:34 ` [PATCH 3/8] ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs Florent Revest
@ 2023-02-01 16:34 ` Florent Revest
  2023-02-02 15:29   ` Mark Rutland
  2023-02-01 16:34 ` [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS Florent Revest
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Florent Revest @ 2023-02-01 16:34 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, Florent Revest

All direct calls are now registered using the register_ftrace_direct API
so each ops can jump to only one direct-called trampoline.

By storing the direct called trampoline address directly in the ops we
can save one hashmap lookup in the direct call ops and implement arm64
direct calls on top of call ops.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 include/linux/ftrace.h | 3 +++
 kernel/trace/ftrace.c  | 6 ++++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index a7dbd307c3a4..84f717f8959e 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -321,6 +321,9 @@ struct ftrace_ops {
 	unsigned long			trampoline_size;
 	struct list_head		list;
 	ftrace_ops_func_t		ops_func;
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	unsigned long			direct_call;
+#endif
 #endif
 };
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index cb77a0a208c7..b0426de11c45 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2577,9 +2577,8 @@ ftrace_add_rec_direct(unsigned long ip, unsigned long addr,
 static void call_direct_funcs(unsigned long ip, unsigned long pip,
 			      struct ftrace_ops *ops, struct ftrace_regs *fregs)
 {
-	unsigned long addr;
+	unsigned long addr = ops->direct_call;
 
-	addr = ftrace_find_rec_direct(ip);
 	if (!addr)
 		return;
 
@@ -5375,6 +5374,7 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 	ops->func = call_direct_funcs;
 	ops->flags = MULTI_FLAGS;
 	ops->trampoline = FTRACE_REGS_ADDR;
+	ops->direct_call = addr;
 
 	err = register_ftrace_function_nolock(ops);
 
@@ -5445,6 +5445,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 	/* Enable the tmp_ops to have the same functions as the direct ops */
 	ftrace_ops_init(&tmp_ops);
 	tmp_ops.func_hash = ops->func_hash;
+	tmp_ops.direct_call = addr;
 
 	err = register_ftrace_function_nolock(&tmp_ops);
 	if (err)
@@ -5466,6 +5467,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 			entry->direct = addr;
 		}
 	}
+	ops->direct_call = addr;
 
 	mutex_unlock(&ftrace_lock);
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
                   ` (3 preceding siblings ...)
  2023-02-01 16:34 ` [PATCH 4/8] ftrace: Store direct called addresses in their ops Florent Revest
@ 2023-02-01 16:34 ` Florent Revest
  2023-02-02 15:54   ` Mark Rutland
  2023-02-01 16:34 ` [PATCH 6/8] ftrace: Fix dead loop caused by direct call in ftrace selftest Florent Revest
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Florent Revest @ 2023-02-01 16:34 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, Florent Revest

Direct called trampolines can be called in two ways:
- either from the ftrace callsite. In this case, they do not access any
  struct ftrace_regs nor pt_regs
- Or, if a ftrace ops is also attached, from the end of a ftrace
  trampoline. In this case, the call_direct_funcs ops is in charge of
  setting the direct call trampoline's address in a struct ftrace_regs

Since "ftrace: pass fregs to arch_ftrace_set_direct_caller()", the later
case no longer requires a full pt_regs. It only needs a struct
ftrace_regs so DIRECT_CALLS can work with both WITH_ARGS or WITH_REGS.
With architectures like arm64 already abandoning WITH_REGS in favor of
WITH_ARGS, it's important to have DIRECT_CALLS work WITH_ARGS only.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 kernel/trace/Kconfig  | 2 +-
 kernel/trace/ftrace.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 5df427a2321d..4496a7c69810 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -257,7 +257,7 @@ config DYNAMIC_FTRACE_WITH_REGS
 
 config DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 	def_bool y
-	depends on DYNAMIC_FTRACE_WITH_REGS
+	depends on DYNAMIC_FTRACE_WITH_REGS || DYNAMIC_FTRACE_WITH_ARGS
 	depends on HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 
 config DYNAMIC_FTRACE_WITH_CALL_OPS
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index b0426de11c45..73b6f6489ba1 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5282,7 +5282,7 @@ static LIST_HEAD(ftrace_direct_funcs);
 
 static int register_ftrace_function_nolock(struct ftrace_ops *ops);
 
-#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS)
+#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT)
 
 static int check_direct_multi(struct ftrace_ops *ops)
 {
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 6/8] ftrace: Fix dead loop caused by direct call in ftrace selftest
  2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
                   ` (4 preceding siblings ...)
  2023-02-01 16:34 ` [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS Florent Revest
@ 2023-02-01 16:34 ` Florent Revest
  2023-02-02 19:03   ` Mark Rutland
  2023-02-01 16:34 ` [PATCH 7/8] arm64: ftrace: Add direct call support Florent Revest
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Florent Revest @ 2023-02-01 16:34 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, Xu Kuohai, Li Huafei,
	Florent Revest

From: Xu Kuohai <xukuohai@huawei.com>

After direct call is enabled for arm64, ftrace selftest enters a
dead loop:

<trace_selftest_dynamic_test_func>:
00  bti     c
01  mov     x9, x30                            <trace_direct_tramp>:
02  bl      <trace_direct_tramp>    ---------->     ret
                                                     |
                                         lr/x30 is 03, return to 03
                                                     |
03  mov     w0, #0x0   <-----------------------------|
     |                                               |
     |                   dead loop!                  |
     |                                               |
04  ret   ---- lr/x30 is still 03, go back to 03 ----|

The reason is that when the direct caller trace_direct_tramp() returns
to the patched function trace_selftest_dynamic_test_func(), lr is still
the address after the instrumented instruction in the patched function,
so when the patched function exits, it returns to itself!

To fix this issue, we need to restore lr before trace_direct_tramp()
exits, so use a dedicated trace_direct_tramp() for arm64.

Reported-by: Li Huafei <lihuafei1@huawei.com>
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Florent Revest <revest@chromium.org>
---
 arch/arm64/include/asm/ftrace.h  |  7 +++++++
 arch/arm64/kernel/entry-ftrace.S | 10 ++++++++++
 kernel/trace/trace_selftest.c    |  2 ++
 3 files changed, 19 insertions(+)

diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 1c2672bbbf37..cf6d9c42ff36 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -168,6 +168,13 @@ static inline bool arch_syscall_match_sym_name(const char *sym,
 	 */
 	return !strcmp(sym + 8, name);
 }
+
+#if defined(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) && \
+    defined(CONFIG_FTRACE_SELFTEST)
+extern void ftrace_dummy_tramp(void);
+#define trace_direct_tramp ftrace_dummy_tramp
+#endif
+
 #endif /* ifndef __ASSEMBLY__ */
 
 #endif /* __ASM_FTRACE_H */
diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index 350ed81324ac..9869debd22fb 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -118,6 +118,16 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
 	ret	x9
 SYM_CODE_END(ftrace_caller)
 
+#if defined(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) && \
+    defined(CONFIG_FTRACE_SELFTEST)
+SYM_CODE_START(ftrace_dummy_tramp)
+	bti c
+	mov x10, x30
+	mov x30, x9
+	ret x10
+SYM_CODE_END(ftrace_dummy_tramp)
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
 #else /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */
 
 /*
diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 06218fc9374b..f9f5d4e8ab50 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -789,11 +789,13 @@ static struct fgraph_ops fgraph_ops __initdata  = {
 #define CALL_DEPTH_ACCOUNT ""
 #endif
 
+#ifndef trace_direct_tramp
 noinline __noclone static void trace_direct_tramp(void)
 {
 	asm(CALL_DEPTH_ACCOUNT);
 }
 #endif
+#endif
 
 /*
  * Pretty much the same than for the function tracer from which the selftest
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 7/8] arm64: ftrace: Add direct call support
  2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
                   ` (5 preceding siblings ...)
  2023-02-01 16:34 ` [PATCH 6/8] ftrace: Fix dead loop caused by direct call in ftrace selftest Florent Revest
@ 2023-02-01 16:34 ` Florent Revest
  2023-02-03 15:34   ` Mark Rutland
  2023-02-01 16:34 ` [PATCH 8/8] arm64: ftrace: Add direct called trampoline samples support Florent Revest
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Florent Revest @ 2023-02-01 16:34 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, Florent Revest

This builds up on the CALL_OPS work which extends the ftrace patchsite
on arm64 with an ops pointer usable by the ftrace trampoline.

This ops pointer is valid at all time. Indeed, it is either pointing to
ftrace_list_ops or to the single ops which should be called from that
patchsite.

There are a few cases to distinguish:
- If a direct call ops is the only one tracing a function:
  - If the direct called trampoline is within the reach of a BL
    instruction
     -> the ftrace patchsite jumps to the trampoline
  - Else
     -> the ftrace patchsite jumps to the ftrace_caller trampoline which
        reads the ops pointer in the patchsite and jumps to the direct
        call address stored in the ops
- Else
  -> the ftrace patchsite jumps to the ftrace_caller trampoline and its
     ops literal points to ftrace_list_ops so it iterates over all
     registered ftrace ops, including the direct call ops and calls its
     call_direct_funcs handler which stores the direct called
     trampoline's address in the ftrace_regs and the ftrace_caller
     trampoline will return to that address instead of returning to the
     traced function

Signed-off-by: Florent Revest <revest@chromium.org>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/Kconfig               |  2 ++
 arch/arm64/include/asm/ftrace.h  | 17 +++++++++
 arch/arm64/kernel/asm-offsets.c  |  6 ++++
 arch/arm64/kernel/entry-ftrace.S | 60 +++++++++++++++++++++++---------
 arch/arm64/kernel/ftrace.c       | 36 ++++++++++++++++---
 5 files changed, 100 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6f6f37161cf6..7deafd653c42 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -188,6 +188,8 @@ config ARM64
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_ARGS \
 		if $(cc-option,-fpatchable-function-entry=2)
+	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS \
+		if DYNAMIC_FTRACE_WITH_ARGS && DYNAMIC_FTRACE_WITH_CALL_OPS
 	select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
 		if (DYNAMIC_FTRACE_WITH_ARGS && !CFI_CLANG)
 	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index cf6d9c42ff36..9fb95966b6d5 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -80,6 +80,10 @@ struct ftrace_regs {
 
 	unsigned long sp;
 	unsigned long pc;
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	unsigned long custom_tramp;
+#endif
 };
 
 static __always_inline unsigned long
@@ -136,6 +140,19 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
 void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
 		       struct ftrace_ops *op, struct ftrace_regs *fregs);
 #define ftrace_graph_func ftrace_graph_func
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs,
+						 unsigned long addr)
+{
+	/*
+	 * The ftrace trampoline will return to this address instead of the
+	 * instrumented function.
+	 */
+	fregs->custom_tramp = addr;
+}
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
 #endif
 
 #define ftrace_return_address(n) return_address(n)
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index ae345b06e9f7..c67f0373b88c 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -93,6 +93,9 @@ int main(void)
   DEFINE(FREGS_LR,		offsetof(struct ftrace_regs, lr));
   DEFINE(FREGS_SP,		offsetof(struct ftrace_regs, sp));
   DEFINE(FREGS_PC,		offsetof(struct ftrace_regs, pc));
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+  DEFINE(FREGS_CUSTOM_TRAMP,	offsetof(struct ftrace_regs, custom_tramp));
+#endif
   DEFINE(FREGS_SIZE,		sizeof(struct ftrace_regs));
   BLANK();
 #endif
@@ -197,6 +200,9 @@ int main(void)
 #endif
 #ifdef CONFIG_FUNCTION_TRACER
   DEFINE(FTRACE_OPS_FUNC,		offsetof(struct ftrace_ops, func));
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+  DEFINE(FTRACE_OPS_DIRECT_CALL,	offsetof(struct ftrace_ops, direct_call));
+#endif
 #endif
   return 0;
 }
diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index 9869debd22fb..0576f38e6362 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -36,6 +36,30 @@
 SYM_CODE_START(ftrace_caller)
 	bti	c
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
+	/*
+	 * The literal pointer to the ops is at an 8-byte aligned boundary
+	 * which is either 12 or 16 bytes before the BL instruction in the call
+	 * site. See ftrace_call_adjust() for details.
+	 *
+	 * Therefore here the LR points at `literal + 16` or `literal + 20`,
+	 * and we can find the address of the literal in either case by
+	 * aligning to an 8-byte boundary and subtracting 16. We do the
+	 * alignment first as this allows us to fold the subtraction into the
+	 * LDR.
+	 */
+	bic	x11, x30, 0x7
+	ldr	x11, [x11, #-(4 * AARCH64_INSN_SIZE)]		// op
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	/* If the op has a direct call address, return to it */
+	ldr	x12, [x11, #FTRACE_OPS_DIRECT_CALL]		// op->direct_call
+	cbz	x12, 1f
+	ret	x12
+1:
+#endif
+#endif
+
 	/* Save original SP */
 	mov	x10, sp
 
@@ -57,6 +81,11 @@ SYM_CODE_START(ftrace_caller)
 	/* Save the PC after the ftrace callsite */
 	str	x30, [sp, #FREGS_PC]
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	/* Set custom_tramp to zero  */
+	str	xzr, [sp, #FREGS_CUSTOM_TRAMP]
+#endif
+
 	/* Create a frame record for the callsite above the ftrace regs */
 	stp	x29, x9, [sp, #FREGS_SIZE + 16]
 	add	x29, sp, #FREGS_SIZE + 16
@@ -71,20 +100,7 @@ SYM_CODE_START(ftrace_caller)
 	mov	x3, sp					// regs
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
-	/*
-	 * The literal pointer to the ops is at an 8-byte aligned boundary
-	 * which is either 12 or 16 bytes before the BL instruction in the call
-	 * site. See ftrace_call_adjust() for details.
-	 *
-	 * Therefore here the LR points at `literal + 16` or `literal + 20`,
-	 * and we can find the address of the literal in either case by
-	 * aligning to an 8-byte boundary and subtracting 16. We do the
-	 * alignment first as this allows us to fold the subtraction into the
-	 * LDR.
-	 */
-	bic	x2, x30, 0x7
-	ldr	x2, [x2, #-16]				// op
-
+	mov	x2, x11					// op
 	ldr	x4, [x2, #FTRACE_OPS_FUNC]		// op->func
 	blr	x4					// op->func(ip, parent_ip, op, regs)
 
@@ -110,12 +126,24 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
 	/* Restore the callsite's FP, LR, PC */
 	ldr	x29, [sp, #FREGS_FP]
 	ldr	x30, [sp, #FREGS_LR]
-	ldr	x9,  [sp, #FREGS_PC]
+	ldr	x10, [sp, #FREGS_PC]
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	ldr	x11, [sp, #FREGS_CUSTOM_TRAMP]
+	cbz	x11, 1f
+	/* Set x9 to parent ip before jump to custom trampoline */
+	mov	x9,  x30
+	/* Set lr to self ip */
+	ldr	x30, [sp, #FREGS_PC]
+	/* Set x10 (used for return address) to custom trampoline */
+	mov	x10, x11
+1:
+#endif
 
 	/* Restore the callsite's SP */
 	add	sp, sp, #FREGS_SIZE + 32
 
-	ret	x9
+	ret	x10
 SYM_CODE_END(ftrace_caller)
 
 #if defined(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) && \
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 5545fe1a9012..758436727fba 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -206,6 +206,13 @@ static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
 	return NULL;
 }
 
+static bool reachable_by_bl(unsigned long addr, unsigned long pc)
+{
+	long offset = (long)addr - (long)pc;
+
+	return offset >= -SZ_128M && offset < SZ_128M;
+}
+
 /*
  * Find the address the callsite must branch to in order to reach '*addr'.
  *
@@ -220,14 +227,21 @@ static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
 				      unsigned long *addr)
 {
 	unsigned long pc = rec->ip;
-	long offset = (long)*addr - (long)pc;
 	struct plt_entry *plt;
 
+	/*
+	 * If a custom trampoline is unreachable, rely on the ftrace_caller
+	 * trampoline which knows how to indirectly reach that trampoline
+	 * through ops->direct_call.
+	 */
+	if (*addr != FTRACE_ADDR && !reachable_by_bl(*addr, pc))
+		*addr = FTRACE_ADDR;
+
 	/*
 	 * When the target is within range of the 'BL' instruction, use 'addr'
 	 * as-is and branch to that directly.
 	 */
-	if (offset >= -SZ_128M && offset < SZ_128M)
+	if (reachable_by_bl(*addr, pc))
 		return true;
 
 	/*
@@ -330,12 +344,24 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
 		       unsigned long addr)
 {
-	if (WARN_ON_ONCE(old_addr != (unsigned long)ftrace_caller))
+	unsigned long pc = rec->ip;
+	u32 old, new;
+	int ret;
+
+	ret = ftrace_rec_set_ops(rec, arm64_rec_get_ops(rec));
+	if (ret)
+		return ret;
+
+	if (!ftrace_find_callable_addr(rec, NULL, &old_addr))
 		return -EINVAL;
-	if (WARN_ON_ONCE(addr != (unsigned long)ftrace_caller))
+	if (!ftrace_find_callable_addr(rec, NULL, &addr))
 		return -EINVAL;
 
-	return ftrace_rec_update_ops(rec);
+	old = aarch64_insn_gen_branch_imm(pc, old_addr,
+					  AARCH64_INSN_BRANCH_LINK);
+	new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
+
+	return ftrace_modify_code(pc, old, new, true);
 }
 #endif
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 8/8] arm64: ftrace: Add direct called trampoline samples support
  2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
                   ` (6 preceding siblings ...)
  2023-02-01 16:34 ` [PATCH 7/8] arm64: ftrace: Add direct call support Florent Revest
@ 2023-02-01 16:34 ` Florent Revest
  2023-02-02  8:36 ` [PATCH 0/8] Add ftrace direct call for arm64 Xu Kuohai
  2023-02-02 20:06 ` Steven Rostedt
  9 siblings, 0 replies; 35+ messages in thread
From: Florent Revest @ 2023-02-01 16:34 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, Florent Revest

The ftrace samples need per-architecture trampoline implementations
to save and restore argument registers around the calls to
my_direct_func* and to restore polluted registers (eg: x30).

These samples also include <asm/nospec-branch.h> which does not exist on
arm64 and <asm/asm-offsets.h> which, on arm64, is not necessary and
redefines previously defined macros (resulting in warnings) so these
includes are guarded by !CONFIG_ARM64.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 arch/arm64/Kconfig                          |  2 ++
 samples/ftrace/ftrace-direct-modify.c       | 32 ++++++++++++++++++
 samples/ftrace/ftrace-direct-multi-modify.c | 36 +++++++++++++++++++++
 samples/ftrace/ftrace-direct-multi.c        | 22 +++++++++++++
 samples/ftrace/ftrace-direct-too.c          | 25 ++++++++++++++
 samples/ftrace/ftrace-direct.c              | 23 +++++++++++++
 6 files changed, 140 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7deafd653c42..5480ef8eaa2a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -194,6 +194,8 @@ config ARM64
 		if (DYNAMIC_FTRACE_WITH_ARGS && !CFI_CLANG)
 	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
 		if DYNAMIC_FTRACE_WITH_ARGS
+	select HAVE_SAMPLE_FTRACE_DIRECT
+	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_FAST_GUP
 	select HAVE_FTRACE_MCOUNT_RECORD
diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
index 0f1b8859e1e1..944b327e130c 100644
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -2,8 +2,10 @@
 #include <linux/module.h>
 #include <linux/kthread.h>
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
 #include <asm/nospec-branch.h>
+#endif
 
 extern void my_direct_func1(void);
 extern void my_direct_func2(void);
@@ -96,6 +98,36 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection    .text, \"ax\", @progbits\n"
+"	.type		my_tramp1, @function\n"
+"	.globl		my_tramp1\n"
+"   my_tramp1:"
+"	sub	sp, sp, #16\n"
+"	stp	x9, x30, [sp]\n"
+"	bl	my_direct_func1\n"
+"	ldp	x30, x9, [sp]\n"
+"	add	sp, sp, #16\n"
+"	ret	x9\n"
+"	.size		my_tramp1, .-my_tramp1\n"
+
+"	.type		my_tramp2, @function\n"
+"	.globl		my_tramp2\n"
+"   my_tramp2:"
+"	sub	sp, sp, #16\n"
+"	stp	x9, x30, [sp]\n"
+"	bl	my_direct_func2\n"
+"	ldp	x30, x9, [sp]\n"
+"	add	sp, sp, #16\n"
+"	ret	x9\n"
+"	.size		my_tramp2, .-my_tramp2\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static struct ftrace_ops direct;
 
 static unsigned long my_tramp = (unsigned long)my_tramp1;
diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c
index 407c56325e65..ae1f97271d1a 100644
--- a/samples/ftrace/ftrace-direct-multi-modify.c
+++ b/samples/ftrace/ftrace-direct-multi-modify.c
@@ -2,8 +2,10 @@
 #include <linux/module.h>
 #include <linux/kthread.h>
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
 #include <asm/nospec-branch.h>
+#endif
 
 extern void my_direct_func1(unsigned long ip);
 extern void my_direct_func2(unsigned long ip);
@@ -103,6 +105,40 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection    .text, \"ax\", @progbits\n"
+"	.type		my_tramp1, @function\n"
+"	.globl		my_tramp1\n"
+"   my_tramp1:"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	bl	my_direct_func1\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp1, .-my_tramp1\n"
+
+"	.type		my_tramp2, @function\n"
+"	.globl		my_tramp2\n"
+"   my_tramp2:"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	bl	my_direct_func2\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp2, .-my_tramp2\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static unsigned long my_tramp = (unsigned long)my_tramp1;
 static unsigned long tramps[2] = {
 	(unsigned long)my_tramp1,
diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
index 42a94806a446..84713fe28b74 100644
--- a/samples/ftrace/ftrace-direct-multi.c
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -4,8 +4,10 @@
 #include <linux/mm.h> /* for handle_mm_fault() */
 #include <linux/ftrace.h>
 #include <linux/sched/stat.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
 #include <asm/nospec-branch.h>
+#endif
 
 extern void my_direct_func(unsigned long ip);
 
@@ -66,6 +68,26 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection	.text, \"ax\", @progbits\n"
+"	.type		my_tramp, @function\n"
+"	.globl		my_tramp\n"
+"   my_tramp:"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	bl	my_direct_func\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp, .-my_tramp\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
 static struct ftrace_ops direct;
 
 static int __init ftrace_direct_multi_init(void)
diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index 7ee5dd3cc61d..f46ee08caa2b 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -3,8 +3,10 @@
 
 #include <linux/mm.h> /* for handle_mm_fault() */
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
 #include <asm/nospec-branch.h>
+#endif
 
 extern void my_direct_func(struct vm_area_struct *vma,
 			   unsigned long address, unsigned int flags);
@@ -70,6 +72,29 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection	.text, \"ax\", @progbits\n"
+"	.type		my_tramp, @function\n"
+"	.globl		my_tramp\n"
+"   my_tramp:"
+"	sub	sp, sp, #48\n"
+"	stp	x9, x30, [sp]\n"
+"	stp	x0, x1, [sp, #16]\n"
+"	str	x2, [sp, #32]\n"
+"	bl	my_direct_func\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldp	x0, x1, [sp, #16]\n"
+"	ldr	x2, [sp, #32]\n"
+"	add	sp, sp, #48\n"
+"	ret	x9\n"
+"	.size		my_tramp, .-my_tramp\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static struct ftrace_ops direct;
 
 static int __init ftrace_direct_init(void)
diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
index 5ffce87fa83e..e37e8d9e855c 100644
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -3,8 +3,10 @@
 
 #include <linux/sched.h> /* for wake_up_process() */
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
 #include <asm/nospec-branch.h>
+#endif
 
 extern void my_direct_func(struct task_struct *p);
 
@@ -63,6 +65,27 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection	.text, \"ax\", @progbits\n"
+"	.type		my_tramp, @function\n"
+"	.globl		my_tramp\n"
+"   my_tramp:"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	bl	my_direct_func\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp, .-my_tramp\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static struct ftrace_ops direct;
 
 static int __init ftrace_direct_init(void)
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/8] Add ftrace direct call for arm64
  2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
                   ` (7 preceding siblings ...)
  2023-02-01 16:34 ` [PATCH 8/8] arm64: ftrace: Add direct called trampoline samples support Florent Revest
@ 2023-02-02  8:36 ` Xu Kuohai
  2023-02-02 10:50   ` Daniel Borkmann
  2023-02-02 20:06 ` Steven Rostedt
  9 siblings, 1 reply; 35+ messages in thread
From: Xu Kuohai @ 2023-02-02  8:36 UTC (permalink / raw)
  To: Florent Revest, linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai

On 2/2/2023 12:34 AM, Florent Revest wrote:
> This series adds ftrace direct call support to arm64.
> This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.
> 
> It is meant to apply on top of the arm64 tree which contains Mark Rutland's
> series on CALL_OPS [1] under the for-next/ftrace tag.
> > The first three patches consolidate the two existing ftrace APIs for registering
> direct calls. They are split to make the reviewers lives easier but if it'd be a
> preferred style, I'd be happy to squash them in the next revision.
> Currently, there is both a _ftrace_direct and _ftrace_direct_multi API. Apart
> from samples and selftests, there are no users of the _ftrace_direct API left
> in-tree so this deletes it and renames the _ftrace_direct_multi API to
> _ftrace_direct for simplicity.
> 
> The main benefit of this refactoring is that, with the API that's left, an
> ftrace_ops backing a direct call will only ever point to one direct call. We can
> therefore store the direct called trampoline address in the ops (patch 4) and
> look it up from the ftrace trampoline on arm64 (patch 7) in the case when the
> destination would be out of reach of a BL instruction at the ftrace callsite.
> (in this case, ftrace_caller acts as a lightweight intermediary trampoline)
> 
> This series has been tested on both arm64 and x86_64 with:
> 1- CONFIG_FTRACE_SELFTEST (cf: patch 6)
> 2- samples/ftrace/*.ko (cf: patch 8)
> 3- tools/testing/selftests/bpf/test_progs (both -t lsm and -t fentry_fexit)

so it's time to update DENYLIST.aarch64 to unblock tests that failed due to lack of direct call.

> 
> This follows up on prior art by Xu Kuohai [2].
> The implementation here is totally different but the fix for ftrace selftests
> (patch 6) is a trivial rebase of a patch originally by Xu so I kept his
> authorship and trailers untouched on that patch, I hope that's ok. >

that's ok for me, thanks.

> 1: https://lore.kernel.org/all/20230123134603.1064407-1-mark.rutland@arm.com/
> 2: https://lore.kernel.org/bpf/20220913162732.163631-1-xukuohai@huaweicloud.com/
> 
> Florent Revest (7):
>    ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
>    ftrace: Remove the legacy _ftrace_direct API
>    ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
>    ftrace: Store direct called addresses in their ops
>    ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
>    arm64: ftrace: Add direct call support
>    arm64: ftrace: Add direct called trampoline samples support
> 
> Xu Kuohai (1):
>    ftrace: Fix dead loop caused by direct call in ftrace selftest
> 
>   arch/arm64/Kconfig                          |   4 +
>   arch/arm64/include/asm/ftrace.h             |  24 ++
>   arch/arm64/kernel/asm-offsets.c             |   6 +
>   arch/arm64/kernel/entry-ftrace.S            |  70 +++-
>   arch/arm64/kernel/ftrace.c                  |  36 +-
>   include/linux/ftrace.h                      |  51 +--
>   kernel/bpf/trampoline.c                     |  14 +-
>   kernel/trace/Kconfig                        |   2 +-
>   kernel/trace/ftrace.c                       | 433 +-------------------
>   kernel/trace/trace_selftest.c               |  14 +-
>   samples/Kconfig                             |   2 +-
>   samples/ftrace/ftrace-direct-modify.c       |  41 +-
>   samples/ftrace/ftrace-direct-multi-modify.c |  44 +-
>   samples/ftrace/ftrace-direct-multi.c        |  28 +-
>   samples/ftrace/ftrace-direct-too.c          |  35 +-
>   samples/ftrace/ftrace-direct.c              |  33 +-
>   16 files changed, 333 insertions(+), 504 deletions(-)
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/8] Add ftrace direct call for arm64
  2023-02-02  8:36 ` [PATCH 0/8] Add ftrace direct call for arm64 Xu Kuohai
@ 2023-02-02 10:50   ` Daniel Borkmann
  2023-02-02 17:32     ` Florent Revest
  0 siblings, 1 reply; 35+ messages in thread
From: Daniel Borkmann @ 2023-02-02 10:50 UTC (permalink / raw)
  To: Xu Kuohai, Florent Revest, linux-arm-kernel, linux-kernel,
	linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	andrii, kpsingh, jolsa, xukuohai

On 2/2/23 9:36 AM, Xu Kuohai wrote:
> On 2/2/2023 12:34 AM, Florent Revest wrote:
>> This series adds ftrace direct call support to arm64.
>> This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.
>>
>> It is meant to apply on top of the arm64 tree which contains Mark Rutland's
>> series on CALL_OPS [1] under the for-next/ftrace tag.
>> > The first three patches consolidate the two existing ftrace APIs for registering
>> direct calls. They are split to make the reviewers lives easier but if it'd be a
>> preferred style, I'd be happy to squash them in the next revision.
>> Currently, there is both a _ftrace_direct and _ftrace_direct_multi API. Apart
>> from samples and selftests, there are no users of the _ftrace_direct API left
>> in-tree so this deletes it and renames the _ftrace_direct_multi API to
>> _ftrace_direct for simplicity.
>>
>> The main benefit of this refactoring is that, with the API that's left, an
>> ftrace_ops backing a direct call will only ever point to one direct call. We can
>> therefore store the direct called trampoline address in the ops (patch 4) and
>> look it up from the ftrace trampoline on arm64 (patch 7) in the case when the
>> destination would be out of reach of a BL instruction at the ftrace callsite.
>> (in this case, ftrace_caller acts as a lightweight intermediary trampoline)
>>
>> This series has been tested on both arm64 and x86_64 with:
>> 1- CONFIG_FTRACE_SELFTEST (cf: patch 6)
>> 2- samples/ftrace/*.ko (cf: patch 8)
>> 3- tools/testing/selftests/bpf/test_progs (both -t lsm and -t fentry_fexit)

Thanks a ton for working on this!

> so it's time to update DENYLIST.aarch64 to unblock tests that failed due to lack of direct call.

+1, with regards to logistics, if possible it might be nice to eventually gets
this into a feature branch on arm64 tree, then we could pull it too from there
for bpf-next and hash out the BPF CI bits for arm64 in the meantime.

>> This follows up on prior art by Xu Kuohai [2].
>> The implementation here is totally different but the fix for ftrace selftests
>> (patch 6) is a trivial rebase of a patch originally by Xu so I kept his
>> authorship and trailers untouched on that patch, I hope that's ok. >
> 
> that's ok for me, thanks.
> 
>> 1: https://lore.kernel.org/all/20230123134603.1064407-1-mark.rutland@arm.com/
>> 2: https://lore.kernel.org/bpf/20220913162732.163631-1-xukuohai@huaweicloud.com/
>>
>> Florent Revest (7):
>>    ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
>>    ftrace: Remove the legacy _ftrace_direct API
>>    ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
>>    ftrace: Store direct called addresses in their ops
>>    ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
>>    arm64: ftrace: Add direct call support
>>    arm64: ftrace: Add direct called trampoline samples support
>>
>> Xu Kuohai (1):
>>    ftrace: Fix dead loop caused by direct call in ftrace selftest
>>
>>   arch/arm64/Kconfig                          |   4 +
>>   arch/arm64/include/asm/ftrace.h             |  24 ++
>>   arch/arm64/kernel/asm-offsets.c             |   6 +
>>   arch/arm64/kernel/entry-ftrace.S            |  70 +++-
>>   arch/arm64/kernel/ftrace.c                  |  36 +-
>>   include/linux/ftrace.h                      |  51 +--
>>   kernel/bpf/trampoline.c                     |  14 +-
>>   kernel/trace/Kconfig                        |   2 +-
>>   kernel/trace/ftrace.c                       | 433 +-------------------
>>   kernel/trace/trace_selftest.c               |  14 +-
>>   samples/Kconfig                             |   2 +-
>>   samples/ftrace/ftrace-direct-modify.c       |  41 +-
>>   samples/ftrace/ftrace-direct-multi-modify.c |  44 +-
>>   samples/ftrace/ftrace-direct-multi.c        |  28 +-
>>   samples/ftrace/ftrace-direct-too.c          |  35 +-
>>   samples/ftrace/ftrace-direct.c              |  33 +-
>>   16 files changed, 333 insertions(+), 504 deletions(-)
>>
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  2023-02-01 16:34 ` [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi Florent Revest
@ 2023-02-02 15:01   ` Mark Rutland
  2023-02-02 17:37     ` Florent Revest
  0 siblings, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2023-02-02 15:01 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Wed, Feb 01, 2023 at 05:34:13PM +0100, Florent Revest wrote:
> The _multi API requires that users keep their own ops and can enforce
> that an op is only associated to one direct call.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>  kernel/trace/trace_selftest.c         |  9 ++++++---
>  samples/ftrace/ftrace-direct-modify.c | 11 +++++++----
>  samples/ftrace/ftrace-direct-too.c    | 12 +++++++-----
>  samples/ftrace/ftrace-direct.c        | 12 +++++++-----

Looking at samples/ftrace/, as of this patch we have a few samples that are
almost identical, modulo the function being traced, and some different register
shuffling for arguments:

* ftrace-direct.c and ftrace-direct-multi.c
* ftrace-direct-modify.c and ftrace-direct-modify

... perhaps it would be better to just delete the !multi versions ?

Regardless, the changes below look functionally correct to me. I gave it a spin
on x86_64, both with the FTRACE_STARTUP_TEST, and loading the modules
dynamically, and those were all happy:

| # mount -t tracefs none /sys/kernel/tracing/
| # for KO in *.ko; do
| > echo $KO;
| > insmod $KO;
| > cat /sys/kernel/tracing/enabled_functions 
| > rmmod $KO;
| > done
| ftrace-direct-modify.ko
| [   91.890089] 
| [   91.890600] **********************************************************
| [   91.892277] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
| [   91.893912] **                                                      **
| [   91.895569] ** trace_printk() being used. Allocating extra memory.  **
| [   91.897215] **                                                      **
| [   91.898861] ** This means that this is a DEBUG kernel and it is     **
| [   91.900517] ** unsafe for production use.                           **
| [   91.902138] **                                                      **
| [   91.903755] ** If you see this message and you are not debugging    **
| [   91.905378] ** the kernel, report this immediately to your vendor!  **
| [   91.907010] **                                                      **
| [   91.908625] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
| [   91.910221] **********************************************************
| schedule (1) R   D      tramp: ftrace_regs_caller+0x0/0x66 (call_direct_funcs+0x0/0x20)
|         direct-->my_tramp1+0x0/0x1d [ftrace_direct_modify]
| ftrace-direct-multi-modify.ko
| wake_up_process (1) R   D       tramp: ftrace_regs_caller+0x0/0x66 (call_direct_funcs+0x0/0x20)
|         direct-->my_tramp1+0x0/0x23 [ftrace_direct_multi_modify]
| schedule (1) R   D      tramp: ftrace_regs_caller+0x0/0x66 (call_direct_funcs+0x0/0x20)
|         direct-->my_tramp1+0x0/0x23 [ftrace_direct_multi_modify]
| ftrace-direct-multi.ko
| wake_up_process (1) R   D       tramp: ftrace_regs_caller+0x0/0x66 (call_direct_funcs+0x0/0x20)
|         direct-->my_tramp+0x0/0x30 [ftrace_direct_multi]
| schedule (1) R   D      tramp: ftrace_regs_caller+0x0/0x66 (call_direct_funcs+0x0/0x20)
|         direct-->my_tramp+0x0/0x30 [ftrace_direct_multi]
| ftrace-direct-too.ko
| handle_mm_fault (1) R   D       tramp: ftrace_regs_caller+0x0/0x66 (call_direct_funcs+0x0/0x20)
|         direct-->my_tramp+0x0/0x30 [ftrace_direct_too]
| ftrace-direct.ko
| wake_up_process (1) R   D       tramp: ftrace_regs_caller+0x0/0x66 (call_direct_funcs+0x0/0x20)
|         direct-->my_tramp+0x0/0x20 [ftrace_direct]

So FWIW:

Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>

Mark.

>  4 files changed, 27 insertions(+), 17 deletions(-)
> 
> diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
> index ff0536cea968..9b7f10cbc51d 100644
> --- a/kernel/trace/trace_selftest.c
> +++ b/kernel/trace/trace_selftest.c
> @@ -806,6 +806,9 @@ trace_selftest_startup_function_graph(struct tracer *trace,
>  	int ret;
>  	unsigned long count;
>  	char *func_name __maybe_unused;
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +	struct ftrace_ops direct = {};
> +#endif
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE
>  	if (ftrace_filter_param) {
> @@ -870,8 +873,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
>  	 * Register direct function together with graph tracer
>  	 * and make sure we get graph trace.
>  	 */
> -	ret = register_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
> -				     (unsigned long) trace_direct_tramp);
> +	ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
> +	ret = register_ftrace_direct_multi(&direct, (unsigned long)trace_direct_tramp);
>  	if (ret)
>  		goto out;
>  
> @@ -891,7 +894,7 @@ trace_selftest_startup_function_graph(struct tracer *trace,
>  
>  	unregister_ftrace_graph(&fgraph_ops);
>  
> -	ret = unregister_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
> +	ret = unregister_ftrace_direct_multi(&direct,
>  				       (unsigned long) trace_direct_tramp);
>  	if (ret)
>  		goto out;
> diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
> index de5a0f67f320..e1e6c286970c 100644
> --- a/samples/ftrace/ftrace-direct-modify.c
> +++ b/samples/ftrace/ftrace-direct-modify.c
> @@ -96,6 +96,8 @@ asm (
>  
>  #endif /* CONFIG_S390 */
>  
> +static struct ftrace_ops direct;
> +
>  static unsigned long my_tramp = (unsigned long)my_tramp1;
>  static unsigned long tramps[2] = {
>  	(unsigned long)my_tramp1,
> @@ -114,7 +116,7 @@ static int simple_thread(void *arg)
>  		if (ret)
>  			continue;
>  		t ^= 1;
> -		ret = modify_ftrace_direct(my_ip, my_tramp, tramps[t]);
> +		ret = modify_ftrace_direct_multi(&direct, tramps[t]);
>  		if (!ret)
>  			my_tramp = tramps[t];
>  		WARN_ON_ONCE(ret);
> @@ -129,7 +131,8 @@ static int __init ftrace_direct_init(void)
>  {
>  	int ret;
>  
> -	ret = register_ftrace_direct(my_ip, my_tramp);
> +	ftrace_set_filter_ip(&direct, (unsigned long) my_ip, 0, 0);
> +	ret = register_ftrace_direct_multi(&direct, my_tramp);
>  	if (!ret)
>  		simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
>  	return ret;
> @@ -138,12 +141,12 @@ static int __init ftrace_direct_init(void)
>  static void __exit ftrace_direct_exit(void)
>  {
>  	kthread_stop(simple_tsk);
> -	unregister_ftrace_direct(my_ip, my_tramp);
> +	unregister_ftrace_direct_multi(&direct, my_tramp);
>  }
>  
>  module_init(ftrace_direct_init);
>  module_exit(ftrace_direct_exit);
>  
>  MODULE_AUTHOR("Steven Rostedt");
> -MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
> +MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
>  MODULE_LICENSE("GPL");
> diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
> index e13fb59a2b47..0e907092e2c0 100644
> --- a/samples/ftrace/ftrace-direct-too.c
> +++ b/samples/ftrace/ftrace-direct-too.c
> @@ -70,21 +70,23 @@ asm (
>  
>  #endif /* CONFIG_S390 */
>  
> +static struct ftrace_ops direct;
> +
>  static int __init ftrace_direct_init(void)
>  {
> -	return register_ftrace_direct((unsigned long)handle_mm_fault,
> -				     (unsigned long)my_tramp);
> +	ftrace_set_filter_ip(&direct, (unsigned long) handle_mm_fault, 0, 0);
> +
> +	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
>  }
>  
>  static void __exit ftrace_direct_exit(void)
>  {
> -	unregister_ftrace_direct((unsigned long)handle_mm_fault,
> -				 (unsigned long)my_tramp);
> +	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
>  }
>  
>  module_init(ftrace_direct_init);
>  module_exit(ftrace_direct_exit);
>  
>  MODULE_AUTHOR("Steven Rostedt");
> -MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct()");
> +MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct_multi()");
>  MODULE_LICENSE("GPL");
> diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
> index 1f769d0db20f..e446c38f6b58 100644
> --- a/samples/ftrace/ftrace-direct.c
> +++ b/samples/ftrace/ftrace-direct.c
> @@ -63,21 +63,23 @@ asm (
>  
>  #endif /* CONFIG_S390 */
>  
> +static struct ftrace_ops direct;
> +
>  static int __init ftrace_direct_init(void)
>  {
> -	return register_ftrace_direct((unsigned long)wake_up_process,
> -				     (unsigned long)my_tramp);
> +	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
> +
> +	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
>  }
>  
>  static void __exit ftrace_direct_exit(void)
>  {
> -	unregister_ftrace_direct((unsigned long)wake_up_process,
> -				 (unsigned long)my_tramp);
> +	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
>  }
>  
>  module_init(ftrace_direct_init);
>  module_exit(ftrace_direct_exit);
>  
>  MODULE_AUTHOR("Steven Rostedt");
> -MODULE_DESCRIPTION("Example use case of using register_ftrace_direct()");
> +MODULE_DESCRIPTION("Example use case of using register_ftrace_direct_multi()");
>  MODULE_LICENSE("GPL");
> -- 
> 2.39.1.519.gcb327c4b5f-goog
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/8] ftrace: Remove the legacy _ftrace_direct API
  2023-02-01 16:34 ` [PATCH 2/8] ftrace: Remove the legacy _ftrace_direct API Florent Revest
@ 2023-02-02 15:11   ` Mark Rutland
  0 siblings, 0 replies; 35+ messages in thread
From: Mark Rutland @ 2023-02-02 15:11 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Wed, Feb 01, 2023 at 05:34:14PM +0100, Florent Revest wrote:
> This API relies on a single global ops, used for all direct calls
> registered with it. However, to implement arm64 direct calls, we need
> each ops to point to a single direct call trampoline.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>  include/linux/ftrace.h |  32 ----
>  kernel/trace/ftrace.c  | 393 -----------------------------------------
>  2 files changed, 425 deletions(-)

Nice diffstat! ;)

Testing on x86_64: this builds cleanly, the FTRACE_STARTUP_TESTS pass, and each
of the ftrace_direct*.ko test modules loads without issue.

So FWIW:

Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>

Thanks,
Mark.

> 
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 366c730beaa3..2d7c85e47c2d 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -397,14 +397,6 @@ struct ftrace_func_entry {
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>  extern int ftrace_direct_func_count;
> -int register_ftrace_direct(unsigned long ip, unsigned long addr);
> -int unregister_ftrace_direct(unsigned long ip, unsigned long addr);
> -int modify_ftrace_direct(unsigned long ip, unsigned long old_addr, unsigned long new_addr);
> -struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr);
> -int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
> -				struct dyn_ftrace *rec,
> -				unsigned long old_addr,
> -				unsigned long new_addr);
>  unsigned long ftrace_find_rec_direct(unsigned long ip);
>  int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
>  int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
> @@ -414,30 +406,6 @@ int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr
>  #else
>  struct ftrace_ops;
>  # define ftrace_direct_func_count 0
> -static inline int register_ftrace_direct(unsigned long ip, unsigned long addr)
> -{
> -	return -ENOTSUPP;
> -}
> -static inline int unregister_ftrace_direct(unsigned long ip, unsigned long addr)
> -{
> -	return -ENOTSUPP;
> -}
> -static inline int modify_ftrace_direct(unsigned long ip,
> -				       unsigned long old_addr, unsigned long new_addr)
> -{
> -	return -ENOTSUPP;
> -}
> -static inline struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr)
> -{
> -	return NULL;
> -}
> -static inline int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
> -					      struct dyn_ftrace *rec,
> -					      unsigned long old_addr,
> -					      unsigned long new_addr)
> -{
> -	return -ENODEV;
> -}
>  static inline unsigned long ftrace_find_rec_direct(unsigned long ip)
>  {
>  	return 0;
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index e634b80f49d1..5efe201428fa 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2585,20 +2585,6 @@ static void call_direct_funcs(unsigned long ip, unsigned long pip,
>  
>  	arch_ftrace_set_direct_caller(fregs, addr);
>  }
> -
> -struct ftrace_ops direct_ops = {
> -	.func		= call_direct_funcs,
> -	.flags		= FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS
> -			  | FTRACE_OPS_FL_PERMANENT,
> -	/*
> -	 * By declaring the main trampoline as this trampoline
> -	 * it will never have one allocated for it. Allocated
> -	 * trampolines should not call direct functions.
> -	 * The direct_ops should only be called by the builtin
> -	 * ftrace_regs_caller trampoline.
> -	 */
> -	.trampoline	= FTRACE_REGS_ADDR,
> -};
>  #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
>  
>  /**
> @@ -5295,387 +5281,8 @@ struct ftrace_direct_func {
>  
>  static LIST_HEAD(ftrace_direct_funcs);
>  
> -/**
> - * ftrace_find_direct_func - test an address if it is a registered direct caller
> - * @addr: The address of a registered direct caller
> - *
> - * This searches to see if a ftrace direct caller has been registered
> - * at a specific address, and if so, it returns a descriptor for it.
> - *
> - * This can be used by architecture code to see if an address is
> - * a direct caller (trampoline) attached to a fentry/mcount location.
> - * This is useful for the function_graph tracer, as it may need to
> - * do adjustments if it traced a location that also has a direct
> - * trampoline attached to it.
> - */
> -struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr)
> -{
> -	struct ftrace_direct_func *entry;
> -	bool found = false;
> -
> -	/* May be called by fgraph trampoline (protected by rcu tasks) */
> -	list_for_each_entry_rcu(entry, &ftrace_direct_funcs, next) {
> -		if (entry->addr == addr) {
> -			found = true;
> -			break;
> -		}
> -	}
> -	if (found)
> -		return entry;
> -
> -	return NULL;
> -}
> -
> -static struct ftrace_direct_func *ftrace_alloc_direct_func(unsigned long addr)
> -{
> -	struct ftrace_direct_func *direct;
> -
> -	direct = kmalloc(sizeof(*direct), GFP_KERNEL);
> -	if (!direct)
> -		return NULL;
> -	direct->addr = addr;
> -	direct->count = 0;
> -	list_add_rcu(&direct->next, &ftrace_direct_funcs);
> -	ftrace_direct_func_count++;
> -	return direct;
> -}
> -
>  static int register_ftrace_function_nolock(struct ftrace_ops *ops);
>  
> -/**
> - * register_ftrace_direct - Call a custom trampoline directly
> - * @ip: The address of the nop at the beginning of a function
> - * @addr: The address of the trampoline to call at @ip
> - *
> - * This is used to connect a direct call from the nop location (@ip)
> - * at the start of ftrace traced functions. The location that it calls
> - * (@addr) must be able to handle a direct call, and save the parameters
> - * of the function being traced, and restore them (or inject new ones
> - * if needed), before returning.
> - *
> - * Returns:
> - *  0 on success
> - *  -EBUSY - Another direct function is already attached (there can be only one)
> - *  -ENODEV - @ip does not point to a ftrace nop location (or not supported)
> - *  -ENOMEM - There was an allocation failure.
> - */
> -int register_ftrace_direct(unsigned long ip, unsigned long addr)
> -{
> -	struct ftrace_direct_func *direct;
> -	struct ftrace_func_entry *entry;
> -	struct ftrace_hash *free_hash = NULL;
> -	struct dyn_ftrace *rec;
> -	int ret = -ENODEV;
> -
> -	mutex_lock(&direct_mutex);
> -
> -	ip = ftrace_location(ip);
> -	if (!ip)
> -		goto out_unlock;
> -
> -	/* See if there's a direct function at @ip already */
> -	ret = -EBUSY;
> -	if (ftrace_find_rec_direct(ip))
> -		goto out_unlock;
> -
> -	ret = -ENODEV;
> -	rec = lookup_rec(ip, ip);
> -	if (!rec)
> -		goto out_unlock;
> -
> -	/*
> -	 * Check if the rec says it has a direct call but we didn't
> -	 * find one earlier?
> -	 */
> -	if (WARN_ON(rec->flags & FTRACE_FL_DIRECT))
> -		goto out_unlock;
> -
> -	/* Make sure the ip points to the exact record */
> -	if (ip != rec->ip) {
> -		ip = rec->ip;
> -		/* Need to check this ip for a direct. */
> -		if (ftrace_find_rec_direct(ip))
> -			goto out_unlock;
> -	}
> -
> -	ret = -ENOMEM;
> -	direct = ftrace_find_direct_func(addr);
> -	if (!direct) {
> -		direct = ftrace_alloc_direct_func(addr);
> -		if (!direct)
> -			goto out_unlock;
> -	}
> -
> -	entry = ftrace_add_rec_direct(ip, addr, &free_hash);
> -	if (!entry)
> -		goto out_unlock;
> -
> -	ret = ftrace_set_filter_ip(&direct_ops, ip, 0, 0);
> -
> -	if (!ret && !(direct_ops.flags & FTRACE_OPS_FL_ENABLED)) {
> -		ret = register_ftrace_function_nolock(&direct_ops);
> -		if (ret)
> -			ftrace_set_filter_ip(&direct_ops, ip, 1, 0);
> -	}
> -
> -	if (ret) {
> -		remove_hash_entry(direct_functions, entry);
> -		kfree(entry);
> -		if (!direct->count) {
> -			list_del_rcu(&direct->next);
> -			synchronize_rcu_tasks();
> -			kfree(direct);
> -			if (free_hash)
> -				free_ftrace_hash(free_hash);
> -			free_hash = NULL;
> -			ftrace_direct_func_count--;
> -		}
> -	} else {
> -		direct->count++;
> -	}
> - out_unlock:
> -	mutex_unlock(&direct_mutex);
> -
> -	if (free_hash) {
> -		synchronize_rcu_tasks();
> -		free_ftrace_hash(free_hash);
> -	}
> -
> -	return ret;
> -}
> -EXPORT_SYMBOL_GPL(register_ftrace_direct);
> -
> -static struct ftrace_func_entry *find_direct_entry(unsigned long *ip,
> -						   struct dyn_ftrace **recp)
> -{
> -	struct ftrace_func_entry *entry;
> -	struct dyn_ftrace *rec;
> -
> -	rec = lookup_rec(*ip, *ip);
> -	if (!rec)
> -		return NULL;
> -
> -	entry = __ftrace_lookup_ip(direct_functions, rec->ip);
> -	if (!entry) {
> -		WARN_ON(rec->flags & FTRACE_FL_DIRECT);
> -		return NULL;
> -	}
> -
> -	WARN_ON(!(rec->flags & FTRACE_FL_DIRECT));
> -
> -	/* Passed in ip just needs to be on the call site */
> -	*ip = rec->ip;
> -
> -	if (recp)
> -		*recp = rec;
> -
> -	return entry;
> -}
> -
> -int unregister_ftrace_direct(unsigned long ip, unsigned long addr)
> -{
> -	struct ftrace_direct_func *direct;
> -	struct ftrace_func_entry *entry;
> -	struct ftrace_hash *hash;
> -	int ret = -ENODEV;
> -
> -	mutex_lock(&direct_mutex);
> -
> -	ip = ftrace_location(ip);
> -	if (!ip)
> -		goto out_unlock;
> -
> -	entry = find_direct_entry(&ip, NULL);
> -	if (!entry)
> -		goto out_unlock;
> -
> -	hash = direct_ops.func_hash->filter_hash;
> -	if (hash->count == 1)
> -		unregister_ftrace_function(&direct_ops);
> -
> -	ret = ftrace_set_filter_ip(&direct_ops, ip, 1, 0);
> -
> -	WARN_ON(ret);
> -
> -	remove_hash_entry(direct_functions, entry);
> -
> -	direct = ftrace_find_direct_func(addr);
> -	if (!WARN_ON(!direct)) {
> -		/* This is the good path (see the ! before WARN) */
> -		direct->count--;
> -		WARN_ON(direct->count < 0);
> -		if (!direct->count) {
> -			list_del_rcu(&direct->next);
> -			synchronize_rcu_tasks();
> -			kfree(direct);
> -			kfree(entry);
> -			ftrace_direct_func_count--;
> -		}
> -	}
> - out_unlock:
> -	mutex_unlock(&direct_mutex);
> -
> -	return ret;
> -}
> -EXPORT_SYMBOL_GPL(unregister_ftrace_direct);
> -
> -static struct ftrace_ops stub_ops = {
> -	.func		= ftrace_stub,
> -};
> -
> -/**
> - * ftrace_modify_direct_caller - modify ftrace nop directly
> - * @entry: The ftrace hash entry of the direct helper for @rec
> - * @rec: The record representing the function site to patch
> - * @old_addr: The location that the site at @rec->ip currently calls
> - * @new_addr: The location that the site at @rec->ip should call
> - *
> - * An architecture may overwrite this function to optimize the
> - * changing of the direct callback on an ftrace nop location.
> - * This is called with the ftrace_lock mutex held, and no other
> - * ftrace callbacks are on the associated record (@rec). Thus,
> - * it is safe to modify the ftrace record, where it should be
> - * currently calling @old_addr directly, to call @new_addr.
> - *
> - * This is called with direct_mutex locked.
> - *
> - * Safety checks should be made to make sure that the code at
> - * @rec->ip is currently calling @old_addr. And this must
> - * also update entry->direct to @new_addr.
> - */
> -int __weak ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
> -				       struct dyn_ftrace *rec,
> -				       unsigned long old_addr,
> -				       unsigned long new_addr)
> -{
> -	unsigned long ip = rec->ip;
> -	int ret;
> -
> -	lockdep_assert_held(&direct_mutex);
> -
> -	/*
> -	 * The ftrace_lock was used to determine if the record
> -	 * had more than one registered user to it. If it did,
> -	 * we needed to prevent that from changing to do the quick
> -	 * switch. But if it did not (only a direct caller was attached)
> -	 * then this function is called. But this function can deal
> -	 * with attached callers to the rec that we care about, and
> -	 * since this function uses standard ftrace calls that take
> -	 * the ftrace_lock mutex, we need to release it.
> -	 */
> -	mutex_unlock(&ftrace_lock);
> -
> -	/*
> -	 * By setting a stub function at the same address, we force
> -	 * the code to call the iterator and the direct_ops helper.
> -	 * This means that @ip does not call the direct call, and
> -	 * we can simply modify it.
> -	 */
> -	ret = ftrace_set_filter_ip(&stub_ops, ip, 0, 0);
> -	if (ret)
> -		goto out_lock;
> -
> -	ret = register_ftrace_function_nolock(&stub_ops);
> -	if (ret) {
> -		ftrace_set_filter_ip(&stub_ops, ip, 1, 0);
> -		goto out_lock;
> -	}
> -
> -	entry->direct = new_addr;
> -
> -	/*
> -	 * By removing the stub, we put back the direct call, calling
> -	 * the @new_addr.
> -	 */
> -	unregister_ftrace_function(&stub_ops);
> -	ftrace_set_filter_ip(&stub_ops, ip, 1, 0);
> -
> - out_lock:
> -	mutex_lock(&ftrace_lock);
> -
> -	return ret;
> -}
> -
> -/**
> - * modify_ftrace_direct - Modify an existing direct call to call something else
> - * @ip: The instruction pointer to modify
> - * @old_addr: The address that the current @ip calls directly
> - * @new_addr: The address that the @ip should call
> - *
> - * This modifies a ftrace direct caller at an instruction pointer without
> - * having to disable it first. The direct call will switch over to the
> - * @new_addr without missing anything.
> - *
> - * Returns: zero on success. Non zero on error, which includes:
> - *  -ENODEV : the @ip given has no direct caller attached
> - *  -EINVAL : the @old_addr does not match the current direct caller
> - */
> -int modify_ftrace_direct(unsigned long ip,
> -			 unsigned long old_addr, unsigned long new_addr)
> -{
> -	struct ftrace_direct_func *direct, *new_direct = NULL;
> -	struct ftrace_func_entry *entry;
> -	struct dyn_ftrace *rec;
> -	int ret = -ENODEV;
> -
> -	mutex_lock(&direct_mutex);
> -
> -	mutex_lock(&ftrace_lock);
> -
> -	ip = ftrace_location(ip);
> -	if (!ip)
> -		goto out_unlock;
> -
> -	entry = find_direct_entry(&ip, &rec);
> -	if (!entry)
> -		goto out_unlock;
> -
> -	ret = -EINVAL;
> -	if (entry->direct != old_addr)
> -		goto out_unlock;
> -
> -	direct = ftrace_find_direct_func(old_addr);
> -	if (WARN_ON(!direct))
> -		goto out_unlock;
> -	if (direct->count > 1) {
> -		ret = -ENOMEM;
> -		new_direct = ftrace_alloc_direct_func(new_addr);
> -		if (!new_direct)
> -			goto out_unlock;
> -		direct->count--;
> -		new_direct->count++;
> -	} else {
> -		direct->addr = new_addr;
> -	}
> -
> -	/*
> -	 * If there's no other ftrace callback on the rec->ip location,
> -	 * then it can be changed directly by the architecture.
> -	 * If there is another caller, then we just need to change the
> -	 * direct caller helper to point to @new_addr.
> -	 */
> -	if (ftrace_rec_count(rec) == 1) {
> -		ret = ftrace_modify_direct_caller(entry, rec, old_addr, new_addr);
> -	} else {
> -		entry->direct = new_addr;
> -		ret = 0;
> -	}
> -
> -	if (unlikely(ret && new_direct)) {
> -		direct->count++;
> -		list_del_rcu(&new_direct->next);
> -		synchronize_rcu_tasks();
> -		kfree(new_direct);
> -		ftrace_direct_func_count--;
> -	}
> -
> - out_unlock:
> -	mutex_unlock(&ftrace_lock);
> -	mutex_unlock(&direct_mutex);
> -	return ret;
> -}
> -EXPORT_SYMBOL_GPL(modify_ftrace_direct);
> -
>  #define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS)
>  
>  static int check_direct_multi(struct ftrace_ops *ops)
> -- 
> 2.39.1.519.gcb327c4b5f-goog
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/8] ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
  2023-02-01 16:34 ` [PATCH 3/8] ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs Florent Revest
@ 2023-02-02 15:17   ` Mark Rutland
  0 siblings, 0 replies; 35+ messages in thread
From: Mark Rutland @ 2023-02-02 15:17 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Wed, Feb 01, 2023 at 05:34:15PM +0100, Florent Revest wrote:
> Now that the original _ftrace_direct APIs are gone, the "_multi"
> suffixes only add confusion.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>  include/linux/ftrace.h                      | 16 +++++------
>  kernel/bpf/trampoline.c                     | 14 ++++-----
>  kernel/trace/ftrace.c                       | 32 ++++++++++-----------
>  kernel/trace/trace_selftest.c               |  7 +++--
>  samples/Kconfig                             |  2 +-
>  samples/ftrace/ftrace-direct-modify.c       |  8 +++---
>  samples/ftrace/ftrace-direct-multi-modify.c |  8 +++---
>  samples/ftrace/ftrace-direct-multi.c        |  6 ++--
>  samples/ftrace/ftrace-direct-too.c          |  6 ++--
>  samples/ftrace/ftrace-direct.c              |  6 ++--
>  10 files changed, 53 insertions(+), 52 deletions(-)

This looks like a sensible cleanup to me.

As with the last two patches, on x86_64 the kernel builds cleanly, the
FTRACE_STARTUP_TEST is happy, and loading each of the ftrace_direct*.ko samples
works fine. So:

Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>

Thanks,
Mark.

> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 2d7c85e47c2d..a7dbd307c3a4 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -398,10 +398,10 @@ struct ftrace_func_entry {
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>  extern int ftrace_direct_func_count;
>  unsigned long ftrace_find_rec_direct(unsigned long ip);
> -int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
> -int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
> -int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
> -int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr);
> +int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
> +int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
> +int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
> +int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr);
>  
>  #else
>  struct ftrace_ops;
> @@ -410,19 +410,19 @@ static inline unsigned long ftrace_find_rec_direct(unsigned long ip)
>  {
>  	return 0;
>  }
> -static inline int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
> +static inline int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  {
>  	return -ENODEV;
>  }
> -static inline int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
> +static inline int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  {
>  	return -ENODEV;
>  }
> -static inline int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
> +static inline int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  {
>  	return -ENODEV;
>  }
> -static inline int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr)
> +static inline int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr)
>  {
>  	return -ENODEV;
>  }
> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> index d0ed7d6f5eec..150b53316df2 100644
> --- a/kernel/bpf/trampoline.c
> +++ b/kernel/bpf/trampoline.c
> @@ -39,14 +39,14 @@ static int bpf_tramp_ftrace_ops_func(struct ftrace_ops *ops, enum ftrace_ops_cmd
>  	int ret = 0;
>  
>  	if (cmd == FTRACE_OPS_CMD_ENABLE_SHARE_IPMODIFY_SELF) {
> -		/* This is called inside register_ftrace_direct_multi(), so
> +		/* This is called inside register_ftrace_direct(), so
>  		 * tr->mutex is already locked.
>  		 */
>  		lockdep_assert_held_once(&tr->mutex);
>  
>  		/* Instead of updating the trampoline here, we propagate
> -		 * -EAGAIN to register_ftrace_direct_multi(). Then we can
> -		 * retry register_ftrace_direct_multi() after updating the
> +		 * -EAGAIN to register_ftrace_direct(). Then we can
> +		 * retry register_ftrace_direct() after updating the
>  		 * trampoline.
>  		 */
>  		if ((tr->flags & BPF_TRAMP_F_CALL_ORIG) &&
> @@ -198,7 +198,7 @@ static int unregister_fentry(struct bpf_trampoline *tr, void *old_addr)
>  	int ret;
>  
>  	if (tr->func.ftrace_managed)
> -		ret = unregister_ftrace_direct_multi(tr->fops, (long)old_addr);
> +		ret = unregister_ftrace_direct(tr->fops, (long)old_addr);
>  	else
>  		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_addr, NULL);
>  
> @@ -215,9 +215,9 @@ static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_ad
>  
>  	if (tr->func.ftrace_managed) {
>  		if (lock_direct_mutex)
> -			ret = modify_ftrace_direct_multi(tr->fops, (long)new_addr);
> +			ret = modify_ftrace_direct(tr->fops, (long)new_addr);
>  		else
> -			ret = modify_ftrace_direct_multi_nolock(tr->fops, (long)new_addr);
> +			ret = modify_ftrace_direct_nolock(tr->fops, (long)new_addr);
>  	} else {
>  		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_addr, new_addr);
>  	}
> @@ -243,7 +243,7 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
>  
>  	if (tr->func.ftrace_managed) {
>  		ftrace_set_filter_ip(tr->fops, (unsigned long)ip, 0, 1);
> -		ret = register_ftrace_direct_multi(tr->fops, (long)new_addr);
> +		ret = register_ftrace_direct(tr->fops, (long)new_addr);
>  	} else {
>  		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr);
>  	}
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 5efe201428fa..cb77a0a208c7 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -5312,7 +5312,7 @@ static void remove_direct_functions_hash(struct ftrace_hash *hash, unsigned long
>  }
>  
>  /**
> - * register_ftrace_direct_multi - Call a custom trampoline directly
> + * register_ftrace_direct - Call a custom trampoline directly
>   * for multiple functions registered in @ops
>   * @ops: The address of the struct ftrace_ops object
>   * @addr: The address of the trampoline to call at @ops functions
> @@ -5333,7 +5333,7 @@ static void remove_direct_functions_hash(struct ftrace_hash *hash, unsigned long
>   *  -ENODEV  - @ip does not point to a ftrace nop location (or not supported)
>   *  -ENOMEM  - There was an allocation failure.
>   */
> -int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
> +int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  {
>  	struct ftrace_hash *hash, *free_hash = NULL;
>  	struct ftrace_func_entry *entry, *new;
> @@ -5391,11 +5391,11 @@ int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
>  	}
>  	return err;
>  }
> -EXPORT_SYMBOL_GPL(register_ftrace_direct_multi);
> +EXPORT_SYMBOL_GPL(register_ftrace_direct);
>  
>  /**
> - * unregister_ftrace_direct_multi - Remove calls to custom trampoline
> - * previously registered by register_ftrace_direct_multi for @ops object.
> + * unregister_ftrace_direct - Remove calls to custom trampoline
> + * previously registered by register_ftrace_direct for @ops object.
>   * @ops: The address of the struct ftrace_ops object
>   *
>   * This is used to remove a direct calls to @addr from the nop locations
> @@ -5406,7 +5406,7 @@ EXPORT_SYMBOL_GPL(register_ftrace_direct_multi);
>   *  0 on success
>   *  -EINVAL - The @ops object was not properly registered.
>   */
> -int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
> +int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  {
>  	struct ftrace_hash *hash = ops->func_hash->filter_hash;
>  	int err;
> @@ -5426,10 +5426,10 @@ int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
>  	ops->trampoline = 0;
>  	return err;
>  }
> -EXPORT_SYMBOL_GPL(unregister_ftrace_direct_multi);
> +EXPORT_SYMBOL_GPL(unregister_ftrace_direct);
>  
>  static int
> -__modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
> +__modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  {
>  	struct ftrace_hash *hash;
>  	struct ftrace_func_entry *entry, *iter;
> @@ -5476,7 +5476,7 @@ __modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
>  }
>  
>  /**
> - * modify_ftrace_direct_multi_nolock - Modify an existing direct 'multi' call
> + * modify_ftrace_direct_nolock - Modify an existing direct 'multi' call
>   * to call something else
>   * @ops: The address of the struct ftrace_ops object
>   * @addr: The address of the new trampoline to call at @ops functions
> @@ -5493,19 +5493,19 @@ __modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
>   * Returns: zero on success. Non zero on error, which includes:
>   *  -EINVAL - The @ops object was not properly registered.
>   */
> -int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr)
> +int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr)
>  {
>  	if (check_direct_multi(ops))
>  		return -EINVAL;
>  	if (!(ops->flags & FTRACE_OPS_FL_ENABLED))
>  		return -EINVAL;
>  
> -	return __modify_ftrace_direct_multi(ops, addr);
> +	return __modify_ftrace_direct(ops, addr);
>  }
> -EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi_nolock);
> +EXPORT_SYMBOL_GPL(modify_ftrace_direct_nolock);
>  
>  /**
> - * modify_ftrace_direct_multi - Modify an existing direct 'multi' call
> + * modify_ftrace_direct - Modify an existing direct 'multi' call
>   * to call something else
>   * @ops: The address of the struct ftrace_ops object
>   * @addr: The address of the new trampoline to call at @ops functions
> @@ -5519,7 +5519,7 @@ EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi_nolock);
>   * Returns: zero on success. Non zero on error, which includes:
>   *  -EINVAL - The @ops object was not properly registered.
>   */
> -int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
> +int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  {
>  	int err;
>  
> @@ -5529,11 +5529,11 @@ int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
>  		return -EINVAL;
>  
>  	mutex_lock(&direct_mutex);
> -	err = __modify_ftrace_direct_multi(ops, addr);
> +	err = __modify_ftrace_direct(ops, addr);
>  	mutex_unlock(&direct_mutex);
>  	return err;
>  }
> -EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi);
> +EXPORT_SYMBOL_GPL(modify_ftrace_direct);
>  #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
>  
>  /**
> diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
> index 9b7f10cbc51d..06218fc9374b 100644
> --- a/kernel/trace/trace_selftest.c
> +++ b/kernel/trace/trace_selftest.c
> @@ -874,7 +874,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
>  	 * and make sure we get graph trace.
>  	 */
>  	ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
> -	ret = register_ftrace_direct_multi(&direct, (unsigned long)trace_direct_tramp);
> +	ret = register_ftrace_direct(&direct,
> +				     (unsigned long)trace_direct_tramp);
>  	if (ret)
>  		goto out;
>  
> @@ -894,8 +895,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
>  
>  	unregister_ftrace_graph(&fgraph_ops);
>  
> -	ret = unregister_ftrace_direct_multi(&direct,
> -				       (unsigned long) trace_direct_tramp);
> +	ret = unregister_ftrace_direct(&direct,
> +				       (unsigned long)trace_direct_tramp);
>  	if (ret)
>  		goto out;
>  
> diff --git a/samples/Kconfig b/samples/Kconfig
> index 0d81c00289ee..e85998ca354d 100644
> --- a/samples/Kconfig
> +++ b/samples/Kconfig
> @@ -38,7 +38,7 @@ config SAMPLE_FTRACE_DIRECT
>  	  that hooks to wake_up_process and prints the parameters.
>  
>  config SAMPLE_FTRACE_DIRECT_MULTI
> -	tristate "Build register_ftrace_direct_multi() example"
> +	tristate "Build register_ftrace_direct() on multiple ips example"
>  	depends on DYNAMIC_FTRACE_WITH_DIRECT_CALLS && m
>  	depends on HAVE_SAMPLE_FTRACE_DIRECT_MULTI
>  	help
> diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
> index e1e6c286970c..0f1b8859e1e1 100644
> --- a/samples/ftrace/ftrace-direct-modify.c
> +++ b/samples/ftrace/ftrace-direct-modify.c
> @@ -116,7 +116,7 @@ static int simple_thread(void *arg)
>  		if (ret)
>  			continue;
>  		t ^= 1;
> -		ret = modify_ftrace_direct_multi(&direct, tramps[t]);
> +		ret = modify_ftrace_direct(&direct, tramps[t]);
>  		if (!ret)
>  			my_tramp = tramps[t];
>  		WARN_ON_ONCE(ret);
> @@ -132,7 +132,7 @@ static int __init ftrace_direct_init(void)
>  	int ret;
>  
>  	ftrace_set_filter_ip(&direct, (unsigned long) my_ip, 0, 0);
> -	ret = register_ftrace_direct_multi(&direct, my_tramp);
> +	ret = register_ftrace_direct(&direct, my_tramp);
>  	if (!ret)
>  		simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
>  	return ret;
> @@ -141,12 +141,12 @@ static int __init ftrace_direct_init(void)
>  static void __exit ftrace_direct_exit(void)
>  {
>  	kthread_stop(simple_tsk);
> -	unregister_ftrace_direct_multi(&direct, my_tramp);
> +	unregister_ftrace_direct(&direct, my_tramp);
>  }
>  
>  module_init(ftrace_direct_init);
>  module_exit(ftrace_direct_exit);
>  
>  MODULE_AUTHOR("Steven Rostedt");
> -MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
> +MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
>  MODULE_LICENSE("GPL");
> diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c
> index d52370cad0b6..407c56325e65 100644
> --- a/samples/ftrace/ftrace-direct-multi-modify.c
> +++ b/samples/ftrace/ftrace-direct-multi-modify.c
> @@ -123,7 +123,7 @@ static int simple_thread(void *arg)
>  		if (ret)
>  			continue;
>  		t ^= 1;
> -		ret = modify_ftrace_direct_multi(&direct, tramps[t]);
> +		ret = modify_ftrace_direct(&direct, tramps[t]);
>  		if (!ret)
>  			my_tramp = tramps[t];
>  		WARN_ON_ONCE(ret);
> @@ -141,7 +141,7 @@ static int __init ftrace_direct_multi_init(void)
>  	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
>  	ftrace_set_filter_ip(&direct, (unsigned long) schedule, 0, 0);
>  
> -	ret = register_ftrace_direct_multi(&direct, my_tramp);
> +	ret = register_ftrace_direct(&direct, my_tramp);
>  
>  	if (!ret)
>  		simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
> @@ -151,12 +151,12 @@ static int __init ftrace_direct_multi_init(void)
>  static void __exit ftrace_direct_multi_exit(void)
>  {
>  	kthread_stop(simple_tsk);
> -	unregister_ftrace_direct_multi(&direct, my_tramp);
> +	unregister_ftrace_direct(&direct, my_tramp);
>  }
>  
>  module_init(ftrace_direct_multi_init);
>  module_exit(ftrace_direct_multi_exit);
>  
>  MODULE_AUTHOR("Jiri Olsa");
> -MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
> +MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
>  MODULE_LICENSE("GPL");
> diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
> index ec1088922517..42a94806a446 100644
> --- a/samples/ftrace/ftrace-direct-multi.c
> +++ b/samples/ftrace/ftrace-direct-multi.c
> @@ -73,17 +73,17 @@ static int __init ftrace_direct_multi_init(void)
>  	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
>  	ftrace_set_filter_ip(&direct, (unsigned long) schedule, 0, 0);
>  
> -	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
> +	return register_ftrace_direct(&direct, (unsigned long) my_tramp);
>  }
>  
>  static void __exit ftrace_direct_multi_exit(void)
>  {
> -	unregister_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
> +	unregister_ftrace_direct(&direct, (unsigned long) my_tramp);
>  }
>  
>  module_init(ftrace_direct_multi_init);
>  module_exit(ftrace_direct_multi_exit);
>  
>  MODULE_AUTHOR("Jiri Olsa");
> -MODULE_DESCRIPTION("Example use case of using register_ftrace_direct_multi()");
> +MODULE_DESCRIPTION("Example use case of using register_ftrace_direct()");
>  MODULE_LICENSE("GPL");
> diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
> index 0e907092e2c0..7ee5dd3cc61d 100644
> --- a/samples/ftrace/ftrace-direct-too.c
> +++ b/samples/ftrace/ftrace-direct-too.c
> @@ -76,17 +76,17 @@ static int __init ftrace_direct_init(void)
>  {
>  	ftrace_set_filter_ip(&direct, (unsigned long) handle_mm_fault, 0, 0);
>  
> -	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
> +	return register_ftrace_direct(&direct, (unsigned long) my_tramp);
>  }
>  
>  static void __exit ftrace_direct_exit(void)
>  {
> -	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
> +	unregister_ftrace_direct(&direct, (unsigned long)my_tramp);
>  }
>  
>  module_init(ftrace_direct_init);
>  module_exit(ftrace_direct_exit);
>  
>  MODULE_AUTHOR("Steven Rostedt");
> -MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct_multi()");
> +MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct()");
>  MODULE_LICENSE("GPL");
> diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
> index e446c38f6b58..5ffce87fa83e 100644
> --- a/samples/ftrace/ftrace-direct.c
> +++ b/samples/ftrace/ftrace-direct.c
> @@ -69,17 +69,17 @@ static int __init ftrace_direct_init(void)
>  {
>  	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
>  
> -	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
> +	return register_ftrace_direct(&direct, (unsigned long) my_tramp);
>  }
>  
>  static void __exit ftrace_direct_exit(void)
>  {
> -	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
> +	unregister_ftrace_direct(&direct, (unsigned long)my_tramp);
>  }
>  
>  module_init(ftrace_direct_init);
>  module_exit(ftrace_direct_exit);
>  
>  MODULE_AUTHOR("Steven Rostedt");
> -MODULE_DESCRIPTION("Example use case of using register_ftrace_direct_multi()");
> +MODULE_DESCRIPTION("Example use case of using register_ftrace_direct()");
>  MODULE_LICENSE("GPL");
> -- 
> 2.39.1.519.gcb327c4b5f-goog
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/8] ftrace: Store direct called addresses in their ops
  2023-02-01 16:34 ` [PATCH 4/8] ftrace: Store direct called addresses in their ops Florent Revest
@ 2023-02-02 15:29   ` Mark Rutland
  2023-02-02 17:41     ` Florent Revest
  0 siblings, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2023-02-02 15:29 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Wed, Feb 01, 2023 at 05:34:16PM +0100, Florent Revest wrote:
> All direct calls are now registered using the register_ftrace_direct API
> so each ops can jump to only one direct-called trampoline.
> 
> By storing the direct called trampoline address directly in the ops we
> can save one hashmap lookup in the direct call ops and implement arm64
> direct calls on top of call ops.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>  include/linux/ftrace.h | 3 +++
>  kernel/trace/ftrace.c  | 6 ++++--
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index a7dbd307c3a4..84f717f8959e 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -321,6 +321,9 @@ struct ftrace_ops {
>  	unsigned long			trampoline_size;
>  	struct list_head		list;
>  	ftrace_ops_func_t		ops_func;
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +	unsigned long			direct_call;
> +#endif
>  #endif
>  };
>  
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index cb77a0a208c7..b0426de11c45 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2577,9 +2577,8 @@ ftrace_add_rec_direct(unsigned long ip, unsigned long addr,
>  static void call_direct_funcs(unsigned long ip, unsigned long pip,
>  			      struct ftrace_ops *ops, struct ftrace_regs *fregs)
>  {
> -	unsigned long addr;
> +	unsigned long addr = ops->direct_call;
>  
> -	addr = ftrace_find_rec_direct(ip);
>  	if (!addr)
>  		return;
>  
> @@ -5375,6 +5374,7 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  	ops->func = call_direct_funcs;
>  	ops->flags = MULTI_FLAGS;
>  	ops->trampoline = FTRACE_REGS_ADDR;
> +	ops->direct_call = addr;
>  
>  	err = register_ftrace_function_nolock(ops);
>  
> @@ -5445,6 +5445,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  	/* Enable the tmp_ops to have the same functions as the direct ops */
>  	ftrace_ops_init(&tmp_ops);
>  	tmp_ops.func_hash = ops->func_hash;
> +	tmp_ops.direct_call = addr;
>  
>  	err = register_ftrace_function_nolock(&tmp_ops);
>  	if (err)
> @@ -5466,6 +5467,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  			entry->direct = addr;
>  		}
>  	}
> +	ops->direct_call = addr;

AFAICT we don't synchronize threads when installing the tmp_ops, so IIUC on
arches with call_ops, there could be a a thread in the middle of ftrace_caller
which has loaded the ops pointer from the patch-site, but hasn't yet loaded the
ops::direct_func pointer, and could race with this assignment.

Given that, I think this needs to be:

	WRITE_ONCE(ops->direct_call, addr);

... in order to avoid the risk of the store being torn, and the ftrace_caller
trampoline loading a corrupted pointer.

Other than that, this looks good to me!

Thanks,
Mark.

>  
>  	mutex_unlock(&ftrace_lock);
>  
> -- 
> 2.39.1.519.gcb327c4b5f-goog
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  2023-02-01 16:34 ` [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS Florent Revest
@ 2023-02-02 15:54   ` Mark Rutland
  2023-02-02 16:56     ` Mark Rutland
  2023-02-02 18:18     ` Florent Revest
  0 siblings, 2 replies; 35+ messages in thread
From: Mark Rutland @ 2023-02-02 15:54 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Wed, Feb 01, 2023 at 05:34:17PM +0100, Florent Revest wrote:
> Direct called trampolines can be called in two ways:
> - either from the ftrace callsite. In this case, they do not access any
>   struct ftrace_regs nor pt_regs
> - Or, if a ftrace ops is also attached, from the end of a ftrace
>   trampoline. In this case, the call_direct_funcs ops is in charge of
>   setting the direct call trampoline's address in a struct ftrace_regs
> 
> Since "ftrace: pass fregs to arch_ftrace_set_direct_caller()", the later
> case no longer requires a full pt_regs.

Minor nit, but could we please have the commit ID, e.g.

| Since commit:
| 
|   9705bc70960459ae ("ftrace: pass fregs to arch_ftrace_set_direct_caller()")
| 
| The latter case no longer requires a full pt_regs.

> It only needs a struct
> ftrace_regs so DIRECT_CALLS can work with both WITH_ARGS or WITH_REGS.
> With architectures like arm64 already abandoning WITH_REGS in favor of
> WITH_ARGS, it's important to have DIRECT_CALLS work WITH_ARGS only.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>  kernel/trace/Kconfig  | 2 +-
>  kernel/trace/ftrace.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index 5df427a2321d..4496a7c69810 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -257,7 +257,7 @@ config DYNAMIC_FTRACE_WITH_REGS
>  
>  config DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>  	def_bool y
> -	depends on DYNAMIC_FTRACE_WITH_REGS
> +	depends on DYNAMIC_FTRACE_WITH_REGS || DYNAMIC_FTRACE_WITH_ARGS
>  	depends on HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>  
>  config DYNAMIC_FTRACE_WITH_CALL_OPS
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index b0426de11c45..73b6f6489ba1 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -5282,7 +5282,7 @@ static LIST_HEAD(ftrace_direct_funcs);
>  
>  static int register_ftrace_function_nolock(struct ftrace_ops *ops);
>  
> -#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS)
> +#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT)

Unfortunately, I think this is broken for architectures where:

* DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
* DYNAMIC_FTRACE_WITH_REGS=y
* DYNAMIC_FTRACE_WITH_ARGS=n

... since those might pass a NULL ftrace_regs around, and so when using the
list ops arch_ftrace_set_direct_caller() might blow up accessing an element of
ftrace_regs.

It looks like 32-bit x86 is the only case with that combination, and its
ftrace_caller implementation passes a NULL regs, so I reckon that'll blow up.
However, it looks like there aren't any FTRACE_DIRECT samples wired up for
32-bit x86, so I'm not aware of a test case we can use.

We could make the FTRACE_OPS_FL_SAVE_REGS flag conditional, or we could
implement DYNAMIC_FTRACE_WITH_ARGS for 32-bit x86 and have
DYNAMIC_FTRACE_WITH_DIRECT_CALLS depend solely on that.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  2023-02-02 15:54   ` Mark Rutland
@ 2023-02-02 16:56     ` Mark Rutland
  2023-02-02 18:19       ` Florent Revest
  2023-02-02 18:18     ` Florent Revest
  1 sibling, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2023-02-02 16:56 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Thu, Feb 02, 2023 at 03:54:33PM +0000, Mark Rutland wrote:
> On Wed, Feb 01, 2023 at 05:34:17PM +0100, Florent Revest wrote:
> > -#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS)
> > +#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT)
> 
> Unfortunately, I think this is broken for architectures where:
> 
> * DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
> * DYNAMIC_FTRACE_WITH_REGS=y
> * DYNAMIC_FTRACE_WITH_ARGS=n
> 
> ... since those might pass a NULL ftrace_regs around, and so when using the
> list ops arch_ftrace_set_direct_caller() might blow up accessing an element of
> ftrace_regs.
> 
> It looks like 32-bit x86 is the only case with that combination, and its
> ftrace_caller implementation passes a NULL regs, so I reckon that'll blow up.
> However, it looks like there aren't any FTRACE_DIRECT samples wired up for
> 32-bit x86, so I'm not aware of a test case we can use.

FWIW, the FTRACE_STARTUP_TEST tickles this:

[    1.896209] Testing tracer function_graph: 
[    2.900282] BUG: kernel NULL pointer dereference, address: 0000002c
[    2.901171] #PF: supervisor write access in kernel mode
[    2.901171] #PF: error_code(0x0002) - not-present page
[    2.901171] *pde = 00000000 
[    2.901171] Oops: 0002 [#1] PREEMPT SMP
[    2.901171] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc3-00014-gcfd6340c71ce #1
[    2.901171] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[    2.901171] EIP: call_direct_funcs+0xd/0x1c
[    2.901171] Code: 00 00 00 00 90 a9 00 00 00 01 0f 84 d7 fe ff ff 0d 00 00 80 00 89 46 04 e9 d2 fe ff ff 8b 41 64 85 c0 74 11 55 89 e5 8b 55 08 <89> 42 2c 5d c3 8d b6 00 00 00 00 c3 8d 76 00 89 c1 89 b
[    2.901171] EAX: cc3620e8 EBX: c1147e44 ECX: c1147e44 EDX: 00000000
[    2.901171] ESI: fffffeff EDI: cc354208 EBP: c1147dbc ESP: c1147dbc
[    2.901171] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010286
[    2.901171] CR0: 80050033 CR2: 0000002c CR3: 0d703000 CR4: 00350ed0
[    2.901171] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[    2.901171] DR6: fffe0ff0 DR7: 00000400
[    2.901171] Call Trace:
[    2.901171]  arch_ftrace_ops_list_func+0xf5/0x1bc
[    2.901171]  ? ftrace_enable_ftrace_graph_caller+0x3b/0x44
[    2.901171]  ? trace_selftest_startup_function_graph+0x1d9/0x298
[    2.901171]  ? syscall_unregfunc+0xa0/0xa0
[    2.901171]  ftrace_call+0x5/0x13
[    2.901171]  trace_selftest_dynamic_test_func+0x5/0xc
[    2.901171]  trace_selftest_startup_function_graph+0x1d9/0x298
[    2.901171]  ? trace_selftest_dynamic_test_func+0x5/0xc
[    2.901171]  ? trace_selftest_startup_function_graph+0x1d9/0x298
[    2.901171]  ? ftrace_check_record+0x340/0x340
[    2.901171]  ? ftrace_check_record+0x340/0x340
[    2.901171]  ? ftrace_stub_graph+0x4/0x4
[    2.901171]  ? trace_selftest_test_regs_func+0x18/0x18
[    2.901171]  run_tracer_selftest+0x7d/0x1bc
[    2.901171]  ? graph_depth_read+0x90/0x90
[    2.901171]  register_tracer+0xd3/0x284
[    2.901171]  ? register_trace_event+0xf6/0x180
[    2.901171]  ? init_graph_tracefs+0x38/0x38
[    2.901171]  init_graph_trace+0x56/0x78
[    2.901171]  do_one_initcall+0x53/0x204
[    2.901171]  ? parse_args+0x143/0x3ec
[    2.901171]  ? __kmem_cache_alloc_node+0x2d/0x224
[    2.901171]  kernel_init_freeable+0x198/0x2bc
[    2.901171]  ? rdinit_setup+0x30/0x30
[    2.901171]  ? rest_init+0xb0/0xb0
[    2.901171]  kernel_init+0x1a/0x1d0
[    2.901171]  ? schedule_tail_wrapper+0x9/0xc
[    2.901171]  ret_from_fork+0x1c/0x28
[    2.901171] Modules linked in:
[    2.901171] CR2: 000000000000002c
[    2.901171] ---[ end trace 0000000000000000 ]---
[    2.901171] EIP: call_direct_funcs+0xd/0x1c
[    2.901171] Code: 00 00 00 00 90 a9 00 00 00 01 0f 84 d7 fe ff ff 0d 00 00 80 00 89 46 04 e9 d2 fe ff ff 8b 41 64 85 c0 74 11 55 89 e5 8b 55 08 <89> 42 2c 5d c3 8d b6 00 00 00 00 c3 8d 76 00 89 c1 89 b
[    2.901171] EAX: cc3620e8 EBX: c1147e44 ECX: c1147e44 EDX: 00000000
[    2.901171] ESI: fffffeff EDI: cc354208 EBP: c1147dbc ESP: c1147dbc
[    2.901171] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010286
[    2.901171] CR0: 80050033 CR2: 0000002c CR3: 0d703000 CR4: 00350ed0
[    2.901171] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[    2.901171] DR6: fffe0ff0 DR7: 00000400
[    2.901171] note: swapper/0[1] exited with preempt_count 1
[    2.901175] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    2.902171] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---

The below diff solved that for me.

Thanks,
Mark.

---->8----
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 84f717f8959e..3d2156e335d7 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -241,6 +241,12 @@ enum {
        FTRACE_OPS_FL_DIRECT                    = BIT(17),
 };
 
+#ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
+#define FTRACE_OPS_FL_SAVE_ARGS                        FTRACE_OPS_FL_SAVE_REGS
+#else
+#define FTRACE_OPS_FL_SAVE_ARGS                        0
+#endif
+
 /*
  * FTRACE_OPS_CMD_* commands allow the ftrace core logic to request changes
  * to a ftrace_ops. Note, the requests may fail.
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 73b6f6489ba1..8e739303b6a2 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5282,7 +5282,7 @@ static LIST_HEAD(ftrace_direct_funcs);
 
 static int register_ftrace_function_nolock(struct ftrace_ops *ops);
 
-#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT)
+#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_ARGS)
 
 static int check_direct_multi(struct ftrace_ops *ops)
 {


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/8] Add ftrace direct call for arm64
  2023-02-02 10:50   ` Daniel Borkmann
@ 2023-02-02 17:32     ` Florent Revest
  0 siblings, 0 replies; 35+ messages in thread
From: Florent Revest @ 2023-02-02 17:32 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Xu Kuohai, linux-arm-kernel, linux-kernel, linux-trace-kernel,
	bpf, catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	andrii, kpsingh, jolsa, xukuohai, Manu Bretelle

On Thu, Feb 2, 2023 at 11:50 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 2/2/23 9:36 AM, Xu Kuohai wrote:
> > On 2/2/2023 12:34 AM, Florent Revest wrote:
> >> This series adds ftrace direct call support to arm64.
> >> This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.
> >>
> >> It is meant to apply on top of the arm64 tree which contains Mark Rutland's
> >> series on CALL_OPS [1] under the for-next/ftrace tag.
> >> > The first three patches consolidate the two existing ftrace APIs for registering
> >> direct calls. They are split to make the reviewers lives easier but if it'd be a
> >> preferred style, I'd be happy to squash them in the next revision.
> >> Currently, there is both a _ftrace_direct and _ftrace_direct_multi API. Apart
> >> from samples and selftests, there are no users of the _ftrace_direct API left
> >> in-tree so this deletes it and renames the _ftrace_direct_multi API to
> >> _ftrace_direct for simplicity.
> >>
> >> The main benefit of this refactoring is that, with the API that's left, an
> >> ftrace_ops backing a direct call will only ever point to one direct call. We can
> >> therefore store the direct called trampoline address in the ops (patch 4) and
> >> look it up from the ftrace trampoline on arm64 (patch 7) in the case when the
> >> destination would be out of reach of a BL instruction at the ftrace callsite.
> >> (in this case, ftrace_caller acts as a lightweight intermediary trampoline)
> >>
> >> This series has been tested on both arm64 and x86_64 with:
> >> 1- CONFIG_FTRACE_SELFTEST (cf: patch 6)
> >> 2- samples/ftrace/*.ko (cf: patch 8)
> >> 3- tools/testing/selftests/bpf/test_progs (both -t lsm and -t fentry_fexit)
>
> Thanks a ton for working on this!
>
> > so it's time to update DENYLIST.aarch64 to unblock tests that failed due to lack of direct call.

That's a good point Xu, thanks! I'll update the deny list in my next revision.
It looks like this series fixes *a lot* of these tests, so that's exciting. :)

> +1, with regards to logistics, if possible it might be nice to eventually gets
> this into a feature branch on arm64 tree, then we could pull it too from there
> for bpf-next and hash out the BPF CI bits for arm64 in the meantime.

I believe that Manu Bretelle already wired up the BPF CI for arm64, is
there more work required ?
Regarding the logistics, whatever works sgtm... :) I suppose it's up
to Catalin or Will.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  2023-02-02 15:01   ` Mark Rutland
@ 2023-02-02 17:37     ` Florent Revest
  2023-02-07 15:21       ` Florent Revest
  0 siblings, 1 reply; 35+ messages in thread
From: Florent Revest @ 2023-02-02 17:37 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Thu, Feb 2, 2023 at 4:02 PM Mark Rutland <mark.rutland@arm.com> wrote:
> Looking at samples/ftrace/, as of this patch we have a few samples that are
> almost identical, modulo the function being traced, and some different register
> shuffling for arguments:
>
> * ftrace-direct.c and ftrace-direct-multi.c
> * ftrace-direct-modify.c and ftrace-direct-modify
>
> ... perhaps it would be better to just delete the !multi versions ?

The multi versions hook two functions and the !multi hook just one but
I agree that this granularity in coverage is probably just a
maintenance burden and doesn't help with much! :)
I'll delete the !multi in v2, as part of the patch 2 and patch 1 will
just migrate the selftest to use the multi API.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/8] ftrace: Store direct called addresses in their ops
  2023-02-02 15:29   ` Mark Rutland
@ 2023-02-02 17:41     ` Florent Revest
  0 siblings, 0 replies; 35+ messages in thread
From: Florent Revest @ 2023-02-02 17:41 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Thu, Feb 2, 2023 at 4:30 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Wed, Feb 01, 2023 at 05:34:16PM +0100, Florent Revest wrote:
> > All direct calls are now registered using the register_ftrace_direct API
> > so each ops can jump to only one direct-called trampoline.
> >
> > By storing the direct called trampoline address directly in the ops we
> > can save one hashmap lookup in the direct call ops and implement arm64
> > direct calls on top of call ops.
> >
> > Signed-off-by: Florent Revest <revest@chromium.org>
> > ---
> >  include/linux/ftrace.h | 3 +++
> >  kernel/trace/ftrace.c  | 6 ++++--
> >  2 files changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > index a7dbd307c3a4..84f717f8959e 100644
> > --- a/include/linux/ftrace.h
> > +++ b/include/linux/ftrace.h
> > @@ -321,6 +321,9 @@ struct ftrace_ops {
> >       unsigned long                   trampoline_size;
> >       struct list_head                list;
> >       ftrace_ops_func_t               ops_func;
> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> > +     unsigned long                   direct_call;
> > +#endif
> >  #endif
> >  };
> >
> > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> > index cb77a0a208c7..b0426de11c45 100644
> > --- a/kernel/trace/ftrace.c
> > +++ b/kernel/trace/ftrace.c
> > @@ -2577,9 +2577,8 @@ ftrace_add_rec_direct(unsigned long ip, unsigned long addr,
> >  static void call_direct_funcs(unsigned long ip, unsigned long pip,
> >                             struct ftrace_ops *ops, struct ftrace_regs *fregs)
> >  {
> > -     unsigned long addr;
> > +     unsigned long addr = ops->direct_call;
> >
> > -     addr = ftrace_find_rec_direct(ip);
> >       if (!addr)
> >               return;
> >
> > @@ -5375,6 +5374,7 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
> >       ops->func = call_direct_funcs;
> >       ops->flags = MULTI_FLAGS;
> >       ops->trampoline = FTRACE_REGS_ADDR;
> > +     ops->direct_call = addr;
> >
> >       err = register_ftrace_function_nolock(ops);
> >
> > @@ -5445,6 +5445,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
> >       /* Enable the tmp_ops to have the same functions as the direct ops */
> >       ftrace_ops_init(&tmp_ops);
> >       tmp_ops.func_hash = ops->func_hash;
> > +     tmp_ops.direct_call = addr;
> >
> >       err = register_ftrace_function_nolock(&tmp_ops);
> >       if (err)
> > @@ -5466,6 +5467,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
> >                       entry->direct = addr;
> >               }
> >       }
> > +     ops->direct_call = addr;
>
> AFAICT we don't synchronize threads when installing the tmp_ops, so IIUC on
> arches with call_ops, there could be a a thread in the middle of ftrace_caller
> which has loaded the ops pointer from the patch-site, but hasn't yet loaded the
> ops::direct_func pointer, and could race with this assignment.
>
> Given that, I think this needs to be:
>
>         WRITE_ONCE(ops->direct_call, addr);
>
> ... in order to avoid the risk of the store being torn, and the ftrace_caller
> trampoline loading a corrupted pointer.

Good point, I'll do that in v2.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  2023-02-02 15:54   ` Mark Rutland
  2023-02-02 16:56     ` Mark Rutland
@ 2023-02-02 18:18     ` Florent Revest
  1 sibling, 0 replies; 35+ messages in thread
From: Florent Revest @ 2023-02-02 18:18 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Thu, Feb 2, 2023 at 4:54 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Wed, Feb 01, 2023 at 05:34:17PM +0100, Florent Revest wrote:
> > Direct called trampolines can be called in two ways:
> > - either from the ftrace callsite. In this case, they do not access any
> >   struct ftrace_regs nor pt_regs
> > - Or, if a ftrace ops is also attached, from the end of a ftrace
> >   trampoline. In this case, the call_direct_funcs ops is in charge of
> >   setting the direct call trampoline's address in a struct ftrace_regs
> >
> > Since "ftrace: pass fregs to arch_ftrace_set_direct_caller()", the later
> > case no longer requires a full pt_regs.
>
> Minor nit, but could we please have the commit ID, e.g.
>
> | Since commit:
> |
> |   9705bc70960459ae ("ftrace: pass fregs to arch_ftrace_set_direct_caller()")
> |
> | The latter case no longer requires a full pt_regs.

Sure thing, will do!

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  2023-02-02 16:56     ` Mark Rutland
@ 2023-02-02 18:19       ` Florent Revest
  2023-02-03 10:03         ` Mark Rutland
  0 siblings, 1 reply; 35+ messages in thread
From: Florent Revest @ 2023-02-02 18:19 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Thu, Feb 2, 2023 at 5:57 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Thu, Feb 02, 2023 at 03:54:33PM +0000, Mark Rutland wrote:
> > On Wed, Feb 01, 2023 at 05:34:17PM +0100, Florent Revest wrote:
> > > -#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS)
> > > +#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT)
> >
> > Unfortunately, I think this is broken for architectures where:
> >
> > * DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
> > * DYNAMIC_FTRACE_WITH_REGS=y
> > * DYNAMIC_FTRACE_WITH_ARGS=n
> >
> > ... since those might pass a NULL ftrace_regs around, and so when using the
> > list ops arch_ftrace_set_direct_caller() might blow up accessing an element of
> > ftrace_regs.
> >
> > It looks like 32-bit x86 is the only case with that combination, and its
> > ftrace_caller implementation passes a NULL regs, so I reckon that'll blow up.
> > However, it looks like there aren't any FTRACE_DIRECT samples wired up for
> > 32-bit x86, so I'm not aware of a test case we can use.
>
> FWIW, the FTRACE_STARTUP_TEST tickles this:

Good catch and thanks for reproducing the bug too!

> [    1.896209] Testing tracer function_graph:
> [    2.900282] BUG: kernel NULL pointer dereference, address: 0000002c
> [    2.901171] #PF: supervisor write access in kernel mode
> [    2.901171] #PF: error_code(0x0002) - not-present page
> [    2.901171] *pde = 00000000
> [    2.901171] Oops: 0002 [#1] PREEMPT SMP
> [    2.901171] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc3-00014-gcfd6340c71ce #1
> [    2.901171] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> [    2.901171] EIP: call_direct_funcs+0xd/0x1c
> [    2.901171] Code: 00 00 00 00 90 a9 00 00 00 01 0f 84 d7 fe ff ff 0d 00 00 80 00 89 46 04 e9 d2 fe ff ff 8b 41 64 85 c0 74 11 55 89 e5 8b 55 08 <89> 42 2c 5d c3 8d b6 00 00 00 00 c3 8d 76 00 89 c1 89 b
> [    2.901171] EAX: cc3620e8 EBX: c1147e44 ECX: c1147e44 EDX: 00000000
> [    2.901171] ESI: fffffeff EDI: cc354208 EBP: c1147dbc ESP: c1147dbc
> [    2.901171] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010286
> [    2.901171] CR0: 80050033 CR2: 0000002c CR3: 0d703000 CR4: 00350ed0
> [    2.901171] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [    2.901171] DR6: fffe0ff0 DR7: 00000400
> [    2.901171] Call Trace:
> [    2.901171]  arch_ftrace_ops_list_func+0xf5/0x1bc
> [    2.901171]  ? ftrace_enable_ftrace_graph_caller+0x3b/0x44
> [    2.901171]  ? trace_selftest_startup_function_graph+0x1d9/0x298
> [    2.901171]  ? syscall_unregfunc+0xa0/0xa0
> [    2.901171]  ftrace_call+0x5/0x13
> [    2.901171]  trace_selftest_dynamic_test_func+0x5/0xc
> [    2.901171]  trace_selftest_startup_function_graph+0x1d9/0x298
> [    2.901171]  ? trace_selftest_dynamic_test_func+0x5/0xc
> [    2.901171]  ? trace_selftest_startup_function_graph+0x1d9/0x298
> [    2.901171]  ? ftrace_check_record+0x340/0x340
> [    2.901171]  ? ftrace_check_record+0x340/0x340
> [    2.901171]  ? ftrace_stub_graph+0x4/0x4
> [    2.901171]  ? trace_selftest_test_regs_func+0x18/0x18
> [    2.901171]  run_tracer_selftest+0x7d/0x1bc
> [    2.901171]  ? graph_depth_read+0x90/0x90
> [    2.901171]  register_tracer+0xd3/0x284
> [    2.901171]  ? register_trace_event+0xf6/0x180
> [    2.901171]  ? init_graph_tracefs+0x38/0x38
> [    2.901171]  init_graph_trace+0x56/0x78
> [    2.901171]  do_one_initcall+0x53/0x204
> [    2.901171]  ? parse_args+0x143/0x3ec
> [    2.901171]  ? __kmem_cache_alloc_node+0x2d/0x224
> [    2.901171]  kernel_init_freeable+0x198/0x2bc
> [    2.901171]  ? rdinit_setup+0x30/0x30
> [    2.901171]  ? rest_init+0xb0/0xb0
> [    2.901171]  kernel_init+0x1a/0x1d0
> [    2.901171]  ? schedule_tail_wrapper+0x9/0xc
> [    2.901171]  ret_from_fork+0x1c/0x28
> [    2.901171] Modules linked in:
> [    2.901171] CR2: 000000000000002c
> [    2.901171] ---[ end trace 0000000000000000 ]---
> [    2.901171] EIP: call_direct_funcs+0xd/0x1c
> [    2.901171] Code: 00 00 00 00 90 a9 00 00 00 01 0f 84 d7 fe ff ff 0d 00 00 80 00 89 46 04 e9 d2 fe ff ff 8b 41 64 85 c0 74 11 55 89 e5 8b 55 08 <89> 42 2c 5d c3 8d b6 00 00 00 00 c3 8d 76 00 89 c1 89 b
> [    2.901171] EAX: cc3620e8 EBX: c1147e44 ECX: c1147e44 EDX: 00000000
> [    2.901171] ESI: fffffeff EDI: cc354208 EBP: c1147dbc ESP: c1147dbc
> [    2.901171] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010286
> [    2.901171] CR0: 80050033 CR2: 0000002c CR3: 0d703000 CR4: 00350ed0
> [    2.901171] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [    2.901171] DR6: fffe0ff0 DR7: 00000400
> [    2.901171] note: swapper/0[1] exited with preempt_count 1
> [    2.901175] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
> [    2.902171] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---
>
> The below diff solved that for me.
>
> Thanks,
> Mark.
>
> ---->8----
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 84f717f8959e..3d2156e335d7 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -241,6 +241,12 @@ enum {
>         FTRACE_OPS_FL_DIRECT                    = BIT(17),
>  };
>
> +#ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
> +#define FTRACE_OPS_FL_SAVE_ARGS                        FTRACE_OPS_FL_SAVE_REGS
> +#else
> +#define FTRACE_OPS_FL_SAVE_ARGS                        0

Mh, could we (theoretically) be in a situation where an arch supports
WITH_ARGS but it also has two ftrace_caller trampolines: one that
saves the args and the other that saves nothing ? (For example if x86
migrates their WITH_REGS to WITH_ARGS only)
Wouldn't it make sense then to define FTRACE_OPS_FL_SAVE_ARGS as an
extra bit to tell ftrace that we need the args, similarly to
FTRACE_OPS_FL_SAVE_REGS ?

If that can't happen or if we want to leave this up for later, the
patch lgtm and I can squash it into my patch 5 in v2. ;)

> +#endif
> +
>  /*
>   * FTRACE_OPS_CMD_* commands allow the ftrace core logic to request changes
>   * to a ftrace_ops. Note, the requests may fail.
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 73b6f6489ba1..8e739303b6a2 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -5282,7 +5282,7 @@ static LIST_HEAD(ftrace_direct_funcs);
>
>  static int register_ftrace_function_nolock(struct ftrace_ops *ops);
>
> -#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT)
> +#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_ARGS)
>
>  static int check_direct_multi(struct ftrace_ops *ops)
>  {
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 6/8] ftrace: Fix dead loop caused by direct call in ftrace selftest
  2023-02-01 16:34 ` [PATCH 6/8] ftrace: Fix dead loop caused by direct call in ftrace selftest Florent Revest
@ 2023-02-02 19:03   ` Mark Rutland
  2023-02-03 12:35     ` Florent Revest
  0 siblings, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2023-02-02 19:03 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai, Xu Kuohai, Li Huafei

On Wed, Feb 01, 2023 at 05:34:18PM +0100, Florent Revest wrote:
> From: Xu Kuohai <xukuohai@huawei.com>
> 
> After direct call is enabled for arm64, ftrace selftest enters a
> dead loop:
> 
> <trace_selftest_dynamic_test_func>:
> 00  bti     c
> 01  mov     x9, x30                            <trace_direct_tramp>:
> 02  bl      <trace_direct_tramp>    ---------->     ret
>                                                      |
>                                          lr/x30 is 03, return to 03
>                                                      |
> 03  mov     w0, #0x0   <-----------------------------|
>      |                                               |
>      |                   dead loop!                  |
>      |                                               |
> 04  ret   ---- lr/x30 is still 03, go back to 03 ----|
> 
> The reason is that when the direct caller trace_direct_tramp() returns
> to the patched function trace_selftest_dynamic_test_func(), lr is still
> the address after the instrumented instruction in the patched function,
> so when the patched function exits, it returns to itself!
> 
> To fix this issue, we need to restore lr before trace_direct_tramp()
> exits, so use a dedicated trace_direct_tramp() for arm64.
> 
> Reported-by: Li Huafei <lihuafei1@huawei.com>
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> Signed-off-by: Florent Revest <revest@chromium.org>

Looking at this, I don't think that the existing trace_direct_tramp makes
sense -- in general a C function doesn't follow the direct call trampoline
calling convention, and cannot be used as a direct call trampoline.

Looking further, there's a distinct latent issue on s390, where the mismatch
between their regular calling convention and their direct call trampoline
calling convention means that trace_direct_tramp() returns into the *caller* of
the instrumented function, skipping that entirely (verified locally with QEMU
and printk()s added to DYN_FTRACE_TEST_NAME() / DYN_FTRACE_TEST_NAME2()
functions).

I think it'd be much better to do something like the below as a preparatory
cleanup (tested on s390 under QEMU).

Thanks,
Mark.

---->8----
From 3b3abc89fe014ea49282622c4312521b710d1463 Mon Sep 17 00:00:00 2001
From: Mark Rutland <mark.rutland@arm.com>
Date: Thu, 2 Feb 2023 18:37:40 +0000
Subject: [PATCH] ftrace: selftest: remove broken trace_direct_tramp

The ftrace selftest code has a trace_direct_tramp() function which it
uses as a direct call trampoline. This happens to work on x86, since the
direct call's return address is in the usual place, and can be returned
to via a RET, but in general the calling convention for direct calls is
different from regular function calls, and requires a trampoline written
in assembly.

On s390, regular function calls place the return address in %r14, and an
ftrace patch-site in an instrumented function places the trampoline's
return address (which is within the instrumented function) in %r0,
preserving the original %r14 value in-place. As a regular C function
will return to the address in %r14, using a C function as the trampoline
results in the trampoline returning to the caller of the instrumented
function, skipping the body of the instrumented function.

Note that the s390 issue is not detcted by the ftrace selftest code, as
the instrumented function is trivial, and returning back into the caller
happens to be equivalent.

On arm64, regular function calls place the return address in x30, and
an ftrace patch-site in an instrumented function saves this into r9
and places the trampoline's return address (within the instrumented
function) in x30. A regular C function will return to the address in
x30, but will not restore x9 into x30. Consequently, using a C function
as the trampoline results in returning to the trampoline's return
address having corrupted x30, such that when the instrumented function
returns, it will return back into itself.

To avoid future issues in this area, remove the trace_direct_tramp()
function, and require that each architecture with direct calls provides
a stub trampoline, named ftrace_stub_direct_tramp. This can be written
to handle the architecture's trampoline calling convention, and in
future could be used elsewhere (e.g. in the ftrace ops sample, to
measure the overhead of direct calls), so we may as well always build it
in.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Xu Kuohai <xukuohai@huawei.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Florent Revest <revest@chromium.org>
---
 arch/s390/kernel/mcount.S     |  5 +++++
 arch/x86/kernel/ftrace_32.S   |  5 +++++
 arch/x86/kernel/ftrace_64.S   |  4 ++++
 include/linux/ftrace.h        |  2 ++
 kernel/trace/trace_selftest.c | 15 ++-------------
 5 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/arch/s390/kernel/mcount.S b/arch/s390/kernel/mcount.S
index 4786bfe02144..ad13a0e2c307 100644
--- a/arch/s390/kernel/mcount.S
+++ b/arch/s390/kernel/mcount.S
@@ -32,6 +32,11 @@ ENTRY(ftrace_stub)
 	BR_EX	%r14
 ENDPROC(ftrace_stub)
 
+SYM_CODE_START(ftrace_stub_direct_tramp)
+	lgr	%r1, %r0
+	BR_EX	%r1
+SYM_CODE_END(ftrace_stub_direct_tramp)
+
 	.macro	ftrace_regs_entry, allregs=0
 	stg	%r14,(__SF_GPRS+8*8)(%r15)	# save traced function caller
 
diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
index a0ed0e4a2c0c..0d9a14528176 100644
--- a/arch/x86/kernel/ftrace_32.S
+++ b/arch/x86/kernel/ftrace_32.S
@@ -163,6 +163,11 @@ SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL)
 	jmp	.Lftrace_ret
 SYM_CODE_END(ftrace_regs_caller)
 
+SYM_FUNC_START(ftrace_stub_direct_tramp)
+	CALL_DEPTH_ACCOUNT
+	RET
+SYM_FUNC_END(ftrace_stub_direct_tramp)
+
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 SYM_CODE_START(ftrace_graph_caller)
 	pushl	%eax
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 1265ad519249..8fc77e3e039c 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -307,6 +307,10 @@ SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL)
 SYM_FUNC_END(ftrace_regs_caller)
 STACK_FRAME_NON_STANDARD_FP(ftrace_regs_caller)
 
+SYM_FUNC_START(ftrace_stub_direct_tramp)
+	CALL_DEPTH_ACCOUNT
+	RET
+SYM_FUNC_END(ftrace_stub_direct_tramp)
 
 #else /* ! CONFIG_DYNAMIC_FTRACE */
 
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 3d2156e335d7..a9836b40630e 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -412,6 +412,8 @@ int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
 int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
 int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr);
 
+void ftrace_stub_direct_tramp(void *);
+
 #else
 struct ftrace_ops;
 # define ftrace_direct_func_count 0
diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 06218fc9374b..e6530b7b42e4 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -784,17 +784,6 @@ static struct fgraph_ops fgraph_ops __initdata  = {
 	.retfunc		= &trace_graph_return,
 };
 
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
-#ifndef CALL_DEPTH_ACCOUNT
-#define CALL_DEPTH_ACCOUNT ""
-#endif
-
-noinline __noclone static void trace_direct_tramp(void)
-{
-	asm(CALL_DEPTH_ACCOUNT);
-}
-#endif
-
 /*
  * Pretty much the same than for the function tracer from which the selftest
  * has been borrowed.
@@ -875,7 +864,7 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 	 */
 	ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
 	ret = register_ftrace_direct(&direct,
-				     (unsigned long)trace_direct_tramp);
+				     (unsigned long)ftrace_stub_direct_tramp);
 	if (ret)
 		goto out;
 
@@ -896,7 +885,7 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 	unregister_ftrace_graph(&fgraph_ops);
 
 	ret = unregister_ftrace_direct(&direct,
-				       (unsigned long)trace_direct_tramp);
+				       (unsigned long)ftrace_stub_direct_tramp);
 	if (ret)
 		goto out;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/8] Add ftrace direct call for arm64
  2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
                   ` (8 preceding siblings ...)
  2023-02-02  8:36 ` [PATCH 0/8] Add ftrace direct call for arm64 Xu Kuohai
@ 2023-02-02 20:06 ` Steven Rostedt
  2023-02-03  9:49   ` Mark Rutland
  9 siblings, 1 reply; 35+ messages in thread
From: Steven Rostedt @ 2023-02-02 20:06 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, mhiramat, mark.rutland, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai

On Wed,  1 Feb 2023 17:34:12 +0100
Florent Revest <revest@chromium.org> wrote:

> It is meant to apply on top of the arm64 tree which contains Mark Rutland's
> series on CALL_OPS [1] under the for-next/ftrace tag.

Just a note for future ftrace patches. Could you add the link to the
arm64 tree, so I don't need to go look for it ;-)

(Yes, I'm lazy)

-- Steve

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/8] Add ftrace direct call for arm64
  2023-02-02 20:06 ` Steven Rostedt
@ 2023-02-03  9:49   ` Mark Rutland
  0 siblings, 0 replies; 35+ messages in thread
From: Mark Rutland @ 2023-02-03  9:49 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Florent Revest, linux-arm-kernel, linux-kernel,
	linux-trace-kernel, bpf, catalin.marinas, will, mhiramat, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai

On Thu, Feb 02, 2023 at 03:06:47PM -0500, Steven Rostedt wrote:
> On Wed,  1 Feb 2023 17:34:12 +0100
> Florent Revest <revest@chromium.org> wrote:
> 
> > It is meant to apply on top of the arm64 tree which contains Mark Rutland's
> > series on CALL_OPS [1] under the for-next/ftrace tag.
> 
> Just a note for future ftrace patches. Could you add the link to the
> arm64 tree, so I don't need to go look for it ;-)

For the benefit of others looking for it now, the arm64 tree lives at:

  https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/
  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git

Mark.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  2023-02-02 18:19       ` Florent Revest
@ 2023-02-03 10:03         ` Mark Rutland
  2023-02-03 11:01           ` Florent Revest
  0 siblings, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2023-02-03 10:03 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Thu, Feb 02, 2023 at 07:19:58PM +0100, Florent Revest wrote:
> On Thu, Feb 2, 2023 at 5:57 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > index 84f717f8959e..3d2156e335d7 100644
> > --- a/include/linux/ftrace.h
> > +++ b/include/linux/ftrace.h
> > @@ -241,6 +241,12 @@ enum {
> >         FTRACE_OPS_FL_DIRECT                    = BIT(17),
> >  };
> >
> > +#ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
> > +#define FTRACE_OPS_FL_SAVE_ARGS                        FTRACE_OPS_FL_SAVE_REGS
> > +#else
> > +#define FTRACE_OPS_FL_SAVE_ARGS                        0
> 
> Mh, could we (theoretically) be in a situation where an arch supports
> WITH_ARGS but it also has two ftrace_caller trampolines: one that
> saves the args and the other that saves nothing ? (For example if x86
> migrates their WITH_REGS to WITH_ARGS only)

I don't think so -- the point of WITH_ARGS is that we always have to
save/restore the args, and when WITH_ARGS is selected they're unconditionally
available (though not necessarily a full pt_regs), which is what other code
assumes when WITH_ARGS is selected.

> Wouldn't it make sense then to define FTRACE_OPS_FL_SAVE_ARGS as an
> extra bit to tell ftrace that we need the args, similarly to
> FTRACE_OPS_FL_SAVE_REGS ?
> 
> If that can't happen or if we want to leave this up for later, the
> patch lgtm and I can squash it into my patch 5 in v2. ;)

I think that can't happen, and for now the above should be fine.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  2023-02-03 10:03         ` Mark Rutland
@ 2023-02-03 11:01           ` Florent Revest
  0 siblings, 0 replies; 35+ messages in thread
From: Florent Revest @ 2023-02-03 11:01 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Fri, Feb 3, 2023 at 11:03 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Thu, Feb 02, 2023 at 07:19:58PM +0100, Florent Revest wrote:
> > On Thu, Feb 2, 2023 at 5:57 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > > index 84f717f8959e..3d2156e335d7 100644
> > > --- a/include/linux/ftrace.h
> > > +++ b/include/linux/ftrace.h
> > > @@ -241,6 +241,12 @@ enum {
> > >         FTRACE_OPS_FL_DIRECT                    = BIT(17),
> > >  };
> > >
> > > +#ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
> > > +#define FTRACE_OPS_FL_SAVE_ARGS                        FTRACE_OPS_FL_SAVE_REGS
> > > +#else
> > > +#define FTRACE_OPS_FL_SAVE_ARGS                        0
> >
> > Mh, could we (theoretically) be in a situation where an arch supports
> > WITH_ARGS but it also has two ftrace_caller trampolines: one that
> > saves the args and the other that saves nothing ? (For example if x86
> > migrates their WITH_REGS to WITH_ARGS only)
>
> I don't think so -- the point of WITH_ARGS is that we always have to
> save/restore the args, and when WITH_ARGS is selected they're unconditionally
> available (though not necessarily a full pt_regs), which is what other code
> assumes when WITH_ARGS is selected.

Perfect then!

> > Wouldn't it make sense then to define FTRACE_OPS_FL_SAVE_ARGS as an
> > extra bit to tell ftrace that we need the args, similarly to
> > FTRACE_OPS_FL_SAVE_REGS ?
> >
> > If that can't happen or if we want to leave this up for later, the
> > patch lgtm and I can squash it into my patch 5 in v2. ;)
>
> I think that can't happen, and for now the above should be fine.

Yep

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 6/8] ftrace: Fix dead loop caused by direct call in ftrace selftest
  2023-02-02 19:03   ` Mark Rutland
@ 2023-02-03 12:35     ` Florent Revest
  2023-02-03 15:37       ` Mark Rutland
  0 siblings, 1 reply; 35+ messages in thread
From: Florent Revest @ 2023-02-03 12:35 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai, Xu Kuohai, Li Huafei

On Thu, Feb 2, 2023 at 8:03 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Wed, Feb 01, 2023 at 05:34:18PM +0100, Florent Revest wrote:
> > From: Xu Kuohai <xukuohai@huawei.com>
> >
> > After direct call is enabled for arm64, ftrace selftest enters a
> > dead loop:
> >
> > <trace_selftest_dynamic_test_func>:
> > 00  bti     c
> > 01  mov     x9, x30                            <trace_direct_tramp>:
> > 02  bl      <trace_direct_tramp>    ---------->     ret
> >                                                      |
> >                                          lr/x30 is 03, return to 03
> >                                                      |
> > 03  mov     w0, #0x0   <-----------------------------|
> >      |                                               |
> >      |                   dead loop!                  |
> >      |                                               |
> > 04  ret   ---- lr/x30 is still 03, go back to 03 ----|
> >
> > The reason is that when the direct caller trace_direct_tramp() returns
> > to the patched function trace_selftest_dynamic_test_func(), lr is still
> > the address after the instrumented instruction in the patched function,
> > so when the patched function exits, it returns to itself!
> >
> > To fix this issue, we need to restore lr before trace_direct_tramp()
> > exits, so use a dedicated trace_direct_tramp() for arm64.
> >
> > Reported-by: Li Huafei <lihuafei1@huawei.com>
> > Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> > Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> > Signed-off-by: Florent Revest <revest@chromium.org>
>
> Looking at this, I don't think that the existing trace_direct_tramp makes
> sense -- in general a C function doesn't follow the direct call trampoline
> calling convention, and cannot be used as a direct call trampoline.

Agreed.

> Looking further, there's a distinct latent issue on s390, where the mismatch
> between their regular calling convention and their direct call trampoline
> calling convention means that trace_direct_tramp() returns into the *caller* of
> the instrumented function, skipping that entirely (verified locally with QEMU
> and printk()s added to DYN_FTRACE_TEST_NAME() / DYN_FTRACE_TEST_NAME2()
> functions).

Good catch! I didn't dare to adventure that far into s390 :)

> I think it'd be much better to do something like the below as a preparatory
> cleanup (tested on s390 under QEMU).

Thanks, that looks great to me. I'll make it a part of the series in v2 then.
Unless it's preferred that this gets merged separately?

> Thanks,
> Mark.
>
> ---->8----
> From 3b3abc89fe014ea49282622c4312521b710d1463 Mon Sep 17 00:00:00 2001
> From: Mark Rutland <mark.rutland@arm.com>
> Date: Thu, 2 Feb 2023 18:37:40 +0000
> Subject: [PATCH] ftrace: selftest: remove broken trace_direct_tramp
>
> The ftrace selftest code has a trace_direct_tramp() function which it
> uses as a direct call trampoline. This happens to work on x86, since the
> direct call's return address is in the usual place, and can be returned
> to via a RET, but in general the calling convention for direct calls is
> different from regular function calls, and requires a trampoline written
> in assembly.
>
> On s390, regular function calls place the return address in %r14, and an
> ftrace patch-site in an instrumented function places the trampoline's
> return address (which is within the instrumented function) in %r0,
> preserving the original %r14 value in-place. As a regular C function
> will return to the address in %r14, using a C function as the trampoline
> results in the trampoline returning to the caller of the instrumented
> function, skipping the body of the instrumented function.
>
> Note that the s390 issue is not detcted by the ftrace selftest code, as
> the instrumented function is trivial, and returning back into the caller
> happens to be equivalent.
>
> On arm64, regular function calls place the return address in x30, and
> an ftrace patch-site in an instrumented function saves this into r9
> and places the trampoline's return address (within the instrumented
> function) in x30. A regular C function will return to the address in
> x30, but will not restore x9 into x30. Consequently, using a C function
> as the trampoline results in returning to the trampoline's return
> address having corrupted x30, such that when the instrumented function
> returns, it will return back into itself.
>
> To avoid future issues in this area, remove the trace_direct_tramp()
> function, and require that each architecture with direct calls provides
> a stub trampoline, named ftrace_stub_direct_tramp. This can be written
> to handle the architecture's trampoline calling convention, and in
> future could be used elsewhere (e.g. in the ftrace ops sample, to
> measure the overhead of direct calls), so we may as well always build it
> in.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Li Huafei <lihuafei1@huawei.com>
> Cc: Xu Kuohai <xukuohai@huawei.com>
> Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
> Cc: Florent Revest <revest@chromium.org>
> ---
>  arch/s390/kernel/mcount.S     |  5 +++++
>  arch/x86/kernel/ftrace_32.S   |  5 +++++
>  arch/x86/kernel/ftrace_64.S   |  4 ++++
>  include/linux/ftrace.h        |  2 ++
>  kernel/trace/trace_selftest.c | 15 ++-------------
>  5 files changed, 18 insertions(+), 13 deletions(-)
>
> diff --git a/arch/s390/kernel/mcount.S b/arch/s390/kernel/mcount.S
> index 4786bfe02144..ad13a0e2c307 100644
> --- a/arch/s390/kernel/mcount.S
> +++ b/arch/s390/kernel/mcount.S
> @@ -32,6 +32,11 @@ ENTRY(ftrace_stub)
>         BR_EX   %r14
>  ENDPROC(ftrace_stub)
>
> +SYM_CODE_START(ftrace_stub_direct_tramp)
> +       lgr     %r1, %r0
> +       BR_EX   %r1
> +SYM_CODE_END(ftrace_stub_direct_tramp)
> +
>         .macro  ftrace_regs_entry, allregs=0
>         stg     %r14,(__SF_GPRS+8*8)(%r15)      # save traced function caller
>
> diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
> index a0ed0e4a2c0c..0d9a14528176 100644
> --- a/arch/x86/kernel/ftrace_32.S
> +++ b/arch/x86/kernel/ftrace_32.S
> @@ -163,6 +163,11 @@ SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL)
>         jmp     .Lftrace_ret
>  SYM_CODE_END(ftrace_regs_caller)
>
> +SYM_FUNC_START(ftrace_stub_direct_tramp)
> +       CALL_DEPTH_ACCOUNT
> +       RET
> +SYM_FUNC_END(ftrace_stub_direct_tramp)
> +
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  SYM_CODE_START(ftrace_graph_caller)
>         pushl   %eax
> diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
> index 1265ad519249..8fc77e3e039c 100644
> --- a/arch/x86/kernel/ftrace_64.S
> +++ b/arch/x86/kernel/ftrace_64.S
> @@ -307,6 +307,10 @@ SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL)
>  SYM_FUNC_END(ftrace_regs_caller)
>  STACK_FRAME_NON_STANDARD_FP(ftrace_regs_caller)
>
> +SYM_FUNC_START(ftrace_stub_direct_tramp)
> +       CALL_DEPTH_ACCOUNT
> +       RET
> +SYM_FUNC_END(ftrace_stub_direct_tramp)
>
>  #else /* ! CONFIG_DYNAMIC_FTRACE */
>
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 3d2156e335d7..a9836b40630e 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -412,6 +412,8 @@ int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
>  int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
>  int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr);
>
> +void ftrace_stub_direct_tramp(void *);
> +
>  #else
>  struct ftrace_ops;
>  # define ftrace_direct_func_count 0
> diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
> index 06218fc9374b..e6530b7b42e4 100644
> --- a/kernel/trace/trace_selftest.c
> +++ b/kernel/trace/trace_selftest.c
> @@ -784,17 +784,6 @@ static struct fgraph_ops fgraph_ops __initdata  = {
>         .retfunc                = &trace_graph_return,
>  };
>
> -#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> -#ifndef CALL_DEPTH_ACCOUNT
> -#define CALL_DEPTH_ACCOUNT ""
> -#endif
> -
> -noinline __noclone static void trace_direct_tramp(void)
> -{
> -       asm(CALL_DEPTH_ACCOUNT);
> -}
> -#endif
> -
>  /*
>   * Pretty much the same than for the function tracer from which the selftest
>   * has been borrowed.
> @@ -875,7 +864,7 @@ trace_selftest_startup_function_graph(struct tracer *trace,
>          */
>         ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
>         ret = register_ftrace_direct(&direct,
> -                                    (unsigned long)trace_direct_tramp);
> +                                    (unsigned long)ftrace_stub_direct_tramp);
>         if (ret)
>                 goto out;
>
> @@ -896,7 +885,7 @@ trace_selftest_startup_function_graph(struct tracer *trace,
>         unregister_ftrace_graph(&fgraph_ops);
>
>         ret = unregister_ftrace_direct(&direct,
> -                                      (unsigned long)trace_direct_tramp);
> +                                      (unsigned long)ftrace_stub_direct_tramp);
>         if (ret)
>                 goto out;
>
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 7/8] arm64: ftrace: Add direct call support
  2023-02-01 16:34 ` [PATCH 7/8] arm64: ftrace: Add direct call support Florent Revest
@ 2023-02-03 15:34   ` Mark Rutland
  2023-02-06 16:25     ` Florent Revest
  0 siblings, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2023-02-03 15:34 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Wed, Feb 01, 2023 at 05:34:19PM +0100, Florent Revest wrote:
> This builds up on the CALL_OPS work which extends the ftrace patchsite
> on arm64 with an ops pointer usable by the ftrace trampoline.
> 
> This ops pointer is valid at all time. Indeed, it is either pointing to
> ftrace_list_ops or to the single ops which should be called from that
> patchsite.
> 
> There are a few cases to distinguish:
> - If a direct call ops is the only one tracing a function:
>   - If the direct called trampoline is within the reach of a BL
>     instruction
>      -> the ftrace patchsite jumps to the trampoline
>   - Else
>      -> the ftrace patchsite jumps to the ftrace_caller trampoline which
>         reads the ops pointer in the patchsite and jumps to the direct
>         call address stored in the ops
> - Else
>   -> the ftrace patchsite jumps to the ftrace_caller trampoline and its
>      ops literal points to ftrace_list_ops so it iterates over all
>      registered ftrace ops, including the direct call ops and calls its
>      call_direct_funcs handler which stores the direct called
>      trampoline's address in the ftrace_regs and the ftrace_caller
>      trampoline will return to that address instead of returning to the
>      traced function

This looks pretty good!

Overall I think this is the right shape, but I have a few minor comments that
lead to a bit of churn. I've noted those below, and I've also pushed out a
branch with suggested fixups (as discrete patches) to my
arm64/ftrace/direct-call-testing branch, which you can find at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git
  git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git

Note that's based on a merge of the arm64 tree's for-next/ftrace branch and the
trace tree's trace/for-next branch, and there were a couple of trivial
conflicts I had to fix up when first picking this series (which I've noted in
the affected patches)

Those trees are at:

  # arm64
  https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git

  # trace
  https://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
  git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git

> Signed-off-by: Florent Revest <revest@chromium.org>
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> ---
>  arch/arm64/Kconfig               |  2 ++
>  arch/arm64/include/asm/ftrace.h  | 17 +++++++++
>  arch/arm64/kernel/asm-offsets.c  |  6 ++++
>  arch/arm64/kernel/entry-ftrace.S | 60 +++++++++++++++++++++++---------
>  arch/arm64/kernel/ftrace.c       | 36 ++++++++++++++++---
>  5 files changed, 100 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 6f6f37161cf6..7deafd653c42 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -188,6 +188,8 @@ config ARM64
>  	select HAVE_DYNAMIC_FTRACE
>  	select HAVE_DYNAMIC_FTRACE_WITH_ARGS \
>  		if $(cc-option,-fpatchable-function-entry=2)
> +	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS \
> +		if DYNAMIC_FTRACE_WITH_ARGS && DYNAMIC_FTRACE_WITH_CALL_OPS
>  	select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
>  		if (DYNAMIC_FTRACE_WITH_ARGS && !CFI_CLANG)
>  	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index cf6d9c42ff36..9fb95966b6d5 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -80,6 +80,10 @@ struct ftrace_regs {
>  
>  	unsigned long sp;
>  	unsigned long pc;
> +
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +	unsigned long custom_tramp;

Minor nit, but could we please s/custom_tramp/direct_tramp/?

> +#endif
>  };

I forgot to add a comment here, but we require that sizeof(struct ftrace_regs)
is a multiple of 16, as the AAPCS requires that the stack is 16-byte aligned at
function call boundaries (and on arm64 we currently assume that it's *always*
aligned).

I'd suggest we make this:

| /*
|  * Note: sizeof(struct ftrace_regs) must be a multiple of 16 to ensure correct
|  * stack alignment
|  */
| struct ftrace_regs {
| 	/* x0 - x8 */
| 	unsigned long regs[9];
|
| #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
| 	unsigned long direct_tramp;
| #else
| 	unsigned long __unused;
| #endif
|  
| 	unsigned long fp;
| 	unsigned long lr;
| 
| 	unsigned long sp;
| 	unsigned long pc;
| };

>  
>  static __always_inline unsigned long
> @@ -136,6 +140,19 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
>  void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
>  		       struct ftrace_ops *op, struct ftrace_regs *fregs);
>  #define ftrace_graph_func ftrace_graph_func
> +
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs,
> +						 unsigned long addr)
> +{
> +	/*
> +	 * The ftrace trampoline will return to this address instead of the
> +	 * instrumented function.
> +	 */
> +	fregs->custom_tramp = addr;
> +}
> +#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
> +
>  #endif
>  
>  #define ftrace_return_address(n) return_address(n)
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index ae345b06e9f7..c67f0373b88c 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -93,6 +93,9 @@ int main(void)
>    DEFINE(FREGS_LR,		offsetof(struct ftrace_regs, lr));
>    DEFINE(FREGS_SP,		offsetof(struct ftrace_regs, sp));
>    DEFINE(FREGS_PC,		offsetof(struct ftrace_regs, pc));
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +  DEFINE(FREGS_CUSTOM_TRAMP,	offsetof(struct ftrace_regs, custom_tramp));
> +#endif
>    DEFINE(FREGS_SIZE,		sizeof(struct ftrace_regs));
>    BLANK();
>  #endif
> @@ -197,6 +200,9 @@ int main(void)
>  #endif
>  #ifdef CONFIG_FUNCTION_TRACER
>    DEFINE(FTRACE_OPS_FUNC,		offsetof(struct ftrace_ops, func));
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +  DEFINE(FTRACE_OPS_DIRECT_CALL,	offsetof(struct ftrace_ops, direct_call));
> +#endif
>  #endif
>    return 0;
>  }
> diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
> index 9869debd22fb..0576f38e6362 100644
> --- a/arch/arm64/kernel/entry-ftrace.S
> +++ b/arch/arm64/kernel/entry-ftrace.S
> @@ -36,6 +36,30 @@
>  SYM_CODE_START(ftrace_caller)
>  	bti	c
>  
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
> +	/*
> +	 * The literal pointer to the ops is at an 8-byte aligned boundary
> +	 * which is either 12 or 16 bytes before the BL instruction in the call
> +	 * site. See ftrace_call_adjust() for details.
> +	 *
> +	 * Therefore here the LR points at `literal + 16` or `literal + 20`,
> +	 * and we can find the address of the literal in either case by
> +	 * aligning to an 8-byte boundary and subtracting 16. We do the
> +	 * alignment first as this allows us to fold the subtraction into the
> +	 * LDR.
> +	 */
> +	bic	x11, x30, 0x7
> +	ldr	x11, [x11, #-(4 * AARCH64_INSN_SIZE)]		// op
> +
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +	/* If the op has a direct call address, return to it */
> +	ldr	x12, [x11, #FTRACE_OPS_DIRECT_CALL]		// op->direct_call
> +	cbz	x12, 1f
> +	ret	x12
> +1:
> +#endif

For branching to the direct call trampoline, it would be better to use a
forward branch (i.e. BR), as that won't unbalance return stack predictors.
That'll need to use x16 or x17 to be compatible with a `BTI C` landing pad.

I'd also prefer if we could move the direct call invocation to a stub at the
end of ftrace_caller, e.g.

| SYM_CODE_START(ftrace_caller)
| 	... discover ops pointer here ...
| 
| 	ldr	w17, [x11, #FTRACE_OPS_DIRECT_CALL]
| 	cbnz	ftrace_caller_direct
| 
| 	... usual save and handling logic here ...
| 
| 	// just prior to return 
| 	ldr	x17, [sp, #FREGS_DIRECT_CALL]
| 	cbnz	ftrace_caller_direct_late
| 
| 	...  usual restore logic here ...
| 
| 	ret	x9
| 
| SYM_INNER_LABEL(ftrace_caller_direct_late, SYM_L_LOCAL)
| 	
| 	... shuffle registers for the trampoline here ...
| 
|	// fallthrough to the usual invocation
| SYM_INNER_LABEL(ftrace_caller_direct, SYM_L_LOCAL)
| 	br	x17
| SYM_CODE_END(ftrace_caller)

> +#endif
> +
>  	/* Save original SP */
>  	mov	x10, sp
>  
> @@ -57,6 +81,11 @@ SYM_CODE_START(ftrace_caller)
>  	/* Save the PC after the ftrace callsite */
>  	str	x30, [sp, #FREGS_PC]
>  
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +	/* Set custom_tramp to zero  */
> +	str	xzr, [sp, #FREGS_CUSTOM_TRAMP]
> +#endif
> +
>  	/* Create a frame record for the callsite above the ftrace regs */
>  	stp	x29, x9, [sp, #FREGS_SIZE + 16]
>  	add	x29, sp, #FREGS_SIZE + 16
> @@ -71,20 +100,7 @@ SYM_CODE_START(ftrace_caller)
>  	mov	x3, sp					// regs
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
> -	/*
> -	 * The literal pointer to the ops is at an 8-byte aligned boundary
> -	 * which is either 12 or 16 bytes before the BL instruction in the call
> -	 * site. See ftrace_call_adjust() for details.
> -	 *
> -	 * Therefore here the LR points at `literal + 16` or `literal + 20`,
> -	 * and we can find the address of the literal in either case by
> -	 * aligning to an 8-byte boundary and subtracting 16. We do the
> -	 * alignment first as this allows us to fold the subtraction into the
> -	 * LDR.
> -	 */
> -	bic	x2, x30, 0x7
> -	ldr	x2, [x2, #-16]				// op
> -
> +	mov	x2, x11					// op
>  	ldr	x4, [x2, #FTRACE_OPS_FUNC]		// op->func
>  	blr	x4					// op->func(ip, parent_ip, op, regs)
>  
> @@ -110,12 +126,24 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
>  	/* Restore the callsite's FP, LR, PC */
>  	ldr	x29, [sp, #FREGS_FP]
>  	ldr	x30, [sp, #FREGS_LR]
> -	ldr	x9,  [sp, #FREGS_PC]
> +	ldr	x10, [sp, #FREGS_PC]
> +
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +	ldr	x11, [sp, #FREGS_CUSTOM_TRAMP]
> +	cbz	x11, 1f
> +	/* Set x9 to parent ip before jump to custom trampoline */
> +	mov	x9,  x30
> +	/* Set lr to self ip */
> +	ldr	x30, [sp, #FREGS_PC]
> +	/* Set x10 (used for return address) to custom trampoline */
> +	mov	x10, x11
> +1:
> +#endif
>  
>  	/* Restore the callsite's SP */
>  	add	sp, sp, #FREGS_SIZE + 32
>  
> -	ret	x9
> +	ret	x10
>  SYM_CODE_END(ftrace_caller)
>  
>  #if defined(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) && \
> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> index 5545fe1a9012..758436727fba 100644
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -206,6 +206,13 @@ static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
>  	return NULL;
>  }
>  
> +static bool reachable_by_bl(unsigned long addr, unsigned long pc)
> +{
> +	long offset = (long)addr - (long)pc;
> +
> +	return offset >= -SZ_128M && offset < SZ_128M;
> +}
> +
>  /*
>   * Find the address the callsite must branch to in order to reach '*addr'.
>   *
> @@ -220,14 +227,21 @@ static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
>  				      unsigned long *addr)
>  {
>  	unsigned long pc = rec->ip;
> -	long offset = (long)*addr - (long)pc;
>  	struct plt_entry *plt;
>  
> +	/*
> +	 * If a custom trampoline is unreachable, rely on the ftrace_caller
> +	 * trampoline which knows how to indirectly reach that trampoline
> +	 * through ops->direct_call.
> +	 */
> +	if (*addr != FTRACE_ADDR && !reachable_by_bl(*addr, pc))
> +		*addr = FTRACE_ADDR;
> +

With this, the check in get_ftrace_plt() is now redundant, since that's always
called with FTRACE_ADDR as the 'addr' argument. We could probably clean that
up, but I'm happy to leave that for now and handle that as a follow-up, since
I think that'll imply making some other structural changes, which qould obscure
the bulk of this patch.

>  	/*
>  	 * When the target is within range of the 'BL' instruction, use 'addr'
>  	 * as-is and branch to that directly.
>  	 */
> -	if (offset >= -SZ_128M && offset < SZ_128M)
> +	if (reachable_by_bl(*addr, pc))
>  		return true;
>  
>  	/*
> @@ -330,12 +344,24 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
>  int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
>  		       unsigned long addr)
>  {
> -	if (WARN_ON_ONCE(old_addr != (unsigned long)ftrace_caller))
> +	unsigned long pc = rec->ip;
> +	u32 old, new;
> +	int ret;
> +
> +	ret = ftrace_rec_set_ops(rec, arm64_rec_get_ops(rec));
> +	if (ret)
> +		return ret;
> +
> +	if (!ftrace_find_callable_addr(rec, NULL, &old_addr))
>  		return -EINVAL;
> -	if (WARN_ON_ONCE(addr != (unsigned long)ftrace_caller))
> +	if (!ftrace_find_callable_addr(rec, NULL, &addr))
>  		return -EINVAL;
>  
> -	return ftrace_rec_update_ops(rec);
> +	old = aarch64_insn_gen_branch_imm(pc, old_addr,
> +					  AARCH64_INSN_BRANCH_LINK);
> +	new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
> +
> +	return ftrace_modify_code(pc, old, new, true);
>  }
>  #endif

Aside from the above, this looks good to me.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 6/8] ftrace: Fix dead loop caused by direct call in ftrace selftest
  2023-02-03 12:35     ` Florent Revest
@ 2023-02-03 15:37       ` Mark Rutland
  2023-02-06 16:25         ` Florent Revest
  0 siblings, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2023-02-03 15:37 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai, Xu Kuohai, Li Huafei

On Fri, Feb 03, 2023 at 01:35:00PM +0100, Florent Revest wrote:
> On Thu, Feb 2, 2023 at 8:03 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > I think it'd be much better to do something like the below as a preparatory
> > cleanup (tested on s390 under QEMU).
> 
> Thanks, that looks great to me. I'll make it a part of the series in v2 then.
> Unless it's preferred that this gets merged separately?

I reckon put it in the series for v2, and if Steve or Masami want to pick it up
beforehand, they can choose to do so from there?

Since it's not currently exploding, I suspect it's not urgent.

Mark.

> 
> > Thanks,
> > Mark.
> >
> > ---->8----
> > From 3b3abc89fe014ea49282622c4312521b710d1463 Mon Sep 17 00:00:00 2001
> > From: Mark Rutland <mark.rutland@arm.com>
> > Date: Thu, 2 Feb 2023 18:37:40 +0000
> > Subject: [PATCH] ftrace: selftest: remove broken trace_direct_tramp
> >
> > The ftrace selftest code has a trace_direct_tramp() function which it
> > uses as a direct call trampoline. This happens to work on x86, since the
> > direct call's return address is in the usual place, and can be returned
> > to via a RET, but in general the calling convention for direct calls is
> > different from regular function calls, and requires a trampoline written
> > in assembly.
> >
> > On s390, regular function calls place the return address in %r14, and an
> > ftrace patch-site in an instrumented function places the trampoline's
> > return address (which is within the instrumented function) in %r0,
> > preserving the original %r14 value in-place. As a regular C function
> > will return to the address in %r14, using a C function as the trampoline
> > results in the trampoline returning to the caller of the instrumented
> > function, skipping the body of the instrumented function.
> >
> > Note that the s390 issue is not detcted by the ftrace selftest code, as
> > the instrumented function is trivial, and returning back into the caller
> > happens to be equivalent.
> >
> > On arm64, regular function calls place the return address in x30, and
> > an ftrace patch-site in an instrumented function saves this into r9
> > and places the trampoline's return address (within the instrumented
> > function) in x30. A regular C function will return to the address in
> > x30, but will not restore x9 into x30. Consequently, using a C function
> > as the trampoline results in returning to the trampoline's return
> > address having corrupted x30, such that when the instrumented function
> > returns, it will return back into itself.
> >
> > To avoid future issues in this area, remove the trace_direct_tramp()
> > function, and require that each architecture with direct calls provides
> > a stub trampoline, named ftrace_stub_direct_tramp. This can be written
> > to handle the architecture's trampoline calling convention, and in
> > future could be used elsewhere (e.g. in the ftrace ops sample, to
> > measure the overhead of direct calls), so we may as well always build it
> > in.
> >
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Li Huafei <lihuafei1@huawei.com>
> > Cc: Xu Kuohai <xukuohai@huawei.com>
> > Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
> > Cc: Florent Revest <revest@chromium.org>
> > ---
> >  arch/s390/kernel/mcount.S     |  5 +++++
> >  arch/x86/kernel/ftrace_32.S   |  5 +++++
> >  arch/x86/kernel/ftrace_64.S   |  4 ++++
> >  include/linux/ftrace.h        |  2 ++
> >  kernel/trace/trace_selftest.c | 15 ++-------------
> >  5 files changed, 18 insertions(+), 13 deletions(-)
> >
> > diff --git a/arch/s390/kernel/mcount.S b/arch/s390/kernel/mcount.S
> > index 4786bfe02144..ad13a0e2c307 100644
> > --- a/arch/s390/kernel/mcount.S
> > +++ b/arch/s390/kernel/mcount.S
> > @@ -32,6 +32,11 @@ ENTRY(ftrace_stub)
> >         BR_EX   %r14
> >  ENDPROC(ftrace_stub)
> >
> > +SYM_CODE_START(ftrace_stub_direct_tramp)
> > +       lgr     %r1, %r0
> > +       BR_EX   %r1
> > +SYM_CODE_END(ftrace_stub_direct_tramp)
> > +
> >         .macro  ftrace_regs_entry, allregs=0
> >         stg     %r14,(__SF_GPRS+8*8)(%r15)      # save traced function caller
> >
> > diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
> > index a0ed0e4a2c0c..0d9a14528176 100644
> > --- a/arch/x86/kernel/ftrace_32.S
> > +++ b/arch/x86/kernel/ftrace_32.S
> > @@ -163,6 +163,11 @@ SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL)
> >         jmp     .Lftrace_ret
> >  SYM_CODE_END(ftrace_regs_caller)
> >
> > +SYM_FUNC_START(ftrace_stub_direct_tramp)
> > +       CALL_DEPTH_ACCOUNT
> > +       RET
> > +SYM_FUNC_END(ftrace_stub_direct_tramp)
> > +
> >  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> >  SYM_CODE_START(ftrace_graph_caller)
> >         pushl   %eax
> > diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
> > index 1265ad519249..8fc77e3e039c 100644
> > --- a/arch/x86/kernel/ftrace_64.S
> > +++ b/arch/x86/kernel/ftrace_64.S
> > @@ -307,6 +307,10 @@ SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL)
> >  SYM_FUNC_END(ftrace_regs_caller)
> >  STACK_FRAME_NON_STANDARD_FP(ftrace_regs_caller)
> >
> > +SYM_FUNC_START(ftrace_stub_direct_tramp)
> > +       CALL_DEPTH_ACCOUNT
> > +       RET
> > +SYM_FUNC_END(ftrace_stub_direct_tramp)
> >
> >  #else /* ! CONFIG_DYNAMIC_FTRACE */
> >
> > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > index 3d2156e335d7..a9836b40630e 100644
> > --- a/include/linux/ftrace.h
> > +++ b/include/linux/ftrace.h
> > @@ -412,6 +412,8 @@ int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
> >  int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
> >  int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr);
> >
> > +void ftrace_stub_direct_tramp(void *);
> > +
> >  #else
> >  struct ftrace_ops;
> >  # define ftrace_direct_func_count 0
> > diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
> > index 06218fc9374b..e6530b7b42e4 100644
> > --- a/kernel/trace/trace_selftest.c
> > +++ b/kernel/trace/trace_selftest.c
> > @@ -784,17 +784,6 @@ static struct fgraph_ops fgraph_ops __initdata  = {
> >         .retfunc                = &trace_graph_return,
> >  };
> >
> > -#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> > -#ifndef CALL_DEPTH_ACCOUNT
> > -#define CALL_DEPTH_ACCOUNT ""
> > -#endif
> > -
> > -noinline __noclone static void trace_direct_tramp(void)
> > -{
> > -       asm(CALL_DEPTH_ACCOUNT);
> > -}
> > -#endif
> > -
> >  /*
> >   * Pretty much the same than for the function tracer from which the selftest
> >   * has been borrowed.
> > @@ -875,7 +864,7 @@ trace_selftest_startup_function_graph(struct tracer *trace,
> >          */
> >         ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
> >         ret = register_ftrace_direct(&direct,
> > -                                    (unsigned long)trace_direct_tramp);
> > +                                    (unsigned long)ftrace_stub_direct_tramp);
> >         if (ret)
> >                 goto out;
> >
> > @@ -896,7 +885,7 @@ trace_selftest_startup_function_graph(struct tracer *trace,
> >         unregister_ftrace_graph(&fgraph_ops);
> >
> >         ret = unregister_ftrace_direct(&direct,
> > -                                      (unsigned long)trace_direct_tramp);
> > +                                      (unsigned long)ftrace_stub_direct_tramp);
> >         if (ret)
> >                 goto out;
> >
> > --
> > 2.30.2
> >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 6/8] ftrace: Fix dead loop caused by direct call in ftrace selftest
  2023-02-03 15:37       ` Mark Rutland
@ 2023-02-06 16:25         ` Florent Revest
  0 siblings, 0 replies; 35+ messages in thread
From: Florent Revest @ 2023-02-06 16:25 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai, Xu Kuohai, Li Huafei

On Fri, Feb 3, 2023 at 4:37 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Fri, Feb 03, 2023 at 01:35:00PM +0100, Florent Revest wrote:
> > On Thu, Feb 2, 2023 at 8:03 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > I think it'd be much better to do something like the below as a preparatory
> > > cleanup (tested on s390 under QEMU).
> >
> > Thanks, that looks great to me. I'll make it a part of the series in v2 then.
> > Unless it's preferred that this gets merged separately?
>
> I reckon put it in the series for v2, and if Steve or Masami want to pick it up
> beforehand, they can choose to do so from there?
>
> Since it's not currently exploding, I suspect it's not urgent.
>
> Mark.

Ack

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 7/8] arm64: ftrace: Add direct call support
  2023-02-03 15:34   ` Mark Rutland
@ 2023-02-06 16:25     ` Florent Revest
  0 siblings, 0 replies; 35+ messages in thread
From: Florent Revest @ 2023-02-06 16:25 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Fri, Feb 3, 2023 at 4:34 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Wed, Feb 01, 2023 at 05:34:19PM +0100, Florent Revest wrote:
> > This builds up on the CALL_OPS work which extends the ftrace patchsite
> > on arm64 with an ops pointer usable by the ftrace trampoline.
> >
> > This ops pointer is valid at all time. Indeed, it is either pointing to
> > ftrace_list_ops or to the single ops which should be called from that
> > patchsite.
> >
> > There are a few cases to distinguish:
> > - If a direct call ops is the only one tracing a function:
> >   - If the direct called trampoline is within the reach of a BL
> >     instruction
> >      -> the ftrace patchsite jumps to the trampoline
> >   - Else
> >      -> the ftrace patchsite jumps to the ftrace_caller trampoline which
> >         reads the ops pointer in the patchsite and jumps to the direct
> >         call address stored in the ops
> > - Else
> >   -> the ftrace patchsite jumps to the ftrace_caller trampoline and its
> >      ops literal points to ftrace_list_ops so it iterates over all
> >      registered ftrace ops, including the direct call ops and calls its
> >      call_direct_funcs handler which stores the direct called
> >      trampoline's address in the ftrace_regs and the ftrace_caller
> >      trampoline will return to that address instead of returning to the
> >      traced function
>
> This looks pretty good!
>
> Overall I think this is the right shape, but I have a few minor comments that
> lead to a bit of churn. I've noted those below, and I've also pushed out a
> branch with suggested fixups (as discrete patches) to my
> arm64/ftrace/direct-call-testing branch, which you can find at:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git
>   git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git
>
> Note that's based on a merge of the arm64 tree's for-next/ftrace branch and the
> trace tree's trace/for-next branch, and there were a couple of trivial
> conflicts I had to fix up when first picking this series (which I've noted in
> the affected patches)
>
> Those trees are at:
>
>   # arm64
>   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
>   git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
>
>   # trace
>   https://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
>   git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
>

Thanks for taking the time to publish these, that helps a lot.

> > @@ -80,6 +80,10 @@ struct ftrace_regs {
> >
> >       unsigned long sp;
> >       unsigned long pc;
> > +
> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> > +     unsigned long custom_tramp;
>
> Minor nit, but could we please s/custom_tramp/direct_tramp/?

Yes, that's better

> I forgot to add a comment here, but we require that sizeof(struct ftrace_regs)
> is a multiple of 16, as the AAPCS requires that the stack is 16-byte aligned at
> function call boundaries (and on arm64 we currently assume that it's *always*
> aligned).
>
> I'd suggest we make this:
>
> | /*
> |  * Note: sizeof(struct ftrace_regs) must be a multiple of 16 to ensure correct
> |  * stack alignment
> |  */
> | struct ftrace_regs {
> |       /* x0 - x8 */
> |       unsigned long regs[9];
> |
> | #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> |       unsigned long direct_tramp;
> | #else
> |       unsigned long __unused;
> | #endif
> |
> |       unsigned long fp;
> |       unsigned long lr;
> |
> |       unsigned long sp;
> |       unsigned long pc;
> | };

Oh good catch, that was easy to miss. I'm surprised I didn't hit
horrible bugs when testing this in qemu.

> For branching to the direct call trampoline, it would be better to use a
> forward branch (i.e. BR), as that won't unbalance return stack predictors.
> That'll need to use x16 or x17 to be compatible with a `BTI C` landing pad.
>
> I'd also prefer if we could move the direct call invocation to a stub at the
> end of ftrace_caller, e.g.
>
> | SYM_CODE_START(ftrace_caller)
> |       ... discover ops pointer here ...
> |
> |       ldr     w17, [x11, #FTRACE_OPS_DIRECT_CALL]
> |       cbnz    ftrace_caller_direct
> |
> |       ... usual save and handling logic here ...
> |
> |       // just prior to return
> |       ldr     x17, [sp, #FREGS_DIRECT_CALL]
> |       cbnz    ftrace_caller_direct_late
> |
> |       ...  usual restore logic here ...
> |
> |       ret     x9
> |
> | SYM_INNER_LABEL(ftrace_caller_direct_late, SYM_L_LOCAL)
> |
> |       ... shuffle registers for the trampoline here ...
> |
> |       // fallthrough to the usual invocation
> | SYM_INNER_LABEL(ftrace_caller_direct, SYM_L_LOCAL)
> |       br      x17
> | SYM_CODE_END(ftrace_caller)

Agreed, it makes things a lot easier to read and your branch makes
restoring LR and PC more sane too. I squashed your change in and will
send it as part of v2.

> > @@ -220,14 +227,21 @@ static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
> >                                     unsigned long *addr)
> >  {
> >       unsigned long pc = rec->ip;
> > -     long offset = (long)*addr - (long)pc;
> >       struct plt_entry *plt;
> >
> > +     /*
> > +      * If a custom trampoline is unreachable, rely on the ftrace_caller
> > +      * trampoline which knows how to indirectly reach that trampoline
> > +      * through ops->direct_call.
> > +      */
> > +     if (*addr != FTRACE_ADDR && !reachable_by_bl(*addr, pc))
> > +             *addr = FTRACE_ADDR;
> > +
>
> With this, the check in get_ftrace_plt() is now redundant, since that's always
> called with FTRACE_ADDR as the 'addr' argument. We could probably clean that
> up, but I'm happy to leave that for now and handle that as a follow-up, since
> I think that'll imply making some other structural changes, which qould obscure
> the bulk of this patch.

Fair enough, I'll add an extra patch for this in v2. It's fairly
simple and if other maintainers think it's too much for the scope of
the series, we can always pick it up later and merge it separately.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  2023-02-02 17:37     ` Florent Revest
@ 2023-02-07 15:21       ` Florent Revest
  2023-02-07 15:35         ` Steven Rostedt
  0 siblings, 1 reply; 35+ messages in thread
From: Florent Revest @ 2023-02-07 15:21 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Thu, Feb 2, 2023 at 6:37 PM Florent Revest <revest@chromium.org> wrote:
>
> On Thu, Feb 2, 2023 at 4:02 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > Looking at samples/ftrace/, as of this patch we have a few samples that are
> > almost identical, modulo the function being traced, and some different register
> > shuffling for arguments:
> >
> > * ftrace-direct.c and ftrace-direct-multi.c
> > * ftrace-direct-modify.c and ftrace-direct-modify
> >
> > ... perhaps it would be better to just delete the !multi versions ?
>
> The multi versions hook two functions and the !multi hook just one but
> I agree that this granularity in coverage is probably just a
> maintenance burden and doesn't help with much! :)
> I'll delete the !multi in v2, as part of the patch 2 and patch 1 will
> just migrate the selftest to use the multi API.

Actually, I'm not sure anymore if we should delete the !multi samples...

I realized that they are also used as part of the ftrace selftests in:
- tools/testing/selftests/ftrace/test.d/direct/ftrace-direct.tc
- tools/testing/selftests/ftrace/test.d/direct/kprobe-direct.tc

It does not really make sense to use the ftrace-direct-muti sample as
a drop-in replacement for the ftrace-direct sample there since they
don't really do the same thing so we would either need to change the
test a bit or the multi sample.
Also, we would still need to adapt the ftrace-direct-too sample since
it has no multi equivalent and is required there.

It's certainly doable but now it feels to me like going one step too
far with the refactoring within the scope of this series. Do you think
it's worth it ? I have 70% of that work done already so I'm happy to
finish it but I thought I should check back before sending a v2 that's
more complex than anticipated.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  2023-02-07 15:21       ` Florent Revest
@ 2023-02-07 15:35         ` Steven Rostedt
  2023-02-07 16:19           ` Florent Revest
  0 siblings, 1 reply; 35+ messages in thread
From: Steven Rostedt @ 2023-02-07 15:35 UTC (permalink / raw)
  To: Florent Revest
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, linux-trace-kernel,
	bpf, catalin.marinas, will, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Tue, 7 Feb 2023 16:21:49 +0100
Florent Revest <revest@chromium.org> wrote:

> Actually, I'm not sure anymore if we should delete the !multi samples...
> 
> I realized that they are also used as part of the ftrace selftests in:
> - tools/testing/selftests/ftrace/test.d/direct/ftrace-direct.tc
> - tools/testing/selftests/ftrace/test.d/direct/kprobe-direct.tc
> 
> It does not really make sense to use the ftrace-direct-muti sample as
> a drop-in replacement for the ftrace-direct sample there since they
> don't really do the same thing so we would either need to change the
> test a bit or the multi sample.
> Also, we would still need to adapt the ftrace-direct-too sample since
> it has no multi equivalent and is required there.

Let's not delete the samples, and they do test slightly different use cases
(although the code may be somewhat the same). I rather still keep that test
coverage.

-- Steve

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  2023-02-07 15:35         ` Steven Rostedt
@ 2023-02-07 16:19           ` Florent Revest
  0 siblings, 0 replies; 35+ messages in thread
From: Florent Revest @ 2023-02-07 16:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, linux-trace-kernel,
	bpf, catalin.marinas, will, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai

On Tue, Feb 7, 2023 at 4:35 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue, 7 Feb 2023 16:21:49 +0100
> Florent Revest <revest@chromium.org> wrote:
>
> > Actually, I'm not sure anymore if we should delete the !multi samples...
> >
> > I realized that they are also used as part of the ftrace selftests in:
> > - tools/testing/selftests/ftrace/test.d/direct/ftrace-direct.tc
> > - tools/testing/selftests/ftrace/test.d/direct/kprobe-direct.tc
> >
> > It does not really make sense to use the ftrace-direct-muti sample as
> > a drop-in replacement for the ftrace-direct sample there since they
> > don't really do the same thing so we would either need to change the
> > test a bit or the multi sample.
> > Also, we would still need to adapt the ftrace-direct-too sample since
> > it has no multi equivalent and is required there.
>
> Let's not delete the samples, and they do test slightly different use cases
> (although the code may be somewhat the same). I rather still keep that test
> coverage.
>
> -- Steve

Ack :) Thanks Steve!

I'll undo the work I've started to do on this and send a v2 shortly afterward.

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2023-02-07 16:19 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
2023-02-01 16:34 ` [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi Florent Revest
2023-02-02 15:01   ` Mark Rutland
2023-02-02 17:37     ` Florent Revest
2023-02-07 15:21       ` Florent Revest
2023-02-07 15:35         ` Steven Rostedt
2023-02-07 16:19           ` Florent Revest
2023-02-01 16:34 ` [PATCH 2/8] ftrace: Remove the legacy _ftrace_direct API Florent Revest
2023-02-02 15:11   ` Mark Rutland
2023-02-01 16:34 ` [PATCH 3/8] ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs Florent Revest
2023-02-02 15:17   ` Mark Rutland
2023-02-01 16:34 ` [PATCH 4/8] ftrace: Store direct called addresses in their ops Florent Revest
2023-02-02 15:29   ` Mark Rutland
2023-02-02 17:41     ` Florent Revest
2023-02-01 16:34 ` [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS Florent Revest
2023-02-02 15:54   ` Mark Rutland
2023-02-02 16:56     ` Mark Rutland
2023-02-02 18:19       ` Florent Revest
2023-02-03 10:03         ` Mark Rutland
2023-02-03 11:01           ` Florent Revest
2023-02-02 18:18     ` Florent Revest
2023-02-01 16:34 ` [PATCH 6/8] ftrace: Fix dead loop caused by direct call in ftrace selftest Florent Revest
2023-02-02 19:03   ` Mark Rutland
2023-02-03 12:35     ` Florent Revest
2023-02-03 15:37       ` Mark Rutland
2023-02-06 16:25         ` Florent Revest
2023-02-01 16:34 ` [PATCH 7/8] arm64: ftrace: Add direct call support Florent Revest
2023-02-03 15:34   ` Mark Rutland
2023-02-06 16:25     ` Florent Revest
2023-02-01 16:34 ` [PATCH 8/8] arm64: ftrace: Add direct called trampoline samples support Florent Revest
2023-02-02  8:36 ` [PATCH 0/8] Add ftrace direct call for arm64 Xu Kuohai
2023-02-02 10:50   ` Daniel Borkmann
2023-02-02 17:32     ` Florent Revest
2023-02-02 20:06 ` Steven Rostedt
2023-02-03  9:49   ` Mark Rutland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).