bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/10] Add ftrace direct call for arm64
@ 2023-02-07 18:21 Florent Revest
  2023-02-07 18:21 ` [PATCH v2 01/10] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi Florent Revest
                   ` (9 more replies)
  0 siblings, 10 replies; 19+ messages in thread
From: Florent Revest @ 2023-02-07 18:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

This series adds ftrace direct call support to arm64.
This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.

It is meant to apply on top of the arm64 tree which contains Mark Rutland's
series on CALL_OPS [1] under the for-next/ftrace tag.
  https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/
  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git

The first three patches consolidate the two existing ftrace APIs for registering
direct calls. They are split to make the reviewers lives easier but if it'd be a
preferred style, I'd be happy to squash them in the next revision.
Currently, there is both a _ftrace_direct and _ftrace_direct_multi API. Apart
from samples and selftests, there are no users of the _ftrace_direct API left
in-tree so this deletes it and renames the _ftrace_direct_multi API to
_ftrace_direct for simplicity.

The main benefit of this refactoring is that, with the API that's left, an
ftrace_ops backing a direct call will only ever point to one direct call. We can
therefore store the direct called trampoline address in the ops (patch 4) and
look it up from the ftrace trampoline on arm64 (patch 7) in the case when the
destination would be out of reach of a BL instruction at the ftrace callsite.
(in this case, ftrace_caller acts as a lightweight intermediary trampoline)

This series has been tested on both arm64 and x86_64 with:
1- CONFIG_FTRACE_SELFTEST (cf: patch 6)
2- samples/ftrace/*.ko (cf: patch 9)
3- tools/testing/selftests/bpf/test_progs (cf: patch 10)

Changes since v1 [2]:
- Updated the bpf selftests denylist according to newly passing tests
- Refactored the ftrace_caller assembly according to Mark's feedback
- Replaced Xu's stub trampoline patch for selftests with Mark's take on this
- Fixed direct calls on arch WITH_REGS=y and WITH_ARGS=n (x86 32-bit)
- Fixed the ftrace_regs stack alignment
- Simplified get_ftrace_plt() (cf: patch 8)
- Fixed a possible race when writing ops->direct_call
- Renamed "custom_tramp" to "direct_tramp"
- Referenced the commit id when mentioning a previous commit
- Linked the arm64 tree in the cover letter

This followed up on prior series by Xu Kuohai [3] and a RFC by me [4].

1: https://lore.kernel.org/all/20230123134603.1064407-1-mark.rutland@arm.com/
2: https://lore.kernel.org/all/20230201163420.1579014-1-revest@chromium.org/
3: https://lore.kernel.org/all/20220913162732.163631-1-xukuohai@huaweicloud.com/
4: https://lore.kernel.org/all/20221108220651.24492-1-revest@chromium.org/

Florent Revest (9):
  ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  ftrace: Remove the legacy _ftrace_direct API
  ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
  ftrace: Store direct called addresses in their ops
  ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  arm64: ftrace: Add direct call support
  arm64: ftrace: Simplify get_ftrace_plt
  arm64: ftrace: Add direct call trampoline samples support
  selftests/bpf: Update the tests deny list on aarch64

Mark Rutland (1):
  ftrace: selftest: remove broken trace_direct_tramp

 arch/arm64/Kconfig                           |   4 +
 arch/arm64/include/asm/ftrace.h              |  22 +
 arch/arm64/kernel/asm-offsets.c              |   6 +
 arch/arm64/kernel/entry-ftrace.S             |  90 +++-
 arch/arm64/kernel/ftrace.c                   |  46 +-
 arch/s390/kernel/mcount.S                    |   5 +
 arch/x86/kernel/ftrace_32.S                  |   5 +
 arch/x86/kernel/ftrace_64.S                  |   4 +
 include/linux/ftrace.h                       |  59 +--
 kernel/bpf/trampoline.c                      |  14 +-
 kernel/trace/Kconfig                         |   2 +-
 kernel/trace/ftrace.c                        | 433 +------------------
 kernel/trace/trace_selftest.c                |  23 +-
 samples/Kconfig                              |   2 +-
 samples/ftrace/ftrace-direct-modify.c        |  42 +-
 samples/ftrace/ftrace-direct-multi-modify.c  |  44 +-
 samples/ftrace/ftrace-direct-multi.c         |  26 +-
 samples/ftrace/ftrace-direct-too.c           |  35 +-
 samples/ftrace/ftrace-direct.c               |  33 +-
 tools/testing/selftests/bpf/DENYLIST.aarch64 |  82 +---
 20 files changed, 382 insertions(+), 595 deletions(-)

-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 01/10] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  2023-02-07 18:21 [PATCH v2 00/10] Add ftrace direct call for arm64 Florent Revest
@ 2023-02-07 18:21 ` Florent Revest
  2023-03-15 23:33   ` Steven Rostedt
  2023-02-07 18:21 ` [PATCH v2 02/10] ftrace: Remove the legacy _ftrace_direct API Florent Revest
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Florent Revest @ 2023-02-07 18:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

The _multi API requires that users keep their own ops but can enforce
that an op is only associated to one direct call.

Signed-off-by: Florent Revest <revest@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
---
 kernel/trace/trace_selftest.c         | 11 +++++++----
 samples/ftrace/ftrace-direct-modify.c | 12 ++++++++----
 samples/ftrace/ftrace-direct-too.c    | 12 +++++++-----
 samples/ftrace/ftrace-direct.c        | 12 +++++++-----
 4 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index ff0536cea968..57221f69a33b 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -806,6 +806,9 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 	int ret;
 	unsigned long count;
 	char *func_name __maybe_unused;
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	struct ftrace_ops direct = {};
+#endif
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 	if (ftrace_filter_param) {
@@ -870,8 +873,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 	 * Register direct function together with graph tracer
 	 * and make sure we get graph trace.
 	 */
-	ret = register_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
-				     (unsigned long) trace_direct_tramp);
+	ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
+	ret = register_ftrace_direct_multi(&direct, (unsigned long)trace_direct_tramp);
 	if (ret)
 		goto out;
 
@@ -891,8 +894,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 
 	unregister_ftrace_graph(&fgraph_ops);
 
-	ret = unregister_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
-				       (unsigned long) trace_direct_tramp);
+	ret = unregister_ftrace_direct_multi(&direct,
+					     (unsigned long) trace_direct_tramp);
 	if (ret)
 		goto out;
 
diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
index de5a0f67f320..ecd76f75cb80 100644
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -96,6 +96,8 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+static struct ftrace_ops direct;
+
 static unsigned long my_tramp = (unsigned long)my_tramp1;
 static unsigned long tramps[2] = {
 	(unsigned long)my_tramp1,
@@ -114,7 +116,7 @@ static int simple_thread(void *arg)
 		if (ret)
 			continue;
 		t ^= 1;
-		ret = modify_ftrace_direct(my_ip, my_tramp, tramps[t]);
+		ret = modify_ftrace_direct_multi(&direct, tramps[t]);
 		if (!ret)
 			my_tramp = tramps[t];
 		WARN_ON_ONCE(ret);
@@ -129,7 +131,9 @@ static int __init ftrace_direct_init(void)
 {
 	int ret;
 
-	ret = register_ftrace_direct(my_ip, my_tramp);
+	ftrace_set_filter_ip(&direct, (unsigned long) my_ip, 0, 0);
+	ret = register_ftrace_direct_multi(&direct, my_tramp);
+
 	if (!ret)
 		simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
 	return ret;
@@ -138,12 +142,12 @@ static int __init ftrace_direct_init(void)
 static void __exit ftrace_direct_exit(void)
 {
 	kthread_stop(simple_tsk);
-	unregister_ftrace_direct(my_ip, my_tramp);
+	unregister_ftrace_direct_multi(&direct, my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
+MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
 MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index e13fb59a2b47..0e907092e2c0 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -70,21 +70,23 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+static struct ftrace_ops direct;
+
 static int __init ftrace_direct_init(void)
 {
-	return register_ftrace_direct((unsigned long)handle_mm_fault,
-				     (unsigned long)my_tramp);
+	ftrace_set_filter_ip(&direct, (unsigned long) handle_mm_fault, 0, 0);
+
+	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
 }
 
 static void __exit ftrace_direct_exit(void)
 {
-	unregister_ftrace_direct((unsigned long)handle_mm_fault,
-				 (unsigned long)my_tramp);
+	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct()");
+MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct_multi()");
 MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
index 1f769d0db20f..e446c38f6b58 100644
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -63,21 +63,23 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+static struct ftrace_ops direct;
+
 static int __init ftrace_direct_init(void)
 {
-	return register_ftrace_direct((unsigned long)wake_up_process,
-				     (unsigned long)my_tramp);
+	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
+
+	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
 }
 
 static void __exit ftrace_direct_exit(void)
 {
-	unregister_ftrace_direct((unsigned long)wake_up_process,
-				 (unsigned long)my_tramp);
+	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Example use case of using register_ftrace_direct()");
+MODULE_DESCRIPTION("Example use case of using register_ftrace_direct_multi()");
 MODULE_LICENSE("GPL");
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 02/10] ftrace: Remove the legacy _ftrace_direct API
  2023-02-07 18:21 [PATCH v2 00/10] Add ftrace direct call for arm64 Florent Revest
  2023-02-07 18:21 ` [PATCH v2 01/10] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi Florent Revest
@ 2023-02-07 18:21 ` Florent Revest
  2023-02-07 18:21 ` [PATCH v2 03/10] ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs Florent Revest
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Florent Revest @ 2023-02-07 18:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

This API relies on a single global ops, used for all direct calls
registered with it. However, to implement arm64 direct calls, we need
each ops to point to a single direct call trampoline.

Signed-off-by: Florent Revest <revest@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
---
 include/linux/ftrace.h |  32 ----
 kernel/trace/ftrace.c  | 393 -----------------------------------------
 2 files changed, 425 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 366c730beaa3..2d7c85e47c2d 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -397,14 +397,6 @@ struct ftrace_func_entry {
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 extern int ftrace_direct_func_count;
-int register_ftrace_direct(unsigned long ip, unsigned long addr);
-int unregister_ftrace_direct(unsigned long ip, unsigned long addr);
-int modify_ftrace_direct(unsigned long ip, unsigned long old_addr, unsigned long new_addr);
-struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr);
-int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
-				struct dyn_ftrace *rec,
-				unsigned long old_addr,
-				unsigned long new_addr);
 unsigned long ftrace_find_rec_direct(unsigned long ip);
 int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
 int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
@@ -414,30 +406,6 @@ int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr
 #else
 struct ftrace_ops;
 # define ftrace_direct_func_count 0
-static inline int register_ftrace_direct(unsigned long ip, unsigned long addr)
-{
-	return -ENOTSUPP;
-}
-static inline int unregister_ftrace_direct(unsigned long ip, unsigned long addr)
-{
-	return -ENOTSUPP;
-}
-static inline int modify_ftrace_direct(unsigned long ip,
-				       unsigned long old_addr, unsigned long new_addr)
-{
-	return -ENOTSUPP;
-}
-static inline struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr)
-{
-	return NULL;
-}
-static inline int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
-					      struct dyn_ftrace *rec,
-					      unsigned long old_addr,
-					      unsigned long new_addr)
-{
-	return -ENODEV;
-}
 static inline unsigned long ftrace_find_rec_direct(unsigned long ip)
 {
 	return 0;
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index e634b80f49d1..5efe201428fa 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2585,20 +2585,6 @@ static void call_direct_funcs(unsigned long ip, unsigned long pip,
 
 	arch_ftrace_set_direct_caller(fregs, addr);
 }
-
-struct ftrace_ops direct_ops = {
-	.func		= call_direct_funcs,
-	.flags		= FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS
-			  | FTRACE_OPS_FL_PERMANENT,
-	/*
-	 * By declaring the main trampoline as this trampoline
-	 * it will never have one allocated for it. Allocated
-	 * trampolines should not call direct functions.
-	 * The direct_ops should only be called by the builtin
-	 * ftrace_regs_caller trampoline.
-	 */
-	.trampoline	= FTRACE_REGS_ADDR,
-};
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 /**
@@ -5295,387 +5281,8 @@ struct ftrace_direct_func {
 
 static LIST_HEAD(ftrace_direct_funcs);
 
-/**
- * ftrace_find_direct_func - test an address if it is a registered direct caller
- * @addr: The address of a registered direct caller
- *
- * This searches to see if a ftrace direct caller has been registered
- * at a specific address, and if so, it returns a descriptor for it.
- *
- * This can be used by architecture code to see if an address is
- * a direct caller (trampoline) attached to a fentry/mcount location.
- * This is useful for the function_graph tracer, as it may need to
- * do adjustments if it traced a location that also has a direct
- * trampoline attached to it.
- */
-struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr)
-{
-	struct ftrace_direct_func *entry;
-	bool found = false;
-
-	/* May be called by fgraph trampoline (protected by rcu tasks) */
-	list_for_each_entry_rcu(entry, &ftrace_direct_funcs, next) {
-		if (entry->addr == addr) {
-			found = true;
-			break;
-		}
-	}
-	if (found)
-		return entry;
-
-	return NULL;
-}
-
-static struct ftrace_direct_func *ftrace_alloc_direct_func(unsigned long addr)
-{
-	struct ftrace_direct_func *direct;
-
-	direct = kmalloc(sizeof(*direct), GFP_KERNEL);
-	if (!direct)
-		return NULL;
-	direct->addr = addr;
-	direct->count = 0;
-	list_add_rcu(&direct->next, &ftrace_direct_funcs);
-	ftrace_direct_func_count++;
-	return direct;
-}
-
 static int register_ftrace_function_nolock(struct ftrace_ops *ops);
 
-/**
- * register_ftrace_direct - Call a custom trampoline directly
- * @ip: The address of the nop at the beginning of a function
- * @addr: The address of the trampoline to call at @ip
- *
- * This is used to connect a direct call from the nop location (@ip)
- * at the start of ftrace traced functions. The location that it calls
- * (@addr) must be able to handle a direct call, and save the parameters
- * of the function being traced, and restore them (or inject new ones
- * if needed), before returning.
- *
- * Returns:
- *  0 on success
- *  -EBUSY - Another direct function is already attached (there can be only one)
- *  -ENODEV - @ip does not point to a ftrace nop location (or not supported)
- *  -ENOMEM - There was an allocation failure.
- */
-int register_ftrace_direct(unsigned long ip, unsigned long addr)
-{
-	struct ftrace_direct_func *direct;
-	struct ftrace_func_entry *entry;
-	struct ftrace_hash *free_hash = NULL;
-	struct dyn_ftrace *rec;
-	int ret = -ENODEV;
-
-	mutex_lock(&direct_mutex);
-
-	ip = ftrace_location(ip);
-	if (!ip)
-		goto out_unlock;
-
-	/* See if there's a direct function at @ip already */
-	ret = -EBUSY;
-	if (ftrace_find_rec_direct(ip))
-		goto out_unlock;
-
-	ret = -ENODEV;
-	rec = lookup_rec(ip, ip);
-	if (!rec)
-		goto out_unlock;
-
-	/*
-	 * Check if the rec says it has a direct call but we didn't
-	 * find one earlier?
-	 */
-	if (WARN_ON(rec->flags & FTRACE_FL_DIRECT))
-		goto out_unlock;
-
-	/* Make sure the ip points to the exact record */
-	if (ip != rec->ip) {
-		ip = rec->ip;
-		/* Need to check this ip for a direct. */
-		if (ftrace_find_rec_direct(ip))
-			goto out_unlock;
-	}
-
-	ret = -ENOMEM;
-	direct = ftrace_find_direct_func(addr);
-	if (!direct) {
-		direct = ftrace_alloc_direct_func(addr);
-		if (!direct)
-			goto out_unlock;
-	}
-
-	entry = ftrace_add_rec_direct(ip, addr, &free_hash);
-	if (!entry)
-		goto out_unlock;
-
-	ret = ftrace_set_filter_ip(&direct_ops, ip, 0, 0);
-
-	if (!ret && !(direct_ops.flags & FTRACE_OPS_FL_ENABLED)) {
-		ret = register_ftrace_function_nolock(&direct_ops);
-		if (ret)
-			ftrace_set_filter_ip(&direct_ops, ip, 1, 0);
-	}
-
-	if (ret) {
-		remove_hash_entry(direct_functions, entry);
-		kfree(entry);
-		if (!direct->count) {
-			list_del_rcu(&direct->next);
-			synchronize_rcu_tasks();
-			kfree(direct);
-			if (free_hash)
-				free_ftrace_hash(free_hash);
-			free_hash = NULL;
-			ftrace_direct_func_count--;
-		}
-	} else {
-		direct->count++;
-	}
- out_unlock:
-	mutex_unlock(&direct_mutex);
-
-	if (free_hash) {
-		synchronize_rcu_tasks();
-		free_ftrace_hash(free_hash);
-	}
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(register_ftrace_direct);
-
-static struct ftrace_func_entry *find_direct_entry(unsigned long *ip,
-						   struct dyn_ftrace **recp)
-{
-	struct ftrace_func_entry *entry;
-	struct dyn_ftrace *rec;
-
-	rec = lookup_rec(*ip, *ip);
-	if (!rec)
-		return NULL;
-
-	entry = __ftrace_lookup_ip(direct_functions, rec->ip);
-	if (!entry) {
-		WARN_ON(rec->flags & FTRACE_FL_DIRECT);
-		return NULL;
-	}
-
-	WARN_ON(!(rec->flags & FTRACE_FL_DIRECT));
-
-	/* Passed in ip just needs to be on the call site */
-	*ip = rec->ip;
-
-	if (recp)
-		*recp = rec;
-
-	return entry;
-}
-
-int unregister_ftrace_direct(unsigned long ip, unsigned long addr)
-{
-	struct ftrace_direct_func *direct;
-	struct ftrace_func_entry *entry;
-	struct ftrace_hash *hash;
-	int ret = -ENODEV;
-
-	mutex_lock(&direct_mutex);
-
-	ip = ftrace_location(ip);
-	if (!ip)
-		goto out_unlock;
-
-	entry = find_direct_entry(&ip, NULL);
-	if (!entry)
-		goto out_unlock;
-
-	hash = direct_ops.func_hash->filter_hash;
-	if (hash->count == 1)
-		unregister_ftrace_function(&direct_ops);
-
-	ret = ftrace_set_filter_ip(&direct_ops, ip, 1, 0);
-
-	WARN_ON(ret);
-
-	remove_hash_entry(direct_functions, entry);
-
-	direct = ftrace_find_direct_func(addr);
-	if (!WARN_ON(!direct)) {
-		/* This is the good path (see the ! before WARN) */
-		direct->count--;
-		WARN_ON(direct->count < 0);
-		if (!direct->count) {
-			list_del_rcu(&direct->next);
-			synchronize_rcu_tasks();
-			kfree(direct);
-			kfree(entry);
-			ftrace_direct_func_count--;
-		}
-	}
- out_unlock:
-	mutex_unlock(&direct_mutex);
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(unregister_ftrace_direct);
-
-static struct ftrace_ops stub_ops = {
-	.func		= ftrace_stub,
-};
-
-/**
- * ftrace_modify_direct_caller - modify ftrace nop directly
- * @entry: The ftrace hash entry of the direct helper for @rec
- * @rec: The record representing the function site to patch
- * @old_addr: The location that the site at @rec->ip currently calls
- * @new_addr: The location that the site at @rec->ip should call
- *
- * An architecture may overwrite this function to optimize the
- * changing of the direct callback on an ftrace nop location.
- * This is called with the ftrace_lock mutex held, and no other
- * ftrace callbacks are on the associated record (@rec). Thus,
- * it is safe to modify the ftrace record, where it should be
- * currently calling @old_addr directly, to call @new_addr.
- *
- * This is called with direct_mutex locked.
- *
- * Safety checks should be made to make sure that the code at
- * @rec->ip is currently calling @old_addr. And this must
- * also update entry->direct to @new_addr.
- */
-int __weak ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
-				       struct dyn_ftrace *rec,
-				       unsigned long old_addr,
-				       unsigned long new_addr)
-{
-	unsigned long ip = rec->ip;
-	int ret;
-
-	lockdep_assert_held(&direct_mutex);
-
-	/*
-	 * The ftrace_lock was used to determine if the record
-	 * had more than one registered user to it. If it did,
-	 * we needed to prevent that from changing to do the quick
-	 * switch. But if it did not (only a direct caller was attached)
-	 * then this function is called. But this function can deal
-	 * with attached callers to the rec that we care about, and
-	 * since this function uses standard ftrace calls that take
-	 * the ftrace_lock mutex, we need to release it.
-	 */
-	mutex_unlock(&ftrace_lock);
-
-	/*
-	 * By setting a stub function at the same address, we force
-	 * the code to call the iterator and the direct_ops helper.
-	 * This means that @ip does not call the direct call, and
-	 * we can simply modify it.
-	 */
-	ret = ftrace_set_filter_ip(&stub_ops, ip, 0, 0);
-	if (ret)
-		goto out_lock;
-
-	ret = register_ftrace_function_nolock(&stub_ops);
-	if (ret) {
-		ftrace_set_filter_ip(&stub_ops, ip, 1, 0);
-		goto out_lock;
-	}
-
-	entry->direct = new_addr;
-
-	/*
-	 * By removing the stub, we put back the direct call, calling
-	 * the @new_addr.
-	 */
-	unregister_ftrace_function(&stub_ops);
-	ftrace_set_filter_ip(&stub_ops, ip, 1, 0);
-
- out_lock:
-	mutex_lock(&ftrace_lock);
-
-	return ret;
-}
-
-/**
- * modify_ftrace_direct - Modify an existing direct call to call something else
- * @ip: The instruction pointer to modify
- * @old_addr: The address that the current @ip calls directly
- * @new_addr: The address that the @ip should call
- *
- * This modifies a ftrace direct caller at an instruction pointer without
- * having to disable it first. The direct call will switch over to the
- * @new_addr without missing anything.
- *
- * Returns: zero on success. Non zero on error, which includes:
- *  -ENODEV : the @ip given has no direct caller attached
- *  -EINVAL : the @old_addr does not match the current direct caller
- */
-int modify_ftrace_direct(unsigned long ip,
-			 unsigned long old_addr, unsigned long new_addr)
-{
-	struct ftrace_direct_func *direct, *new_direct = NULL;
-	struct ftrace_func_entry *entry;
-	struct dyn_ftrace *rec;
-	int ret = -ENODEV;
-
-	mutex_lock(&direct_mutex);
-
-	mutex_lock(&ftrace_lock);
-
-	ip = ftrace_location(ip);
-	if (!ip)
-		goto out_unlock;
-
-	entry = find_direct_entry(&ip, &rec);
-	if (!entry)
-		goto out_unlock;
-
-	ret = -EINVAL;
-	if (entry->direct != old_addr)
-		goto out_unlock;
-
-	direct = ftrace_find_direct_func(old_addr);
-	if (WARN_ON(!direct))
-		goto out_unlock;
-	if (direct->count > 1) {
-		ret = -ENOMEM;
-		new_direct = ftrace_alloc_direct_func(new_addr);
-		if (!new_direct)
-			goto out_unlock;
-		direct->count--;
-		new_direct->count++;
-	} else {
-		direct->addr = new_addr;
-	}
-
-	/*
-	 * If there's no other ftrace callback on the rec->ip location,
-	 * then it can be changed directly by the architecture.
-	 * If there is another caller, then we just need to change the
-	 * direct caller helper to point to @new_addr.
-	 */
-	if (ftrace_rec_count(rec) == 1) {
-		ret = ftrace_modify_direct_caller(entry, rec, old_addr, new_addr);
-	} else {
-		entry->direct = new_addr;
-		ret = 0;
-	}
-
-	if (unlikely(ret && new_direct)) {
-		direct->count++;
-		list_del_rcu(&new_direct->next);
-		synchronize_rcu_tasks();
-		kfree(new_direct);
-		ftrace_direct_func_count--;
-	}
-
- out_unlock:
-	mutex_unlock(&ftrace_lock);
-	mutex_unlock(&direct_mutex);
-	return ret;
-}
-EXPORT_SYMBOL_GPL(modify_ftrace_direct);
-
 #define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS)
 
 static int check_direct_multi(struct ftrace_ops *ops)
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 03/10] ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
  2023-02-07 18:21 [PATCH v2 00/10] Add ftrace direct call for arm64 Florent Revest
  2023-02-07 18:21 ` [PATCH v2 01/10] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi Florent Revest
  2023-02-07 18:21 ` [PATCH v2 02/10] ftrace: Remove the legacy _ftrace_direct API Florent Revest
@ 2023-02-07 18:21 ` Florent Revest
  2023-02-07 18:21 ` [PATCH v2 04/10] ftrace: Store direct called addresses in their ops Florent Revest
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Florent Revest @ 2023-02-07 18:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

Now that the original _ftrace_direct APIs are gone, the "_multi"
suffixes only add confusion.

Signed-off-by: Florent Revest <revest@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
---
 include/linux/ftrace.h                      | 16 +++++------
 kernel/bpf/trampoline.c                     | 14 ++++-----
 kernel/trace/ftrace.c                       | 32 ++++++++++-----------
 kernel/trace/trace_selftest.c               |  7 +++--
 samples/Kconfig                             |  2 +-
 samples/ftrace/ftrace-direct-modify.c       |  8 +++---
 samples/ftrace/ftrace-direct-multi-modify.c |  8 +++---
 samples/ftrace/ftrace-direct-multi.c        |  4 +--
 samples/ftrace/ftrace-direct-too.c          |  6 ++--
 samples/ftrace/ftrace-direct.c              |  6 ++--
 10 files changed, 52 insertions(+), 51 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 2d7c85e47c2d..a7dbd307c3a4 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -398,10 +398,10 @@ struct ftrace_func_entry {
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 extern int ftrace_direct_func_count;
 unsigned long ftrace_find_rec_direct(unsigned long ip);
-int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
-int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
-int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
-int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr);
+int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
+int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
+int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
+int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr);
 
 #else
 struct ftrace_ops;
@@ -410,19 +410,19 @@ static inline unsigned long ftrace_find_rec_direct(unsigned long ip)
 {
 	return 0;
 }
-static inline int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+static inline int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	return -ENODEV;
 }
-static inline int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+static inline int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	return -ENODEV;
 }
-static inline int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+static inline int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	return -ENODEV;
 }
-static inline int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr)
+static inline int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr)
 {
 	return -ENODEV;
 }
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index d0ed7d6f5eec..150b53316df2 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -39,14 +39,14 @@ static int bpf_tramp_ftrace_ops_func(struct ftrace_ops *ops, enum ftrace_ops_cmd
 	int ret = 0;
 
 	if (cmd == FTRACE_OPS_CMD_ENABLE_SHARE_IPMODIFY_SELF) {
-		/* This is called inside register_ftrace_direct_multi(), so
+		/* This is called inside register_ftrace_direct(), so
 		 * tr->mutex is already locked.
 		 */
 		lockdep_assert_held_once(&tr->mutex);
 
 		/* Instead of updating the trampoline here, we propagate
-		 * -EAGAIN to register_ftrace_direct_multi(). Then we can
-		 * retry register_ftrace_direct_multi() after updating the
+		 * -EAGAIN to register_ftrace_direct(). Then we can
+		 * retry register_ftrace_direct() after updating the
 		 * trampoline.
 		 */
 		if ((tr->flags & BPF_TRAMP_F_CALL_ORIG) &&
@@ -198,7 +198,7 @@ static int unregister_fentry(struct bpf_trampoline *tr, void *old_addr)
 	int ret;
 
 	if (tr->func.ftrace_managed)
-		ret = unregister_ftrace_direct_multi(tr->fops, (long)old_addr);
+		ret = unregister_ftrace_direct(tr->fops, (long)old_addr);
 	else
 		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_addr, NULL);
 
@@ -215,9 +215,9 @@ static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_ad
 
 	if (tr->func.ftrace_managed) {
 		if (lock_direct_mutex)
-			ret = modify_ftrace_direct_multi(tr->fops, (long)new_addr);
+			ret = modify_ftrace_direct(tr->fops, (long)new_addr);
 		else
-			ret = modify_ftrace_direct_multi_nolock(tr->fops, (long)new_addr);
+			ret = modify_ftrace_direct_nolock(tr->fops, (long)new_addr);
 	} else {
 		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_addr, new_addr);
 	}
@@ -243,7 +243,7 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
 
 	if (tr->func.ftrace_managed) {
 		ftrace_set_filter_ip(tr->fops, (unsigned long)ip, 0, 1);
-		ret = register_ftrace_direct_multi(tr->fops, (long)new_addr);
+		ret = register_ftrace_direct(tr->fops, (long)new_addr);
 	} else {
 		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr);
 	}
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 5efe201428fa..cb77a0a208c7 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5312,7 +5312,7 @@ static void remove_direct_functions_hash(struct ftrace_hash *hash, unsigned long
 }
 
 /**
- * register_ftrace_direct_multi - Call a custom trampoline directly
+ * register_ftrace_direct - Call a custom trampoline directly
  * for multiple functions registered in @ops
  * @ops: The address of the struct ftrace_ops object
  * @addr: The address of the trampoline to call at @ops functions
@@ -5333,7 +5333,7 @@ static void remove_direct_functions_hash(struct ftrace_hash *hash, unsigned long
  *  -ENODEV  - @ip does not point to a ftrace nop location (or not supported)
  *  -ENOMEM  - There was an allocation failure.
  */
-int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	struct ftrace_hash *hash, *free_hash = NULL;
 	struct ftrace_func_entry *entry, *new;
@@ -5391,11 +5391,11 @@ int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
 	}
 	return err;
 }
-EXPORT_SYMBOL_GPL(register_ftrace_direct_multi);
+EXPORT_SYMBOL_GPL(register_ftrace_direct);
 
 /**
- * unregister_ftrace_direct_multi - Remove calls to custom trampoline
- * previously registered by register_ftrace_direct_multi for @ops object.
+ * unregister_ftrace_direct - Remove calls to custom trampoline
+ * previously registered by register_ftrace_direct for @ops object.
  * @ops: The address of the struct ftrace_ops object
  *
  * This is used to remove a direct calls to @addr from the nop locations
@@ -5406,7 +5406,7 @@ EXPORT_SYMBOL_GPL(register_ftrace_direct_multi);
  *  0 on success
  *  -EINVAL - The @ops object was not properly registered.
  */
-int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	struct ftrace_hash *hash = ops->func_hash->filter_hash;
 	int err;
@@ -5426,10 +5426,10 @@ int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
 	ops->trampoline = 0;
 	return err;
 }
-EXPORT_SYMBOL_GPL(unregister_ftrace_direct_multi);
+EXPORT_SYMBOL_GPL(unregister_ftrace_direct);
 
 static int
-__modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+__modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	struct ftrace_hash *hash;
 	struct ftrace_func_entry *entry, *iter;
@@ -5476,7 +5476,7 @@ __modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
 }
 
 /**
- * modify_ftrace_direct_multi_nolock - Modify an existing direct 'multi' call
+ * modify_ftrace_direct_nolock - Modify an existing direct 'multi' call
  * to call something else
  * @ops: The address of the struct ftrace_ops object
  * @addr: The address of the new trampoline to call at @ops functions
@@ -5493,19 +5493,19 @@ __modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
  * Returns: zero on success. Non zero on error, which includes:
  *  -EINVAL - The @ops object was not properly registered.
  */
-int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr)
+int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr)
 {
 	if (check_direct_multi(ops))
 		return -EINVAL;
 	if (!(ops->flags & FTRACE_OPS_FL_ENABLED))
 		return -EINVAL;
 
-	return __modify_ftrace_direct_multi(ops, addr);
+	return __modify_ftrace_direct(ops, addr);
 }
-EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi_nolock);
+EXPORT_SYMBOL_GPL(modify_ftrace_direct_nolock);
 
 /**
- * modify_ftrace_direct_multi - Modify an existing direct 'multi' call
+ * modify_ftrace_direct - Modify an existing direct 'multi' call
  * to call something else
  * @ops: The address of the struct ftrace_ops object
  * @addr: The address of the new trampoline to call at @ops functions
@@ -5519,7 +5519,7 @@ EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi_nolock);
  * Returns: zero on success. Non zero on error, which includes:
  *  -EINVAL - The @ops object was not properly registered.
  */
-int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 {
 	int err;
 
@@ -5529,11 +5529,11 @@ int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
 		return -EINVAL;
 
 	mutex_lock(&direct_mutex);
-	err = __modify_ftrace_direct_multi(ops, addr);
+	err = __modify_ftrace_direct(ops, addr);
 	mutex_unlock(&direct_mutex);
 	return err;
 }
-EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi);
+EXPORT_SYMBOL_GPL(modify_ftrace_direct);
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 /**
diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 57221f69a33b..06218fc9374b 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -874,7 +874,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 	 * and make sure we get graph trace.
 	 */
 	ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
-	ret = register_ftrace_direct_multi(&direct, (unsigned long)trace_direct_tramp);
+	ret = register_ftrace_direct(&direct,
+				     (unsigned long)trace_direct_tramp);
 	if (ret)
 		goto out;
 
@@ -894,8 +895,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 
 	unregister_ftrace_graph(&fgraph_ops);
 
-	ret = unregister_ftrace_direct_multi(&direct,
-					     (unsigned long) trace_direct_tramp);
+	ret = unregister_ftrace_direct(&direct,
+				       (unsigned long)trace_direct_tramp);
 	if (ret)
 		goto out;
 
diff --git a/samples/Kconfig b/samples/Kconfig
index 0d81c00289ee..e85998ca354d 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -38,7 +38,7 @@ config SAMPLE_FTRACE_DIRECT
 	  that hooks to wake_up_process and prints the parameters.
 
 config SAMPLE_FTRACE_DIRECT_MULTI
-	tristate "Build register_ftrace_direct_multi() example"
+	tristate "Build register_ftrace_direct() on multiple ips example"
 	depends on DYNAMIC_FTRACE_WITH_DIRECT_CALLS && m
 	depends on HAVE_SAMPLE_FTRACE_DIRECT_MULTI
 	help
diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
index ecd76f75cb80..150c06b489ee 100644
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -116,7 +116,7 @@ static int simple_thread(void *arg)
 		if (ret)
 			continue;
 		t ^= 1;
-		ret = modify_ftrace_direct_multi(&direct, tramps[t]);
+		ret = modify_ftrace_direct(&direct, tramps[t]);
 		if (!ret)
 			my_tramp = tramps[t];
 		WARN_ON_ONCE(ret);
@@ -132,7 +132,7 @@ static int __init ftrace_direct_init(void)
 	int ret;
 
 	ftrace_set_filter_ip(&direct, (unsigned long) my_ip, 0, 0);
-	ret = register_ftrace_direct_multi(&direct, my_tramp);
+	ret = register_ftrace_direct(&direct, my_tramp);
 
 	if (!ret)
 		simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
@@ -142,12 +142,12 @@ static int __init ftrace_direct_init(void)
 static void __exit ftrace_direct_exit(void)
 {
 	kthread_stop(simple_tsk);
-	unregister_ftrace_direct_multi(&direct, my_tramp);
+	unregister_ftrace_direct(&direct, my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
+MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
 MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c
index d52370cad0b6..407c56325e65 100644
--- a/samples/ftrace/ftrace-direct-multi-modify.c
+++ b/samples/ftrace/ftrace-direct-multi-modify.c
@@ -123,7 +123,7 @@ static int simple_thread(void *arg)
 		if (ret)
 			continue;
 		t ^= 1;
-		ret = modify_ftrace_direct_multi(&direct, tramps[t]);
+		ret = modify_ftrace_direct(&direct, tramps[t]);
 		if (!ret)
 			my_tramp = tramps[t];
 		WARN_ON_ONCE(ret);
@@ -141,7 +141,7 @@ static int __init ftrace_direct_multi_init(void)
 	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
 	ftrace_set_filter_ip(&direct, (unsigned long) schedule, 0, 0);
 
-	ret = register_ftrace_direct_multi(&direct, my_tramp);
+	ret = register_ftrace_direct(&direct, my_tramp);
 
 	if (!ret)
 		simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
@@ -151,12 +151,12 @@ static int __init ftrace_direct_multi_init(void)
 static void __exit ftrace_direct_multi_exit(void)
 {
 	kthread_stop(simple_tsk);
-	unregister_ftrace_direct_multi(&direct, my_tramp);
+	unregister_ftrace_direct(&direct, my_tramp);
 }
 
 module_init(ftrace_direct_multi_init);
 module_exit(ftrace_direct_multi_exit);
 
 MODULE_AUTHOR("Jiri Olsa");
-MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
+MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
 MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
index ec1088922517..46cf1873fda7 100644
--- a/samples/ftrace/ftrace-direct-multi.c
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -73,12 +73,12 @@ static int __init ftrace_direct_multi_init(void)
 	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
 	ftrace_set_filter_ip(&direct, (unsigned long) schedule, 0, 0);
 
-	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
+	return register_ftrace_direct(&direct, (unsigned long) my_tramp);
 }
 
 static void __exit ftrace_direct_multi_exit(void)
 {
-	unregister_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
+	unregister_ftrace_direct(&direct, (unsigned long) my_tramp);
 }
 
 module_init(ftrace_direct_multi_init);
diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index 0e907092e2c0..7ee5dd3cc61d 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -76,17 +76,17 @@ static int __init ftrace_direct_init(void)
 {
 	ftrace_set_filter_ip(&direct, (unsigned long) handle_mm_fault, 0, 0);
 
-	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
+	return register_ftrace_direct(&direct, (unsigned long) my_tramp);
 }
 
 static void __exit ftrace_direct_exit(void)
 {
-	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
+	unregister_ftrace_direct(&direct, (unsigned long)my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct_multi()");
+MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct()");
 MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
index e446c38f6b58..5ffce87fa83e 100644
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -69,17 +69,17 @@ static int __init ftrace_direct_init(void)
 {
 	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
 
-	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
+	return register_ftrace_direct(&direct, (unsigned long) my_tramp);
 }
 
 static void __exit ftrace_direct_exit(void)
 {
-	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
+	unregister_ftrace_direct(&direct, (unsigned long)my_tramp);
 }
 
 module_init(ftrace_direct_init);
 module_exit(ftrace_direct_exit);
 
 MODULE_AUTHOR("Steven Rostedt");
-MODULE_DESCRIPTION("Example use case of using register_ftrace_direct_multi()");
+MODULE_DESCRIPTION("Example use case of using register_ftrace_direct()");
 MODULE_LICENSE("GPL");
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 04/10] ftrace: Store direct called addresses in their ops
  2023-02-07 18:21 [PATCH v2 00/10] Add ftrace direct call for arm64 Florent Revest
                   ` (2 preceding siblings ...)
  2023-02-07 18:21 ` [PATCH v2 03/10] ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs Florent Revest
@ 2023-02-07 18:21 ` Florent Revest
  2023-03-15 23:43   ` Steven Rostedt
  2023-02-07 18:21 ` [PATCH v2 05/10] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS Florent Revest
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Florent Revest @ 2023-02-07 18:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

All direct calls are now registered using the register_ftrace_direct API
so each ops can jump to only one direct-called trampoline.

By storing the direct called trampoline address directly in the ops we
can save one hashmap lookup in the direct call ops and implement arm64
direct calls on top of call ops.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 include/linux/ftrace.h | 3 +++
 kernel/trace/ftrace.c  | 6 ++++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index a7dbd307c3a4..84f717f8959e 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -321,6 +321,9 @@ struct ftrace_ops {
 	unsigned long			trampoline_size;
 	struct list_head		list;
 	ftrace_ops_func_t		ops_func;
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	unsigned long			direct_call;
+#endif
 #endif
 };
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index cb77a0a208c7..dfa5f34ec320 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2577,9 +2577,8 @@ ftrace_add_rec_direct(unsigned long ip, unsigned long addr,
 static void call_direct_funcs(unsigned long ip, unsigned long pip,
 			      struct ftrace_ops *ops, struct ftrace_regs *fregs)
 {
-	unsigned long addr;
+	unsigned long addr = ops->direct_call;
 
-	addr = ftrace_find_rec_direct(ip);
 	if (!addr)
 		return;
 
@@ -5375,6 +5374,7 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 	ops->func = call_direct_funcs;
 	ops->flags = MULTI_FLAGS;
 	ops->trampoline = FTRACE_REGS_ADDR;
+	ops->direct_call = addr;
 
 	err = register_ftrace_function_nolock(ops);
 
@@ -5445,6 +5445,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 	/* Enable the tmp_ops to have the same functions as the direct ops */
 	ftrace_ops_init(&tmp_ops);
 	tmp_ops.func_hash = ops->func_hash;
+	tmp_ops.direct_call = addr;
 
 	err = register_ftrace_function_nolock(&tmp_ops);
 	if (err)
@@ -5466,6 +5467,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 			entry->direct = addr;
 		}
 	}
+	WRITE_ONCE(ops->direct_call, addr);
 
 	mutex_unlock(&ftrace_lock);
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 05/10] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  2023-02-07 18:21 [PATCH v2 00/10] Add ftrace direct call for arm64 Florent Revest
                   ` (3 preceding siblings ...)
  2023-02-07 18:21 ` [PATCH v2 04/10] ftrace: Store direct called addresses in their ops Florent Revest
@ 2023-02-07 18:21 ` Florent Revest
  2023-02-07 18:21 ` [PATCH v2 06/10] ftrace: selftest: remove broken trace_direct_tramp Florent Revest
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Florent Revest @ 2023-02-07 18:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

Direct called trampolines can be called in two ways:
- either from the ftrace callsite. In this case, they do not access any
  struct ftrace_regs nor pt_regs
- Or, if a ftrace ops is also attached, from the end of a ftrace
  trampoline. In this case, the call_direct_funcs ops is in charge of
  setting the direct call trampoline's address in a struct ftrace_regs

Since:

commit 9705bc709604 ("ftrace: pass fregs to arch_ftrace_set_direct_caller()")

The later case no longer requires a full pt_regs. It only needs a struct
ftrace_regs so DIRECT_CALLS can work with both WITH_ARGS or WITH_REGS.
With architectures like arm64 already abandoning WITH_REGS in favor of
WITH_ARGS, it's important to have DIRECT_CALLS work WITH_ARGS only.

Signed-off-by: Florent Revest <revest@chromium.org>
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
---
 include/linux/ftrace.h | 6 ++++++
 kernel/trace/Kconfig   | 2 +-
 kernel/trace/ftrace.c  | 2 +-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 84f717f8959e..cabb40146da9 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -241,6 +241,12 @@ enum {
 	FTRACE_OPS_FL_DIRECT			= BIT(17),
 };
 
+#ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
+#define FTRACE_OPS_FL_SAVE_ARGS                        FTRACE_OPS_FL_SAVE_REGS
+#else
+#define FTRACE_OPS_FL_SAVE_ARGS                        0
+#endif
+
 /*
  * FTRACE_OPS_CMD_* commands allow the ftrace core logic to request changes
  * to a ftrace_ops. Note, the requests may fail.
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 5df427a2321d..4496a7c69810 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -257,7 +257,7 @@ config DYNAMIC_FTRACE_WITH_REGS
 
 config DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 	def_bool y
-	depends on DYNAMIC_FTRACE_WITH_REGS
+	depends on DYNAMIC_FTRACE_WITH_REGS || DYNAMIC_FTRACE_WITH_ARGS
 	depends on HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 
 config DYNAMIC_FTRACE_WITH_CALL_OPS
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index dfa5f34ec320..58b6f4411ac7 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5282,7 +5282,7 @@ static LIST_HEAD(ftrace_direct_funcs);
 
 static int register_ftrace_function_nolock(struct ftrace_ops *ops);
 
-#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS)
+#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_ARGS)
 
 static int check_direct_multi(struct ftrace_ops *ops)
 {
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 06/10] ftrace: selftest: remove broken trace_direct_tramp
  2023-02-07 18:21 [PATCH v2 00/10] Add ftrace direct call for arm64 Florent Revest
                   ` (4 preceding siblings ...)
  2023-02-07 18:21 ` [PATCH v2 05/10] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS Florent Revest
@ 2023-02-07 18:21 ` Florent Revest
  2023-03-15 23:51   ` Steven Rostedt
  2023-02-07 18:21 ` [PATCH v2 07/10] arm64: ftrace: Add direct call support Florent Revest
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Florent Revest @ 2023-02-07 18:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1, Xu Kuohai,
	Florent Revest

From: Mark Rutland <mark.rutland@arm.com>

The ftrace selftest code has a trace_direct_tramp() function which it
uses as a direct call trampoline. This happens to work on x86, since the
direct call's return address is in the usual place, and can be returned
to via a RET, but in general the calling convention for direct calls is
different from regular function calls, and requires a trampoline written
in assembly.

On s390, regular function calls place the return address in %r14, and an
ftrace patch-site in an instrumented function places the trampoline's
return address (which is within the instrumented function) in %r0,
preserving the original %r14 value in-place. As a regular C function
will return to the address in %r14, using a C function as the trampoline
results in the trampoline returning to the caller of the instrumented
function, skipping the body of the instrumented function.

Note that the s390 issue is not detcted by the ftrace selftest code, as
the instrumented function is trivial, and returning back into the caller
happens to be equivalent.

On arm64, regular function calls place the return address in x30, and
an ftrace patch-site in an instrumented function saves this into r9
and places the trampoline's return address (within the instrumented
function) in x30. A regular C function will return to the address in
x30, but will not restore x9 into x30. Consequently, using a C function
as the trampoline results in returning to the trampoline's return
address having corrupted x30, such that when the instrumented function
returns, it will return back into itself.

To avoid future issues in this area, remove the trace_direct_tramp()
function, and require that each architecture with direct calls provides
a stub trampoline, named ftrace_stub_direct_tramp. This can be written
to handle the architecture's trampoline calling convention, and in
future could be used elsewhere (e.g. in the ftrace ops sample, to
measure the overhead of direct calls), so we may as well always build it
in.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Xu Kuohai <xukuohai@huawei.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Florent Revest <revest@chromium.org>
Signed-off-by: Florent Revest <revest@chromium.org>
---
 arch/s390/kernel/mcount.S     |  5 +++++
 arch/x86/kernel/ftrace_32.S   |  5 +++++
 arch/x86/kernel/ftrace_64.S   |  4 ++++
 include/linux/ftrace.h        |  2 ++
 kernel/trace/trace_selftest.c | 15 ++-------------
 5 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/arch/s390/kernel/mcount.S b/arch/s390/kernel/mcount.S
index 4786bfe02144..ad13a0e2c307 100644
--- a/arch/s390/kernel/mcount.S
+++ b/arch/s390/kernel/mcount.S
@@ -32,6 +32,11 @@ ENTRY(ftrace_stub)
 	BR_EX	%r14
 ENDPROC(ftrace_stub)
 
+SYM_CODE_START(ftrace_stub_direct_tramp)
+	lgr	%r1, %r0
+	BR_EX	%r1
+SYM_CODE_END(ftrace_stub_direct_tramp)
+
 	.macro	ftrace_regs_entry, allregs=0
 	stg	%r14,(__SF_GPRS+8*8)(%r15)	# save traced function caller
 
diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
index a0ed0e4a2c0c..0d9a14528176 100644
--- a/arch/x86/kernel/ftrace_32.S
+++ b/arch/x86/kernel/ftrace_32.S
@@ -163,6 +163,11 @@ SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL)
 	jmp	.Lftrace_ret
 SYM_CODE_END(ftrace_regs_caller)
 
+SYM_FUNC_START(ftrace_stub_direct_tramp)
+	CALL_DEPTH_ACCOUNT
+	RET
+SYM_FUNC_END(ftrace_stub_direct_tramp)
+
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 SYM_CODE_START(ftrace_graph_caller)
 	pushl	%eax
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 1265ad519249..8fc77e3e039c 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -307,6 +307,10 @@ SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL)
 SYM_FUNC_END(ftrace_regs_caller)
 STACK_FRAME_NON_STANDARD_FP(ftrace_regs_caller)
 
+SYM_FUNC_START(ftrace_stub_direct_tramp)
+	CALL_DEPTH_ACCOUNT
+	RET
+SYM_FUNC_END(ftrace_stub_direct_tramp)
 
 #else /* ! CONFIG_DYNAMIC_FTRACE */
 
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index cabb40146da9..48b13bb888bf 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -412,6 +412,8 @@ int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
 int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
 int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr);
 
+void ftrace_stub_direct_tramp(void);
+
 #else
 struct ftrace_ops;
 # define ftrace_direct_func_count 0
diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 06218fc9374b..e6530b7b42e4 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -784,17 +784,6 @@ static struct fgraph_ops fgraph_ops __initdata  = {
 	.retfunc		= &trace_graph_return,
 };
 
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
-#ifndef CALL_DEPTH_ACCOUNT
-#define CALL_DEPTH_ACCOUNT ""
-#endif
-
-noinline __noclone static void trace_direct_tramp(void)
-{
-	asm(CALL_DEPTH_ACCOUNT);
-}
-#endif
-
 /*
  * Pretty much the same than for the function tracer from which the selftest
  * has been borrowed.
@@ -875,7 +864,7 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 	 */
 	ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
 	ret = register_ftrace_direct(&direct,
-				     (unsigned long)trace_direct_tramp);
+				     (unsigned long)ftrace_stub_direct_tramp);
 	if (ret)
 		goto out;
 
@@ -896,7 +885,7 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 	unregister_ftrace_graph(&fgraph_ops);
 
 	ret = unregister_ftrace_direct(&direct,
-				       (unsigned long)trace_direct_tramp);
+				       (unsigned long)ftrace_stub_direct_tramp);
 	if (ret)
 		goto out;
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 07/10] arm64: ftrace: Add direct call support
  2023-02-07 18:21 [PATCH v2 00/10] Add ftrace direct call for arm64 Florent Revest
                   ` (5 preceding siblings ...)
  2023-02-07 18:21 ` [PATCH v2 06/10] ftrace: selftest: remove broken trace_direct_tramp Florent Revest
@ 2023-02-07 18:21 ` Florent Revest
  2023-02-07 18:21 ` [PATCH v2 08/10] arm64: ftrace: Simplify get_ftrace_plt Florent Revest
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Florent Revest @ 2023-02-07 18:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

This builds up on the CALL_OPS work which extends the ftrace patchsite
on arm64 with an ops pointer usable by the ftrace trampoline.

This ops pointer is valid at all time. Indeed, it is either pointing to
ftrace_list_ops or to the single ops which should be called from that
patchsite.

There are a few cases to distinguish:
- If a direct call ops is the only one tracing a function:
  - If the direct called trampoline is within the reach of a BL
    instruction
     -> the ftrace patchsite jumps to the trampoline
  - Else
     -> the ftrace patchsite jumps to the ftrace_caller trampoline which
        reads the ops pointer in the patchsite and jumps to the direct
        call address stored in the ops
- Else
  -> the ftrace patchsite jumps to the ftrace_caller trampoline and its
     ops literal points to ftrace_list_ops so it iterates over all
     registered ftrace ops, including the direct call ops and calls its
     call_direct_funcs handler which stores the direct called
     trampoline's address in the ftrace_regs and the ftrace_caller
     trampoline will return to that address instead of returning to the
     traced function

Signed-off-by: Florent Revest <revest@chromium.org>
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/Kconfig               |  2 +
 arch/arm64/include/asm/ftrace.h  | 22 ++++++++
 arch/arm64/kernel/asm-offsets.c  |  6 +++
 arch/arm64/kernel/entry-ftrace.S | 90 ++++++++++++++++++++++++++------
 arch/arm64/kernel/ftrace.c       | 36 +++++++++++--
 5 files changed, 136 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6f6f37161cf6..7deafd653c42 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -188,6 +188,8 @@ config ARM64
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_ARGS \
 		if $(cc-option,-fpatchable-function-entry=2)
+	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS \
+		if DYNAMIC_FTRACE_WITH_ARGS && DYNAMIC_FTRACE_WITH_CALL_OPS
 	select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
 		if (DYNAMIC_FTRACE_WITH_ARGS && !CFI_CLANG)
 	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 1c2672bbbf37..b87d70b693c6 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -70,10 +70,19 @@ struct ftrace_ops;
 
 #define arch_ftrace_get_regs(regs) NULL
 
+/*
+ * Note: sizeof(struct ftrace_regs) must be a multiple of 16 to ensure correct
+ * stack alignment
+ */
 struct ftrace_regs {
 	/* x0 - x8 */
 	unsigned long regs[9];
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	unsigned long direct_tramp;
+#else
 	unsigned long __unused;
+#endif
 
 	unsigned long fp;
 	unsigned long lr;
@@ -136,6 +145,19 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
 void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
 		       struct ftrace_ops *op, struct ftrace_regs *fregs);
 #define ftrace_graph_func ftrace_graph_func
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs,
+						 unsigned long addr)
+{
+	/*
+	 * The ftrace trampoline will return to this address instead of the
+	 * instrumented function.
+	 */
+	fregs->direct_tramp = addr;
+}
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
 #endif
 
 #define ftrace_return_address(n) return_address(n)
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index ae345b06e9f7..0996094b0d22 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -93,6 +93,9 @@ int main(void)
   DEFINE(FREGS_LR,		offsetof(struct ftrace_regs, lr));
   DEFINE(FREGS_SP,		offsetof(struct ftrace_regs, sp));
   DEFINE(FREGS_PC,		offsetof(struct ftrace_regs, pc));
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+  DEFINE(FREGS_DIRECT_TRAMP,	offsetof(struct ftrace_regs, direct_tramp));
+#endif
   DEFINE(FREGS_SIZE,		sizeof(struct ftrace_regs));
   BLANK();
 #endif
@@ -197,6 +200,9 @@ int main(void)
 #endif
 #ifdef CONFIG_FUNCTION_TRACER
   DEFINE(FTRACE_OPS_FUNC,		offsetof(struct ftrace_ops, func));
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+  DEFINE(FTRACE_OPS_DIRECT_CALL,	offsetof(struct ftrace_ops, direct_call));
+#endif
 #endif
   return 0;
 }
diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index 350ed81324ac..16034928a445 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -36,6 +36,31 @@
 SYM_CODE_START(ftrace_caller)
 	bti	c
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
+	/*
+	 * The literal pointer to the ops is at an 8-byte aligned boundary
+	 * which is either 12 or 16 bytes before the BL instruction in the call
+	 * site. See ftrace_call_adjust() for details.
+	 *
+	 * Therefore here the LR points at `literal + 16` or `literal + 20`,
+	 * and we can find the address of the literal in either case by
+	 * aligning to an 8-byte boundary and subtracting 16. We do the
+	 * alignment first as this allows us to fold the subtraction into the
+	 * LDR.
+	 */
+	bic	x11, x30, 0x7
+	ldr	x11, [x11, #-(4 * AARCH64_INSN_SIZE)]		// op
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	/*
+	 * If the op has a direct call, handle it immediately without
+	 * saving/restoring registers.
+	 */
+	ldr	x17, [x11, #FTRACE_OPS_DIRECT_CALL]		// op->direct_call
+	cbnz	x17, ftrace_caller_direct
+#endif
+#endif
+
 	/* Save original SP */
 	mov	x10, sp
 
@@ -49,6 +74,10 @@ SYM_CODE_START(ftrace_caller)
 	stp	x6, x7, [sp, #FREGS_X6]
 	str	x8,     [sp, #FREGS_X8]
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	str	xzr, [sp, #FREGS_DIRECT_TRAMP]
+#endif
+
 	/* Save the callsite's FP, LR, SP */
 	str	x29, [sp, #FREGS_FP]
 	str	x9,  [sp, #FREGS_LR]
@@ -71,20 +100,7 @@ SYM_CODE_START(ftrace_caller)
 	mov	x3, sp					// regs
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
-	/*
-	 * The literal pointer to the ops is at an 8-byte aligned boundary
-	 * which is either 12 or 16 bytes before the BL instruction in the call
-	 * site. See ftrace_call_adjust() for details.
-	 *
-	 * Therefore here the LR points at `literal + 16` or `literal + 20`,
-	 * and we can find the address of the literal in either case by
-	 * aligning to an 8-byte boundary and subtracting 16. We do the
-	 * alignment first as this allows us to fold the subtraction into the
-	 * LDR.
-	 */
-	bic	x2, x30, 0x7
-	ldr	x2, [x2, #-16]				// op
-
+	mov	x2, x11					// op
 	ldr	x4, [x2, #FTRACE_OPS_FUNC]		// op->func
 	blr	x4					// op->func(ip, parent_ip, op, regs)
 
@@ -107,8 +123,15 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
 	ldp	x6, x7, [sp, #FREGS_X6]
 	ldr	x8,     [sp, #FREGS_X8]
 
-	/* Restore the callsite's FP, LR, PC */
+	/* Restore the callsite's FP */
 	ldr	x29, [sp, #FREGS_FP]
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	ldr	x17, [sp, #FREGS_DIRECT_TRAMP]
+	cbnz	x17, ftrace_caller_direct_late
+#endif
+
+	/* Restore the callsite's LR and PC */
 	ldr	x30, [sp, #FREGS_LR]
 	ldr	x9,  [sp, #FREGS_PC]
 
@@ -116,8 +139,45 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
 	add	sp, sp, #FREGS_SIZE + 32
 
 	ret	x9
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+SYM_INNER_LABEL(ftrace_caller_direct_late, SYM_L_LOCAL)
+	/*
+	 * Head to a direct trmapoline in x17 after having run other tracers.
+	 * The ftrace_regs are live, and x0-x8 and FP have been restored. The
+	 * LR, PC, and SP have not been restored.
+	 */
+
+	/*
+	 * Restore the callsite's LR and PC matching the trampoline calling
+	 * convention.
+	 */
+	ldr	x9,  [sp, #FREGS_LR]
+	ldr	x30, [sp, #FREGS_PC]
+
+	/* Restore the callsite's SP */
+	add	sp, sp, #FREGS_SIZE + 32
+
+SYM_INNER_LABEL(ftrace_caller_direct, SYM_L_LOCAL)
+	/*
+	 * Head to a direct trampoline in x17.
+	 *
+	 * We use `BR X17` as this can safely land on a `BTI C` or `PACIASP` in
+	 * the trampoline, and will not unbalance any return stack.
+	 */
+	br	x17
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 SYM_CODE_END(ftrace_caller)
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+SYM_CODE_START(ftrace_stub_direct_tramp)
+	bti	c
+	mov	x10, x30
+	mov	x30, x9
+	ret	x10
+SYM_CODE_END(ftrace_stub_direct_tramp)
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
 #else /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */
 
 /*
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 5545fe1a9012..758436727fba 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -206,6 +206,13 @@ static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
 	return NULL;
 }
 
+static bool reachable_by_bl(unsigned long addr, unsigned long pc)
+{
+	long offset = (long)addr - (long)pc;
+
+	return offset >= -SZ_128M && offset < SZ_128M;
+}
+
 /*
  * Find the address the callsite must branch to in order to reach '*addr'.
  *
@@ -220,14 +227,21 @@ static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
 				      unsigned long *addr)
 {
 	unsigned long pc = rec->ip;
-	long offset = (long)*addr - (long)pc;
 	struct plt_entry *plt;
 
+	/*
+	 * If a custom trampoline is unreachable, rely on the ftrace_caller
+	 * trampoline which knows how to indirectly reach that trampoline
+	 * through ops->direct_call.
+	 */
+	if (*addr != FTRACE_ADDR && !reachable_by_bl(*addr, pc))
+		*addr = FTRACE_ADDR;
+
 	/*
 	 * When the target is within range of the 'BL' instruction, use 'addr'
 	 * as-is and branch to that directly.
 	 */
-	if (offset >= -SZ_128M && offset < SZ_128M)
+	if (reachable_by_bl(*addr, pc))
 		return true;
 
 	/*
@@ -330,12 +344,24 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
 		       unsigned long addr)
 {
-	if (WARN_ON_ONCE(old_addr != (unsigned long)ftrace_caller))
+	unsigned long pc = rec->ip;
+	u32 old, new;
+	int ret;
+
+	ret = ftrace_rec_set_ops(rec, arm64_rec_get_ops(rec));
+	if (ret)
+		return ret;
+
+	if (!ftrace_find_callable_addr(rec, NULL, &old_addr))
 		return -EINVAL;
-	if (WARN_ON_ONCE(addr != (unsigned long)ftrace_caller))
+	if (!ftrace_find_callable_addr(rec, NULL, &addr))
 		return -EINVAL;
 
-	return ftrace_rec_update_ops(rec);
+	old = aarch64_insn_gen_branch_imm(pc, old_addr,
+					  AARCH64_INSN_BRANCH_LINK);
+	new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
+
+	return ftrace_modify_code(pc, old, new, true);
 }
 #endif
 
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 08/10] arm64: ftrace: Simplify get_ftrace_plt
  2023-02-07 18:21 [PATCH v2 00/10] Add ftrace direct call for arm64 Florent Revest
                   ` (6 preceding siblings ...)
  2023-02-07 18:21 ` [PATCH v2 07/10] arm64: ftrace: Add direct call support Florent Revest
@ 2023-02-07 18:21 ` Florent Revest
  2023-02-07 18:21 ` [PATCH v2 09/10] arm64: ftrace: Add direct call trampoline samples support Florent Revest
  2023-02-07 18:21 ` [PATCH v2 10/10] selftests/bpf: Update the tests deny list on aarch64 Florent Revest
  9 siblings, 0 replies; 19+ messages in thread
From: Florent Revest @ 2023-02-07 18:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

Following recent refactorings, the get_ftrace_plt function only ever
gets called with addr = FTRACE_ADDR so its code can be simplified to
always return the ftrace trampoline plt.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 arch/arm64/kernel/ftrace.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 758436727fba..432626c866a8 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -195,15 +195,15 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 	return ftrace_modify_code(pc, 0, new, false);
 }
 
-static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
+static struct plt_entry *get_ftrace_plt(struct module *mod)
 {
 #ifdef CONFIG_ARM64_MODULE_PLTS
 	struct plt_entry *plt = mod->arch.ftrace_trampolines;
 
-	if (addr == FTRACE_ADDR)
-		return &plt[FTRACE_PLT_IDX];
-#endif
+	return &plt[FTRACE_PLT_IDX];
+#else
 	return NULL;
+#endif
 }
 
 static bool reachable_by_bl(unsigned long addr, unsigned long pc)
@@ -270,7 +270,7 @@ static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
 	if (WARN_ON(!mod))
 		return false;
 
-	plt = get_ftrace_plt(mod, *addr);
+	plt = get_ftrace_plt(mod);
 	if (!plt) {
 		pr_err("ftrace: no module PLT for %ps\n", (void *)*addr);
 		return false;
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 09/10] arm64: ftrace: Add direct call trampoline samples support
  2023-02-07 18:21 [PATCH v2 00/10] Add ftrace direct call for arm64 Florent Revest
                   ` (7 preceding siblings ...)
  2023-02-07 18:21 ` [PATCH v2 08/10] arm64: ftrace: Simplify get_ftrace_plt Florent Revest
@ 2023-02-07 18:21 ` Florent Revest
  2023-02-07 18:21 ` [PATCH v2 10/10] selftests/bpf: Update the tests deny list on aarch64 Florent Revest
  9 siblings, 0 replies; 19+ messages in thread
From: Florent Revest @ 2023-02-07 18:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

The ftrace samples need per-architecture trampoline implementations
to save and restore argument registers around the calls to
my_direct_func* and to restore polluted registers (eg: x30).

These samples also include <asm/nospec-branch.h> which does not exist on
arm64 and <asm/asm-offsets.h> which, on arm64, is not necessary and
redefines previously defined macros (resulting in warnings) so these
includes are guarded by !CONFIG_ARM64.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 arch/arm64/Kconfig                          |  2 ++
 samples/ftrace/ftrace-direct-modify.c       | 32 ++++++++++++++++++
 samples/ftrace/ftrace-direct-multi-modify.c | 36 +++++++++++++++++++++
 samples/ftrace/ftrace-direct-multi.c        | 22 +++++++++++++
 samples/ftrace/ftrace-direct-too.c          | 25 ++++++++++++++
 samples/ftrace/ftrace-direct.c              | 23 +++++++++++++
 6 files changed, 140 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7deafd653c42..5480ef8eaa2a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -194,6 +194,8 @@ config ARM64
 		if (DYNAMIC_FTRACE_WITH_ARGS && !CFI_CLANG)
 	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
 		if DYNAMIC_FTRACE_WITH_ARGS
+	select HAVE_SAMPLE_FTRACE_DIRECT
+	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_FAST_GUP
 	select HAVE_FTRACE_MCOUNT_RECORD
diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
index 150c06b489ee..0178554c55f2 100644
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -2,8 +2,10 @@
 #include <linux/module.h>
 #include <linux/kthread.h>
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
 #include <asm/nospec-branch.h>
+#endif
 
 extern void my_direct_func1(void);
 extern void my_direct_func2(void);
@@ -96,6 +98,36 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection    .text, \"ax\", @progbits\n"
+"	.type		my_tramp1, @function\n"
+"	.globl		my_tramp1\n"
+"   my_tramp1:"
+"	sub	sp, sp, #16\n"
+"	stp	x9, x30, [sp]\n"
+"	bl	my_direct_func1\n"
+"	ldp	x30, x9, [sp]\n"
+"	add	sp, sp, #16\n"
+"	ret	x9\n"
+"	.size		my_tramp1, .-my_tramp1\n"
+
+"	.type		my_tramp2, @function\n"
+"	.globl		my_tramp2\n"
+"   my_tramp2:"
+"	sub	sp, sp, #16\n"
+"	stp	x9, x30, [sp]\n"
+"	bl	my_direct_func2\n"
+"	ldp	x30, x9, [sp]\n"
+"	add	sp, sp, #16\n"
+"	ret	x9\n"
+"	.size		my_tramp2, .-my_tramp2\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static struct ftrace_ops direct;
 
 static unsigned long my_tramp = (unsigned long)my_tramp1;
diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c
index 407c56325e65..ae1f97271d1a 100644
--- a/samples/ftrace/ftrace-direct-multi-modify.c
+++ b/samples/ftrace/ftrace-direct-multi-modify.c
@@ -2,8 +2,10 @@
 #include <linux/module.h>
 #include <linux/kthread.h>
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
 #include <asm/nospec-branch.h>
+#endif
 
 extern void my_direct_func1(unsigned long ip);
 extern void my_direct_func2(unsigned long ip);
@@ -103,6 +105,40 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection    .text, \"ax\", @progbits\n"
+"	.type		my_tramp1, @function\n"
+"	.globl		my_tramp1\n"
+"   my_tramp1:"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	bl	my_direct_func1\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp1, .-my_tramp1\n"
+
+"	.type		my_tramp2, @function\n"
+"	.globl		my_tramp2\n"
+"   my_tramp2:"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	bl	my_direct_func2\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp2, .-my_tramp2\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static unsigned long my_tramp = (unsigned long)my_tramp1;
 static unsigned long tramps[2] = {
 	(unsigned long)my_tramp1,
diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
index 46cf1873fda7..52bf238fcd7e 100644
--- a/samples/ftrace/ftrace-direct-multi.c
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -4,8 +4,10 @@
 #include <linux/mm.h> /* for handle_mm_fault() */
 #include <linux/ftrace.h>
 #include <linux/sched/stat.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
 #include <asm/nospec-branch.h>
+#endif
 
 extern void my_direct_func(unsigned long ip);
 
@@ -66,6 +68,26 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection	.text, \"ax\", @progbits\n"
+"	.type		my_tramp, @function\n"
+"	.globl		my_tramp\n"
+"   my_tramp:"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	bl	my_direct_func\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp, .-my_tramp\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
 static struct ftrace_ops direct;
 
 static int __init ftrace_direct_multi_init(void)
diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index 7ee5dd3cc61d..f46ee08caa2b 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -3,8 +3,10 @@
 
 #include <linux/mm.h> /* for handle_mm_fault() */
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
 #include <asm/nospec-branch.h>
+#endif
 
 extern void my_direct_func(struct vm_area_struct *vma,
 			   unsigned long address, unsigned int flags);
@@ -70,6 +72,29 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection	.text, \"ax\", @progbits\n"
+"	.type		my_tramp, @function\n"
+"	.globl		my_tramp\n"
+"   my_tramp:"
+"	sub	sp, sp, #48\n"
+"	stp	x9, x30, [sp]\n"
+"	stp	x0, x1, [sp, #16]\n"
+"	str	x2, [sp, #32]\n"
+"	bl	my_direct_func\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldp	x0, x1, [sp, #16]\n"
+"	ldr	x2, [sp, #32]\n"
+"	add	sp, sp, #48\n"
+"	ret	x9\n"
+"	.size		my_tramp, .-my_tramp\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static struct ftrace_ops direct;
 
 static int __init ftrace_direct_init(void)
diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
index 5ffce87fa83e..e37e8d9e855c 100644
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -3,8 +3,10 @@
 
 #include <linux/sched.h> /* for wake_up_process() */
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
 #include <asm/nospec-branch.h>
+#endif
 
 extern void my_direct_func(struct task_struct *p);
 
@@ -63,6 +65,27 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection	.text, \"ax\", @progbits\n"
+"	.type		my_tramp, @function\n"
+"	.globl		my_tramp\n"
+"   my_tramp:"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	bl	my_direct_func\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp, .-my_tramp\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static struct ftrace_ops direct;
 
 static int __init ftrace_direct_init(void)
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 10/10] selftests/bpf: Update the tests deny list on aarch64
  2023-02-07 18:21 [PATCH v2 00/10] Add ftrace direct call for arm64 Florent Revest
                   ` (8 preceding siblings ...)
  2023-02-07 18:21 ` [PATCH v2 09/10] arm64: ftrace: Add direct call trampoline samples support Florent Revest
@ 2023-02-07 18:21 ` Florent Revest
  9 siblings, 0 replies; 19+ messages in thread
From: Florent Revest @ 2023-02-07 18:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

Now that ftrace supports direct call on arm64, BPF tracing programs work
on that architecture. This fixes the vast majority of BPF selftests
except for:

- multi_kprobe programs which require fprobe, not available on arm64 yet
- tracing_struct which requires trampoline support to access struct args

This patch updates the list of BPF selftests which are known to fail so
the BPF CI can validate the tests which pass now.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 tools/testing/selftests/bpf/DENYLIST.aarch64 | 82 ++------------------
 1 file changed, 5 insertions(+), 77 deletions(-)

diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64
index 99cc33c51eaa..6b95cb544094 100644
--- a/tools/testing/selftests/bpf/DENYLIST.aarch64
+++ b/tools/testing/selftests/bpf/DENYLIST.aarch64
@@ -1,33 +1,5 @@
-bloom_filter_map                                 # libbpf: prog 'check_bloom': failed to attach: ERROR: strerror_r(-524)=22
-bpf_cookie/lsm
-bpf_cookie/multi_kprobe_attach_api
-bpf_cookie/multi_kprobe_link_api
-bpf_cookie/trampoline
-bpf_loop/check_callback_fn_stop                  # link unexpected error: -524
-bpf_loop/check_invalid_flags
-bpf_loop/check_nested_calls
-bpf_loop/check_non_constant_callback
-bpf_loop/check_nr_loops
-bpf_loop/check_null_callback_ctx
-bpf_loop/check_stack
-bpf_mod_race                                     # bpf_mod_kfunc_race__attach unexpected error: -524 (errno 524)
-bpf_tcp_ca/dctcp_fallback
-btf_dump/btf_dump: var_data                      # find type id unexpected find type id: actual -2 < expected 0
-cgroup_hierarchical_stats                        # attach unexpected error: -524 (errno 524)
-d_path/basic                                     # setup attach failed: -524
-deny_namespace                                   # attach unexpected error: -524 (errno 524)
-fentry_fexit                                     # fentry_attach unexpected error: -1 (errno 524)
-fentry_test                                      # fentry_attach unexpected error: -1 (errno 524)
-fexit_sleep                                      # fexit_attach fexit attach failed: -1
-fexit_stress                                     # fexit attach unexpected fexit attach: actual -524 < expected 0
-fexit_test                                       # fexit_attach unexpected error: -1 (errno 524)
-get_func_args_test                               # get_func_args_test__attach unexpected error: -524 (errno 524) (trampoline)
-get_func_ip_test                                 # get_func_ip_test__attach unexpected error: -524 (errno 524) (trampoline)
-htab_update/reenter_update
-kfree_skb                                        # attach fentry unexpected error: -524 (trampoline)
-kfunc_call/subprog                               # extern (var ksym) 'bpf_prog_active': not found in kernel BTF
-kfunc_call/subprog_lskel                         # skel unexpected error: -2
-kfunc_dynptr_param/dynptr_data_null              # libbpf: prog 'dynptr_data_null': failed to attach: ERROR: strerror_r(-524)=22
+bpf_cookie/multi_kprobe_attach_api               # kprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3
+bpf_cookie/multi_kprobe_link_api                 # kprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3
 kprobe_multi_bench_attach                        # bpf_program__attach_kprobe_multi_opts unexpected error: -95
 kprobe_multi_test/attach_api_addrs               # bpf_program__attach_kprobe_multi_opts unexpected error: -95
 kprobe_multi_test/attach_api_pattern             # bpf_program__attach_kprobe_multi_opts unexpected error: -95
@@ -35,50 +7,6 @@ kprobe_multi_test/attach_api_syms                # bpf_program__attach_kprobe_mu
 kprobe_multi_test/bench_attach                   # bpf_program__attach_kprobe_multi_opts unexpected error: -95
 kprobe_multi_test/link_api_addrs                 # link_fd unexpected link_fd: actual -95 < expected 0
 kprobe_multi_test/link_api_syms                  # link_fd unexpected link_fd: actual -95 < expected 0
-kprobe_multi_test/skel_api                       # kprobe_multi__attach unexpected error: -524 (errno 524)
-ksyms_module/libbpf                              # 'bpf_testmod_ksym_percpu': not found in kernel BTF
-ksyms_module/lskel                               # test_ksyms_module_lskel__open_and_load unexpected error: -2
-libbpf_get_fd_by_id_opts                         # test_libbpf_get_fd_by_id_opts__attach unexpected error: -524 (errno 524)
-linked_list
-lookup_key                                       # test_lookup_key__attach unexpected error: -524 (errno 524)
-lru_bug                                          # lru_bug__attach unexpected error: -524 (errno 524)
-modify_return                                    # modify_return__attach failed unexpected error: -524 (errno 524)
-module_attach                                    # skel_attach skeleton attach failed: -524
-mptcp/base                                       # run_test mptcp unexpected error: -524 (errno 524)
-netcnt                                           # packets unexpected packets: actual 10001 != expected 10000
-rcu_read_lock                                    # failed to attach: ERROR: strerror_r(-524)=22
-recursion                                        # skel_attach unexpected error: -524 (errno 524)
-ringbuf                                          # skel_attach skeleton attachment failed: -1
-setget_sockopt                                   # attach_cgroup unexpected error: -524
-sk_storage_tracing                               # test_sk_storage_tracing__attach unexpected error: -524 (errno 524)
-skc_to_unix_sock                                 # could not attach BPF object unexpected error: -524 (errno 524)
-socket_cookie                                    # prog_attach unexpected error: -524
-stacktrace_build_id                              # compare_stack_ips stackmap vs. stack_amap err -1 errno 2
-task_local_storage/exit_creds                    # skel_attach unexpected error: -524 (errno 524)
-task_local_storage/recursion                     # skel_attach unexpected error: -524 (errno 524)
-test_bprm_opts                                   # attach attach failed: -524
-test_ima                                         # attach attach failed: -524
-test_local_storage                               # attach lsm attach failed: -524
-test_lsm                                         # test_lsm_first_attach unexpected error: -524 (errno 524)
-test_overhead                                    # attach_fentry unexpected error: -524
-timer                                            # timer unexpected error: -524 (errno 524)
-timer_crash                                      # timer_crash__attach unexpected error: -524 (errno 524)
-timer_mim                                        # timer_mim unexpected error: -524 (errno 524)
-trace_printk                                     # trace_printk__attach unexpected error: -1 (errno 524)
-trace_vprintk                                    # trace_vprintk__attach unexpected error: -1 (errno 524)
-tracing_struct                                   # tracing_struct__attach unexpected error: -524 (errno 524)
-trampoline_count                                 # attach_prog unexpected error: -524
-unpriv_bpf_disabled                              # skel_attach unexpected error: -524 (errno 524)
-user_ringbuf/test_user_ringbuf_post_misaligned   # misaligned_skel unexpected error: -524 (errno 524)
-user_ringbuf/test_user_ringbuf_post_producer_wrong_offset
-user_ringbuf/test_user_ringbuf_post_larger_than_ringbuf_sz
-user_ringbuf/test_user_ringbuf_basic             # ringbuf_basic_skel unexpected error: -524 (errno 524)
-user_ringbuf/test_user_ringbuf_sample_full_ring_buffer
-user_ringbuf/test_user_ringbuf_post_alignment_autoadjust
-user_ringbuf/test_user_ringbuf_overfill
-user_ringbuf/test_user_ringbuf_discards_properly_ignored
-user_ringbuf/test_user_ringbuf_loop
-user_ringbuf/test_user_ringbuf_msg_protocol
-user_ringbuf/test_user_ringbuf_blocking_reserve
-verify_pkcs7_sig                                 # test_verify_pkcs7_sig__attach unexpected error: -524 (errno 524)
-vmlinux                                          # skel_attach skeleton attach failed: -524
+kprobe_multi_test/skel_api                       # libbpf: failed to load BPF skeleton 'kprobe_multi': -3
+module_attach                                    # prog 'kprobe_multi': failed to auto-attach: -95
+tracing_struct                                   # tracing_struct__attach unexpected error: -524 (errno 524)
\ No newline at end of file
-- 
2.39.1.519.gcb327c4b5f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 01/10] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  2023-02-07 18:21 ` [PATCH v2 01/10] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi Florent Revest
@ 2023-03-15 23:33   ` Steven Rostedt
  2023-03-16 15:40     ` Florent Revest
  0 siblings, 1 reply; 19+ messages in thread
From: Steven Rostedt @ 2023-03-15 23:33 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, mhiramat, mark.rutland, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1

On Tue,  7 Feb 2023 19:21:26 +0100
Florent Revest <revest@chromium.org> wrote:

> The _multi API requires that users keep their own ops but can enforce
> that an op is only associated to one direct call.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> Tested-by: Mark Rutland <mark.rutland@arm.com>
> ---
>  kernel/trace/trace_selftest.c         | 11 +++++++----
>  samples/ftrace/ftrace-direct-modify.c | 12 ++++++++----
>  samples/ftrace/ftrace-direct-too.c    | 12 +++++++-----
>  samples/ftrace/ftrace-direct.c        | 12 +++++++-----
>  4 files changed, 29 insertions(+), 18 deletions(-)
> 
> diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
> index ff0536cea968..57221f69a33b 100644
> --- a/kernel/trace/trace_selftest.c
> +++ b/kernel/trace/trace_selftest.c
> @@ -806,6 +806,9 @@ trace_selftest_startup_function_graph(struct tracer *trace,
>  	int ret;
>  	unsigned long count;
>  	char *func_name __maybe_unused;
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +	struct ftrace_ops direct = {};

Make this static to the file and move it above to where there's already an
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS block.

Less #ifdef is better, and also, I don't like having as an example an
ftrace_ops structure allocated on the stack. It can become increasingly
larger as time goes by. I don't want to encourage placing it on stacks.

> +#endif
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE
>  	if (ftrace_filter_param) {
> @@ -870,8 +873,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
>  	 * Register direct function together with graph tracer
>  	 * and make sure we get graph trace.
>  	 */
> -	ret = register_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
> -				     (unsigned long) trace_direct_tramp);
> +	ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
> +	ret = register_ftrace_direct_multi(&direct, (unsigned long)trace_direct_tramp);
>  	if (ret)
>  		goto out;

I had to rebase this code on top of the latest tree because of updates.
Which is good, because this patch requires those same. When using
ftrace_set_filter_ip() one needs to call

   ftrace_free_filter(&direct);

Because the ftrace_set_filter_ip() allocates hashs on the ftrace_ops and
those need to be freed.


>  
> @@ -891,8 +894,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
>  
>  	unregister_ftrace_graph(&fgraph_ops);
>  
> -	ret = unregister_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
> -				       (unsigned long) trace_direct_tramp);
> +	ret = unregister_ftrace_direct_multi(&direct,
> +					     (unsigned long) trace_direct_tramp);
>  	if (ret)
>  		goto out;
>  
> diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
> index de5a0f67f320..ecd76f75cb80 100644
> --- a/samples/ftrace/ftrace-direct-modify.c
> +++ b/samples/ftrace/ftrace-direct-modify.c
> @@ -96,6 +96,8 @@ asm (
>  
>  #endif /* CONFIG_S390 */
>  
> +static struct ftrace_ops direct;
> +
>  static unsigned long my_tramp = (unsigned long)my_tramp1;
>  static unsigned long tramps[2] = {
>  	(unsigned long)my_tramp1,
> @@ -114,7 +116,7 @@ static int simple_thread(void *arg)
>  		if (ret)
>  			continue;
>  		t ^= 1;
> -		ret = modify_ftrace_direct(my_ip, my_tramp, tramps[t]);
> +		ret = modify_ftrace_direct_multi(&direct, tramps[t]);
>  		if (!ret)
>  			my_tramp = tramps[t];
>  		WARN_ON_ONCE(ret);
> @@ -129,7 +131,9 @@ static int __init ftrace_direct_init(void)
>  {
>  	int ret;
>  
> -	ret = register_ftrace_direct(my_ip, my_tramp);
> +	ftrace_set_filter_ip(&direct, (unsigned long) my_ip, 0, 0);
> +	ret = register_ftrace_direct_multi(&direct, my_tramp);
> +
>  	if (!ret)
>  		simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
>  	return ret;
> @@ -138,12 +142,12 @@ static int __init ftrace_direct_init(void)
>  static void __exit ftrace_direct_exit(void)
>  {
>  	kthread_stop(simple_tsk);
> -	unregister_ftrace_direct(my_ip, my_tramp);
> +	unregister_ftrace_direct_multi(&direct, my_tramp);

Same here.

>  }
>  
>  module_init(ftrace_direct_init);
>  module_exit(ftrace_direct_exit);
>  
>  MODULE_AUTHOR("Steven Rostedt");
> -MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
> +MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
>  MODULE_LICENSE("GPL");
> diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
> index e13fb59a2b47..0e907092e2c0 100644
> --- a/samples/ftrace/ftrace-direct-too.c
> +++ b/samples/ftrace/ftrace-direct-too.c
> @@ -70,21 +70,23 @@ asm (
>  
>  #endif /* CONFIG_S390 */
>  
> +static struct ftrace_ops direct;
> +
>  static int __init ftrace_direct_init(void)
>  {
> -	return register_ftrace_direct((unsigned long)handle_mm_fault,
> -				     (unsigned long)my_tramp);
> +	ftrace_set_filter_ip(&direct, (unsigned long) handle_mm_fault, 0, 0);
> +
> +	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
>  }
>  
>  static void __exit ftrace_direct_exit(void)
>  {
> -	unregister_ftrace_direct((unsigned long)handle_mm_fault,
> -				 (unsigned long)my_tramp);
> +	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);

And here.

>  }
>  
>  module_init(ftrace_direct_init);
>  module_exit(ftrace_direct_exit);
>  
>  MODULE_AUTHOR("Steven Rostedt");
> -MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct()");
> +MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct_multi()");
>  MODULE_LICENSE("GPL");
> diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
> index 1f769d0db20f..e446c38f6b58 100644
> --- a/samples/ftrace/ftrace-direct.c
> +++ b/samples/ftrace/ftrace-direct.c
> @@ -63,21 +63,23 @@ asm (
>  
>  #endif /* CONFIG_S390 */
>  
> +static struct ftrace_ops direct;
> +
>  static int __init ftrace_direct_init(void)
>  {
> -	return register_ftrace_direct((unsigned long)wake_up_process,
> -				     (unsigned long)my_tramp);
> +	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
> +
> +	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
>  }
>  
>  static void __exit ftrace_direct_exit(void)
>  {
> -	unregister_ftrace_direct((unsigned long)wake_up_process,
> -				 (unsigned long)my_tramp);
> +	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);

And here.

Maybe we could add a parameter to unregister_ftrace_direct_multi():

  bool free_filters

when set, it will call the ftrace_free_filter(), as it looks like there's a
common code path here:

	ftrace_set_filter_ip();
	register_ftrace_direct_multi();
	[..]
	unregister_ftrace_direct_multi();
	ftrace_free_filter();

Add the option will save people from having to do that last step, and also
remind people that the filters need to be freed.

-- Steve

>  }
>  
>  module_init(ftrace_direct_init);
>  module_exit(ftrace_direct_exit);
>  
>  MODULE_AUTHOR("Steven Rostedt");
> -MODULE_DESCRIPTION("Example use case of using register_ftrace_direct()");
> +MODULE_DESCRIPTION("Example use case of using register_ftrace_direct_multi()");
>  MODULE_LICENSE("GPL");


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 04/10] ftrace: Store direct called addresses in their ops
  2023-02-07 18:21 ` [PATCH v2 04/10] ftrace: Store direct called addresses in their ops Florent Revest
@ 2023-03-15 23:43   ` Steven Rostedt
  2023-03-16 15:40     ` Florent Revest
  0 siblings, 1 reply; 19+ messages in thread
From: Steven Rostedt @ 2023-03-15 23:43 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, mhiramat, mark.rutland, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1

On Tue,  7 Feb 2023 19:21:29 +0100
Florent Revest <revest@chromium.org> wrote:

> @@ -5445,6 +5445,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  	/* Enable the tmp_ops to have the same functions as the direct ops */
>  	ftrace_ops_init(&tmp_ops);
>  	tmp_ops.func_hash = ops->func_hash;
> +	tmp_ops.direct_call = addr;
>  
>  	err = register_ftrace_function_nolock(&tmp_ops);
>  	if (err)
> @@ -5466,6 +5467,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  			entry->direct = addr;
>  		}
>  	}
> +	WRITE_ONCE(ops->direct_call, addr);

I'm curious about the use of WRITE_ONCE(). It should not go outside the
mutex barrier.

-- Steve

>  
>  	mutex_unlock(&ftrace_lock);
>  

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 06/10] ftrace: selftest: remove broken trace_direct_tramp
  2023-02-07 18:21 ` [PATCH v2 06/10] ftrace: selftest: remove broken trace_direct_tramp Florent Revest
@ 2023-03-15 23:51   ` Steven Rostedt
  2023-03-16 15:41     ` Florent Revest
  0 siblings, 1 reply; 19+ messages in thread
From: Steven Rostedt @ 2023-03-15 23:51 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, mhiramat, mark.rutland, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1, Xu Kuohai

On Tue,  7 Feb 2023 19:21:31 +0100
Florent Revest <revest@chromium.org> wrote:

> From: Mark Rutland <mark.rutland@arm.com>
> 
> The ftrace selftest code has a trace_direct_tramp() function which it
> uses as a direct call trampoline. This happens to work on x86, since the
> direct call's return address is in the usual place, and can be returned
> to via a RET, but in general the calling convention for direct calls is
> different from regular function calls, and requires a trampoline written
> in assembly.
> 
> On s390, regular function calls place the return address in %r14, and an
> ftrace patch-site in an instrumented function places the trampoline's
> return address (which is within the instrumented function) in %r0,
> preserving the original %r14 value in-place. As a regular C function
> will return to the address in %r14, using a C function as the trampoline
> results in the trampoline returning to the caller of the instrumented
> function, skipping the body of the instrumented function.
> 
> Note that the s390 issue is not detcted by the ftrace selftest code, as
> the instrumented function is trivial, and returning back into the caller
> happens to be equivalent.
> 
> On arm64, regular function calls place the return address in x30, and
> an ftrace patch-site in an instrumented function saves this into r9
> and places the trampoline's return address (within the instrumented
> function) in x30. A regular C function will return to the address in
> x30, but will not restore x9 into x30. Consequently, using a C function
> as the trampoline results in returning to the trampoline's return
> address having corrupted x30, such that when the instrumented function
> returns, it will return back into itself.
> 
> To avoid future issues in this area, remove the trace_direct_tramp()
> function, and require that each architecture with direct calls provides
> a stub trampoline, named ftrace_stub_direct_tramp. This can be written
> to handle the architecture's trampoline calling convention, and in
> future could be used elsewhere (e.g. in the ftrace ops sample, to
> measure the overhead of direct calls), so we may as well always build it
> in.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Li Huafei <lihuafei1@huawei.com>
> Cc: Xu Kuohai <xukuohai@huawei.com>
> Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
> Cc: Florent Revest <revest@chromium.org>
> Signed-off-by: Florent Revest <revest@chromium.org>


Care to respin with my update requests? I can take up to this patch and
base it directly on v6.3-rc3 when it comes out. I'm expecting that to have
the fixes in other code that is breaking my tests.

Then I'll push it out after it passes all my tests, and you can take it
and add the arm64 specific bits on top. I'm currently running these patches
as is on my tests to see if they fail (with a patched kernel for the other
code that's breaking my tests).

Does that sound OK?

-- Steve

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 01/10] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  2023-03-15 23:33   ` Steven Rostedt
@ 2023-03-16 15:40     ` Florent Revest
  0 siblings, 0 replies; 19+ messages in thread
From: Florent Revest @ 2023-03-16 15:40 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, mhiramat, mark.rutland, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1

On Thu, Mar 16, 2023 at 12:33 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue,  7 Feb 2023 19:21:26 +0100
> Florent Revest <revest@chromium.org> wrote:
>
> > The _multi API requires that users keep their own ops but can enforce
> > that an op is only associated to one direct call.
> >
> > Signed-off-by: Florent Revest <revest@chromium.org>
> > Acked-by: Mark Rutland <mark.rutland@arm.com>
> > Tested-by: Mark Rutland <mark.rutland@arm.com>
> > ---
> >  kernel/trace/trace_selftest.c         | 11 +++++++----
> >  samples/ftrace/ftrace-direct-modify.c | 12 ++++++++----
> >  samples/ftrace/ftrace-direct-too.c    | 12 +++++++-----
> >  samples/ftrace/ftrace-direct.c        | 12 +++++++-----
> >  4 files changed, 29 insertions(+), 18 deletions(-)
> >
> > diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
> > index ff0536cea968..57221f69a33b 100644
> > --- a/kernel/trace/trace_selftest.c
> > +++ b/kernel/trace/trace_selftest.c
> > @@ -806,6 +806,9 @@ trace_selftest_startup_function_graph(struct tracer *trace,
> >       int ret;
> >       unsigned long count;
> >       char *func_name __maybe_unused;
> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> > +     struct ftrace_ops direct = {};
>
> Make this static to the file and move it above to where there's already an
> #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS block.
>
> Less #ifdef is better, and also, I don't like having as an example an
> ftrace_ops structure allocated on the stack. It can become increasingly
> larger as time goes by. I don't want to encourage placing it on stacks.

Agreed.

> > +#endif
> >
> >  #ifdef CONFIG_DYNAMIC_FTRACE
> >       if (ftrace_filter_param) {
> > @@ -870,8 +873,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
> >        * Register direct function together with graph tracer
> >        * and make sure we get graph trace.
> >        */
> > -     ret = register_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
> > -                                  (unsigned long) trace_direct_tramp);
> > +     ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
> > +     ret = register_ftrace_direct_multi(&direct, (unsigned long)trace_direct_tramp);
> >       if (ret)
> >               goto out;
>
> I had to rebase this code on top of the latest tree because of updates.
> Which is good, because this patch requires those same. When using
> ftrace_set_filter_ip() one needs to call
>
>    ftrace_free_filter(&direct);
>
> Because the ftrace_set_filter_ip() allocates hashs on the ftrace_ops and
> those need to be freed.

Ah yes, good catch.

>
> >
> > @@ -891,8 +894,8 @@ trace_selftest_startup_function_graph(struct tracer *trace,
> >
> >       unregister_ftrace_graph(&fgraph_ops);
> >
> > -     ret = unregister_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
> > -                                    (unsigned long) trace_direct_tramp);
> > +     ret = unregister_ftrace_direct_multi(&direct,
> > +                                          (unsigned long) trace_direct_tramp);
> >       if (ret)
> >               goto out;
> >
> > diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
> > index de5a0f67f320..ecd76f75cb80 100644
> > --- a/samples/ftrace/ftrace-direct-modify.c
> > +++ b/samples/ftrace/ftrace-direct-modify.c
> > @@ -96,6 +96,8 @@ asm (
> >
> >  #endif /* CONFIG_S390 */
> >
> > +static struct ftrace_ops direct;
> > +
> >  static unsigned long my_tramp = (unsigned long)my_tramp1;
> >  static unsigned long tramps[2] = {
> >       (unsigned long)my_tramp1,
> > @@ -114,7 +116,7 @@ static int simple_thread(void *arg)
> >               if (ret)
> >                       continue;
> >               t ^= 1;
> > -             ret = modify_ftrace_direct(my_ip, my_tramp, tramps[t]);
> > +             ret = modify_ftrace_direct_multi(&direct, tramps[t]);
> >               if (!ret)
> >                       my_tramp = tramps[t];
> >               WARN_ON_ONCE(ret);
> > @@ -129,7 +131,9 @@ static int __init ftrace_direct_init(void)
> >  {
> >       int ret;
> >
> > -     ret = register_ftrace_direct(my_ip, my_tramp);
> > +     ftrace_set_filter_ip(&direct, (unsigned long) my_ip, 0, 0);
> > +     ret = register_ftrace_direct_multi(&direct, my_tramp);
> > +
> >       if (!ret)
> >               simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
> >       return ret;
> > @@ -138,12 +142,12 @@ static int __init ftrace_direct_init(void)
> >  static void __exit ftrace_direct_exit(void)
> >  {
> >       kthread_stop(simple_tsk);
> > -     unregister_ftrace_direct(my_ip, my_tramp);
> > +     unregister_ftrace_direct_multi(&direct, my_tramp);
>
> Same here.
>
> >  }
> >
> >  module_init(ftrace_direct_init);
> >  module_exit(ftrace_direct_exit);
> >
> >  MODULE_AUTHOR("Steven Rostedt");
> > -MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
> > +MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
> >  MODULE_LICENSE("GPL");
> > diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
> > index e13fb59a2b47..0e907092e2c0 100644
> > --- a/samples/ftrace/ftrace-direct-too.c
> > +++ b/samples/ftrace/ftrace-direct-too.c
> > @@ -70,21 +70,23 @@ asm (
> >
> >  #endif /* CONFIG_S390 */
> >
> > +static struct ftrace_ops direct;
> > +
> >  static int __init ftrace_direct_init(void)
> >  {
> > -     return register_ftrace_direct((unsigned long)handle_mm_fault,
> > -                                  (unsigned long)my_tramp);
> > +     ftrace_set_filter_ip(&direct, (unsigned long) handle_mm_fault, 0, 0);
> > +
> > +     return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
> >  }
> >
> >  static void __exit ftrace_direct_exit(void)
> >  {
> > -     unregister_ftrace_direct((unsigned long)handle_mm_fault,
> > -                              (unsigned long)my_tramp);
> > +     unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
>
> And here.
>
> >  }
> >
> >  module_init(ftrace_direct_init);
> >  module_exit(ftrace_direct_exit);
> >
> >  MODULE_AUTHOR("Steven Rostedt");
> > -MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct()");
> > +MODULE_DESCRIPTION("Another example use case of using register_ftrace_direct_multi()");
> >  MODULE_LICENSE("GPL");
> > diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
> > index 1f769d0db20f..e446c38f6b58 100644
> > --- a/samples/ftrace/ftrace-direct.c
> > +++ b/samples/ftrace/ftrace-direct.c
> > @@ -63,21 +63,23 @@ asm (
> >
> >  #endif /* CONFIG_S390 */
> >
> > +static struct ftrace_ops direct;
> > +
> >  static int __init ftrace_direct_init(void)
> >  {
> > -     return register_ftrace_direct((unsigned long)wake_up_process,
> > -                                  (unsigned long)my_tramp);
> > +     ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
> > +
> > +     return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
> >  }
> >
> >  static void __exit ftrace_direct_exit(void)
> >  {
> > -     unregister_ftrace_direct((unsigned long)wake_up_process,
> > -                              (unsigned long)my_tramp);
> > +     unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
>
> And here.
>
> Maybe we could add a parameter to unregister_ftrace_direct_multi():
>
>   bool free_filters
>
> when set, it will call the ftrace_free_filter(), as it looks like there's a
> common code path here:
>
>         ftrace_set_filter_ip();
>         register_ftrace_direct_multi();
>         [..]
>         unregister_ftrace_direct_multi();
>         ftrace_free_filter();
>
> Add the option will save people from having to do that last step, and also
> remind people that the filters need to be freed.

I agree, it's otherwise quite easy to miss the ftrace_free_filter. :)
I'll add a patch introducing this "free_filters" arg in the new subset
series that I'll send you directly.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 04/10] ftrace: Store direct called addresses in their ops
  2023-03-15 23:43   ` Steven Rostedt
@ 2023-03-16 15:40     ` Florent Revest
  2023-03-16 15:45       ` Steven Rostedt
  0 siblings, 1 reply; 19+ messages in thread
From: Florent Revest @ 2023-03-16 15:40 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, mhiramat, mark.rutland, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1

On Thu, Mar 16, 2023 at 12:43 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue,  7 Feb 2023 19:21:29 +0100
> Florent Revest <revest@chromium.org> wrote:
>
> > @@ -5445,6 +5445,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
> >       /* Enable the tmp_ops to have the same functions as the direct ops */
> >       ftrace_ops_init(&tmp_ops);
> >       tmp_ops.func_hash = ops->func_hash;
> > +     tmp_ops.direct_call = addr;
> >
> >       err = register_ftrace_function_nolock(&tmp_ops);
> >       if (err)
> > @@ -5466,6 +5467,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
> >                       entry->direct = addr;
> >               }
> >       }
> > +     WRITE_ONCE(ops->direct_call, addr);
>
> I'm curious about the use of WRITE_ONCE(). It should not go outside the
> mutex barrier.

This WRITE_ONCE was originally suggested by Mark here:
https://lore.kernel.org/all/Y9vW99htjOphDXqY@FVFF77S0Q05N.cambridge.arm.com/#t

My understanding is that it's not so much about avoiding re-ordering
but rather about avoiding store tearing since a ftrace_caller
trampoline could concurrently read ops->direct_call. Does that make
sense ?

> -- Steve
>
> >
> >       mutex_unlock(&ftrace_lock);
> >

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 06/10] ftrace: selftest: remove broken trace_direct_tramp
  2023-03-15 23:51   ` Steven Rostedt
@ 2023-03-16 15:41     ` Florent Revest
  0 siblings, 0 replies; 19+ messages in thread
From: Florent Revest @ 2023-03-16 15:41 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, mhiramat, mark.rutland, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1, Xu Kuohai

On Thu, Mar 16, 2023 at 12:51 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue,  7 Feb 2023 19:21:31 +0100
> Florent Revest <revest@chromium.org> wrote:
>
> > From: Mark Rutland <mark.rutland@arm.com>
> >
> > The ftrace selftest code has a trace_direct_tramp() function which it
> > uses as a direct call trampoline. This happens to work on x86, since the
> > direct call's return address is in the usual place, and can be returned
> > to via a RET, but in general the calling convention for direct calls is
> > different from regular function calls, and requires a trampoline written
> > in assembly.
> >
> > On s390, regular function calls place the return address in %r14, and an
> > ftrace patch-site in an instrumented function places the trampoline's
> > return address (which is within the instrumented function) in %r0,
> > preserving the original %r14 value in-place. As a regular C function
> > will return to the address in %r14, using a C function as the trampoline
> > results in the trampoline returning to the caller of the instrumented
> > function, skipping the body of the instrumented function.
> >
> > Note that the s390 issue is not detcted by the ftrace selftest code, as
> > the instrumented function is trivial, and returning back into the caller
> > happens to be equivalent.
> >
> > On arm64, regular function calls place the return address in x30, and
> > an ftrace patch-site in an instrumented function saves this into r9
> > and places the trampoline's return address (within the instrumented
> > function) in x30. A regular C function will return to the address in
> > x30, but will not restore x9 into x30. Consequently, using a C function
> > as the trampoline results in returning to the trampoline's return
> > address having corrupted x30, such that when the instrumented function
> > returns, it will return back into itself.
> >
> > To avoid future issues in this area, remove the trace_direct_tramp()
> > function, and require that each architecture with direct calls provides
> > a stub trampoline, named ftrace_stub_direct_tramp. This can be written
> > to handle the architecture's trampoline calling convention, and in
> > future could be used elsewhere (e.g. in the ftrace ops sample, to
> > measure the overhead of direct calls), so we may as well always build it
> > in.
> >
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Li Huafei <lihuafei1@huawei.com>
> > Cc: Xu Kuohai <xukuohai@huawei.com>
> > Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
> > Cc: Florent Revest <revest@chromium.org>
> > Signed-off-by: Florent Revest <revest@chromium.org>
>
>
> Care to respin with my update requests? I can take up to this patch and
> base it directly on v6.3-rc3 when it comes out. I'm expecting that to have
> the fixes in other code that is breaking my tests.

Okay! :) I'll send you a subset of this series (first 6 patches with
your review addressed and rebased on v6.3-rc2 for now).

> Then I'll push it out after it passes all my tests, and you can take it
> and add the arm64 specific bits on top. I'm currently running these patches
> as is on my tests to see if they fail (with a patched kernel for the other
> code that's breaking my tests).
>
> Does that sound OK?

Sounds good to me, yes!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 04/10] ftrace: Store direct called addresses in their ops
  2023-03-16 15:40     ` Florent Revest
@ 2023-03-16 15:45       ` Steven Rostedt
  2023-03-16 16:15         ` Florent Revest
  0 siblings, 1 reply; 19+ messages in thread
From: Steven Rostedt @ 2023-03-16 15:45 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, mhiramat, mark.rutland, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1

On Thu, 16 Mar 2023 16:40:48 +0100
Florent Revest <revest@chromium.org> wrote:

> > > @@ -5466,6 +5467,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
> > >                       entry->direct = addr;
> > >               }
> > >       }
> > > +     WRITE_ONCE(ops->direct_call, addr);  
> >
> > I'm curious about the use of WRITE_ONCE(). It should not go outside the
> > mutex barrier.  
> 
> This WRITE_ONCE was originally suggested by Mark here:
> https://lore.kernel.org/all/Y9vW99htjOphDXqY@FVFF77S0Q05N.cambridge.arm.com/#t
> 
> My understanding is that it's not so much about avoiding re-ordering
> but rather about avoiding store tearing since a ftrace_caller
> trampoline could concurrently read ops->direct_call. Does that make
> sense ?

Yes, but a comment needs to be added:

     /* Prevent store tearing on some archs */
     WRITE_ONCE(ops->direct_call, addr);  

Or something to that affect. Otherwise I can see it confusing others in the
future. And probably me too, as I'll forget why it was a WRITE_ONCE() by
next month. ;-)

-- Steve

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 04/10] ftrace: Store direct called addresses in their ops
  2023-03-16 15:45       ` Steven Rostedt
@ 2023-03-16 16:15         ` Florent Revest
  0 siblings, 0 replies; 19+ messages in thread
From: Florent Revest @ 2023-03-16 16:15 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, mhiramat, mark.rutland, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1

On Thu, Mar 16, 2023 at 4:45 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Thu, 16 Mar 2023 16:40:48 +0100
> Florent Revest <revest@chromium.org> wrote:
>
> > > > @@ -5466,6 +5467,7 @@ __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
> > > >                       entry->direct = addr;
> > > >               }
> > > >       }
> > > > +     WRITE_ONCE(ops->direct_call, addr);
> > >
> > > I'm curious about the use of WRITE_ONCE(). It should not go outside the
> > > mutex barrier.
> >
> > This WRITE_ONCE was originally suggested by Mark here:
> > https://lore.kernel.org/all/Y9vW99htjOphDXqY@FVFF77S0Q05N.cambridge.arm.com/#t
> >
> > My understanding is that it's not so much about avoiding re-ordering
> > but rather about avoiding store tearing since a ftrace_caller
> > trampoline could concurrently read ops->direct_call. Does that make
> > sense ?
>
> Yes, but a comment needs to be added:
>
>      /* Prevent store tearing on some archs */
>      WRITE_ONCE(ops->direct_call, addr);
>
> Or something to that affect. Otherwise I can see it confusing others in the
> future. And probably me too, as I'll forget why it was a WRITE_ONCE() by
> next month. ;-)

Definitely :) I was myself confused after a few weeks of adding it so
I'll add a clarifying comment. Thanks!

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2023-03-16 16:16 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-07 18:21 [PATCH v2 00/10] Add ftrace direct call for arm64 Florent Revest
2023-02-07 18:21 ` [PATCH v2 01/10] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi Florent Revest
2023-03-15 23:33   ` Steven Rostedt
2023-03-16 15:40     ` Florent Revest
2023-02-07 18:21 ` [PATCH v2 02/10] ftrace: Remove the legacy _ftrace_direct API Florent Revest
2023-02-07 18:21 ` [PATCH v2 03/10] ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs Florent Revest
2023-02-07 18:21 ` [PATCH v2 04/10] ftrace: Store direct called addresses in their ops Florent Revest
2023-03-15 23:43   ` Steven Rostedt
2023-03-16 15:40     ` Florent Revest
2023-03-16 15:45       ` Steven Rostedt
2023-03-16 16:15         ` Florent Revest
2023-02-07 18:21 ` [PATCH v2 05/10] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS Florent Revest
2023-02-07 18:21 ` [PATCH v2 06/10] ftrace: selftest: remove broken trace_direct_tramp Florent Revest
2023-03-15 23:51   ` Steven Rostedt
2023-03-16 15:41     ` Florent Revest
2023-02-07 18:21 ` [PATCH v2 07/10] arm64: ftrace: Add direct call support Florent Revest
2023-02-07 18:21 ` [PATCH v2 08/10] arm64: ftrace: Simplify get_ftrace_plt Florent Revest
2023-02-07 18:21 ` [PATCH v2 09/10] arm64: ftrace: Add direct call trampoline samples support Florent Revest
2023-02-07 18:21 ` [PATCH v2 10/10] selftests/bpf: Update the tests deny list on aarch64 Florent Revest

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).