linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* aarch64 boot time increases with rt
@ 2020-07-13 14:56 Itai Handler
  2020-07-14  1:20 ` Steven Rostedt
  0 siblings, 1 reply; 6+ messages in thread
From: Itai Handler @ 2020-07-13 14:56 UTC (permalink / raw)
  To: linux-rt-users; +Cc: Steven Rostedt

Hi,

I observed that aarch64 boot time increases considerably with rt
(PREEMPT_RT_FULL=y).
This is observable also in latest 4.19 rt (4.19.127-rt55).
Following is a fragment from kernel log (with rt) which shows that
bringing up 4 CPUs takes about 210 ms:

[    0.120951] smp: Bringing up secondary CPUs ...
[    0.184860] Detected VIPT I-cache on CPU1
[    0.187344] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
[    0.262931] Detected VIPT I-cache on CPU2
[    0.263390] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
[    0.329296] Detected VIPT I-cache on CPU3
[    0.329872] CPU3: Booted secondary processor 0x0000000003 [0x410fd034]
[    0.331557] smp: Brought up 1 node, 4 CPUs

Note that these times are from qemu, however the problem is also
noticeable on real hardware.

I traced the code and found that the slow down is caused in
workqueue_prepare_cpu().
This function creates the worker threads.

On real hardware, it takes about 8 ms to create each worker thread.
Before digging into it in more detail, I would like to hear your opinion.

Steven, what ftrace command line parameters do you recommend to use in
this case?

Thank you,
Itai Handler

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: aarch64 boot time increases with rt
  2020-07-13 14:56 aarch64 boot time increases with rt Itai Handler
@ 2020-07-14  1:20 ` Steven Rostedt
  2020-07-14 12:18   ` Itai Handler
  0 siblings, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2020-07-14  1:20 UTC (permalink / raw)
  To: Itai Handler; +Cc: linux-rt-users

On Mon, 13 Jul 2020 17:56:05 +0300
Itai Handler <itai.handler@gmail.com> wrote:

> Hi,
> 
> I observed that aarch64 boot time increases considerably with rt
> (PREEMPT_RT_FULL=y).

You mean compared to non PREEMPT_RT_FULL of the same kernel?

> This is observable also in latest 4.19 rt (4.19.127-rt55).
> Following is a fragment from kernel log (with rt) which shows that
> bringing up 4 CPUs takes about 210 ms:

I'd suggest trying 5.4 or even 5.6-rt.

> 
> [    0.120951] smp: Bringing up secondary CPUs ...
> [    0.184860] Detected VIPT I-cache on CPU1
> [    0.187344] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
> [    0.262931] Detected VIPT I-cache on CPU2
> [    0.263390] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
> [    0.329296] Detected VIPT I-cache on CPU3
> [    0.329872] CPU3: Booted secondary processor 0x0000000003 [0x410fd034]
> [    0.331557] smp: Brought up 1 node, 4 CPUs
> 
> Note that these times are from qemu, however the problem is also
> noticeable on real hardware.
> 
> I traced the code and found that the slow down is caused in
> workqueue_prepare_cpu().
> This function creates the worker threads.
> 
> On real hardware, it takes about 8 ms to create each worker thread.
> Before digging into it in more detail, I would like to hear your opinion.
> 
> Steven, what ftrace command line parameters do you recommend to use in
> this case?

Well, you could try: ftrace=function_graph ftrace_graph_filter workqueue_prepare_cpu

If workqueue_prepare_cpu is a traceable function (not inlined nor notrace set).

-- Steve


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: aarch64 boot time increases with rt
  2020-07-14  1:20 ` Steven Rostedt
@ 2020-07-14 12:18   ` Itai Handler
  2020-07-14 14:34     ` Steven Rostedt
  2020-07-15 19:36     ` Steven Rostedt
  0 siblings, 2 replies; 6+ messages in thread
From: Itai Handler @ 2020-07-14 12:18 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-rt-users

Hi Steven,

On Tue, Jul 14, 2020 at 4:20 AM Steven Rostedt <rostedt@goodmis.org> wrote:

> You mean compared to non PREEMPT_RT_FULL of the same kernel?

Yes.

> I'd suggest trying 5.4 or even 5.6-rt.

I tried 5.4.47-rt28, and the problem still persists.

> Well, you could try: ftrace=function_graph ftrace_graph_filter workqueue_prepare_cpu
>
> If workqueue_prepare_cpu is a traceable function (not inlined nor notrace set).

I tried "ftrace=function_graph ftrace_graph_filter=workqueue_prepare_cpu",
however, the trace appears to be empty.
The function workqueue_prepare_cpu is not defined inline nor notrace.

Please see log below:
# mount -t debugfs none /sys/kernel/debug
# cd /sys/kernel/debug/tracing/
# cat trace
# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
# cat set_graph_function
workqueue_prepare_cpu
# cat tracing_on
1

Do you have any idea why it's empty?

Thank you,
Itai Handler

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: aarch64 boot time increases with rt
  2020-07-14 12:18   ` Itai Handler
@ 2020-07-14 14:34     ` Steven Rostedt
  2020-07-15 19:36     ` Steven Rostedt
  1 sibling, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2020-07-14 14:34 UTC (permalink / raw)
  To: Itai Handler; +Cc: linux-rt-users

On Tue, 14 Jul 2020 15:18:01 +0300
Itai Handler <itai.handler@gmail.com> wrote:

> I tried "ftrace=function_graph ftrace_graph_filter=workqueue_prepare_cpu",
> however, the trace appears to be empty.
> The function workqueue_prepare_cpu is not defined inline nor notrace.
> 
> Please see log below:
> # mount -t debugfs none /sys/kernel/debug
> # cd /sys/kernel/debug/tracing/
> # cat trace
> # tracer: function_graph
> #
> # CPU  DURATION                  FUNCTION CALLS
> # |     |   |                     |   |   |   |
> # cat set_graph_function
> workqueue_prepare_cpu
> # cat tracing_on
> 1
> 
> Do you have any idea why it's empty?

Hmm, does the place in question happen before ftrace is initialized?

That is, where do you see the message that has something like this:

 ftrace: allocating 41803 entries in 164 pages
 ftrace: allocated 164 pages with 3 groups

-- Steve

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: aarch64 boot time increases with rt
  2020-07-14 12:18   ` Itai Handler
  2020-07-14 14:34     ` Steven Rostedt
@ 2020-07-15 19:36     ` Steven Rostedt
  2020-07-16 13:04       ` Itai Handler
  1 sibling, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2020-07-15 19:36 UTC (permalink / raw)
  To: Itai Handler; +Cc: linux-rt-users

On Tue, 14 Jul 2020 15:18:01 +0300
Itai Handler <itai.handler@gmail.com> wrote:

> Hi Steven,
> 
> On Tue, Jul 14, 2020 at 4:20 AM Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > You mean compared to non PREEMPT_RT_FULL of the same kernel?  
> 
> Yes.
> 
> > I'd suggest trying 5.4 or even 5.6-rt.  
> 
> I tried 5.4.47-rt28, and the problem still persists.
> 
> > Well, you could try: ftrace=function_graph ftrace_graph_filter workqueue_prepare_cpu
> >
> > If workqueue_prepare_cpu is a traceable function (not inlined nor notrace set).  
> 
> I tried "ftrace=function_graph ftrace_graph_filter=workqueue_prepare_cpu",
> however, the trace appears to be empty.
> The function workqueue_prepare_cpu is not defined inline nor notrace.
> 
> Please see log below:
> # mount -t debugfs none /sys/kernel/debug
> # cd /sys/kernel/debug/tracing/
> # cat trace
> # tracer: function_graph
> #
> # CPU  DURATION                  FUNCTION CALLS
> # |     |   |                     |   |   |   |
> # cat set_graph_function
> workqueue_prepare_cpu
> # cat tracing_on
> 1
> 
> Do you have any idea why it's empty?

To rule out that this may be an issue with the function_graph tracer,
can you also try it with:

 ftrace=function ftrace_filter=worqueue_prepare_cpu

And see if the function tracer shows anything.

/me just looked at the code.

Yes, the function tracer is set up to trace really early, but the
function_graph tracer is initialized by core_initcall(), which is much
later.

Perhaps try this patch (note, it's from latest mainline):

And retry the function_graph tracer.

-- Steve


diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 4aab712f9567..35bcac3edcee 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -9475,6 +9475,7 @@ __init static int tracer_alloc_buffers(void)
 
 	/* Function tracing may start here (via kernel command line) */
 	init_function_trace();
+	init_graph_trace();
 
 	/* All seems OK, enable tracing */
 	tracing_disabled = 0;
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index f21607f87189..d861bd34dfb6 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1129,6 +1129,11 @@ void ftrace_init_tracefs_toplevel(struct trace_array *tr,
 void ftrace_clear_pids(struct trace_array *tr);
 int init_function_trace(void);
 void ftrace_pid_follow_fork(struct trace_array *tr, bool enable);
+# ifdef CONFIG_FUNCTION_GRAPH_TRACER
+int init_graph_trace(void);
+# else
+static inline int init_graph_trace(void) { return 0; }
+# endif
 #else
 static inline int ftrace_trace_task(struct trace_array *tr)
 {
@@ -1149,6 +1154,7 @@ static inline void ftrace_init_tracefs(struct trace_array *tr, struct dentry *d)
 static inline void ftrace_init_tracefs_toplevel(struct trace_array *tr, struct dentry *d) { }
 static inline void ftrace_clear_pids(struct trace_array *tr) { }
 static inline int init_function_trace(void) { return 0; }
+static inline int init_graph_trace(void) { return 0; }
 static inline void ftrace_pid_follow_fork(struct trace_array *tr, bool enable) { }
 /* ftace_func_t type is not defined, use macro instead of static inline */
 #define ftrace_init_array_ops(tr, func) do { } while (0)
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 4a9c49c08ec9..e18cc90585fb 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -1349,7 +1349,7 @@ static __init int init_graph_tracefs(void)
 }
 fs_initcall(init_graph_tracefs);
 
-static __init int init_graph_trace(void)
+__init int init_graph_trace(void)
 {
 	max_bytes_for_cpu = snprintf(NULL, 0, "%u", nr_cpu_ids - 1);
 
@@ -1365,5 +1365,3 @@ static __init int init_graph_trace(void)
 
 	return register_tracer(&graph_trace);
 }
-
-core_initcall(init_graph_trace);

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: aarch64 boot time increases with rt
  2020-07-15 19:36     ` Steven Rostedt
@ 2020-07-16 13:04       ` Itai Handler
  0 siblings, 0 replies; 6+ messages in thread
From: Itai Handler @ 2020-07-16 13:04 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-rt-users

Hi Steven,

On Wed, Jul 15, 2020 at 10:36 PM Steven Rostedt <rostedt@goodmis.org> wrote:

> To rule out that this may be an issue with the function_graph tracer,
> can you also try it with:
>
>  ftrace=function ftrace_filter=worqueue_prepare_cpu
>
> And see if the function tracer shows anything.
>
> /me just looked at the code.
>
> Yes, the function tracer is set up to trace really early, but the
> function_graph tracer is initialized by core_initcall(), which is much
> later.
>
> Perhaps try this patch (note, it's from latest mainline):
>
> And retry the function_graph tracer.

With this patch, the function graph tracer doesn't work at boot time.
Please see the log below:

# mount -t debugfs debugfs /sys/kernel/debug/
# cd /sys/kernel/debug/tracing/trace
# cat current_tracer
nop

Note that I tested it both on 5.4.47-rt28 and 5.6.19-rt11 (but not on
latest mainline).

Regarding the function tracer, you were correct: it can trace
workqueue_prepare_cpu
without any issue.

Thank you,
Itai Handler

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-07-16 13:04 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-13 14:56 aarch64 boot time increases with rt Itai Handler
2020-07-14  1:20 ` Steven Rostedt
2020-07-14 12:18   ` Itai Handler
2020-07-14 14:34     ` Steven Rostedt
2020-07-15 19:36     ` Steven Rostedt
2020-07-16 13:04       ` Itai Handler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).