linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [for-next][PATCH 0/4] tracing: Hopefully last update for 5.6
@ 2020-02-05 10:49 Steven Rostedt
  2020-02-05 10:49 ` [for-next][PATCH 1/4] bootconfig: Only load bootconfig if "bootconfig" is on the kernel cmdline Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Steven Rostedt @ 2020-02-05 10:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton


Amol Grover (2):
      tracing: Annotate ftrace_graph_hash pointer with __rcu
      tracing: Annotate ftrace_graph_notrace_hash pointer with __rcu

Steven Rostedt (VMware) (2):
      bootconfig: Only load bootconfig if "bootconfig" is on the kernel cmdline
      ftrace: Add comment to why rcu_dereference_sched() is open coded

----
 Documentation/admin-guide/bootconfig.rst        |  2 ++
 Documentation/admin-guide/kernel-parameters.txt |  6 ++++++
 init/main.c                                     | 28 ++++++++++++++++++-------
 kernel/trace/ftrace.c                           |  4 ++--
 kernel/trace/trace.h                            | 27 +++++++++++++++++++-----
 5 files changed, 53 insertions(+), 14 deletions(-)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [for-next][PATCH 1/4] bootconfig: Only load bootconfig if "bootconfig" is on the kernel cmdline
  2020-02-05 10:49 [for-next][PATCH 0/4] tracing: Hopefully last update for 5.6 Steven Rostedt
@ 2020-02-05 10:49 ` Steven Rostedt
  2020-02-05 10:49 ` [for-next][PATCH 2/4] tracing: Annotate ftrace_graph_hash pointer with __rcu Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2020-02-05 10:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Linus Torvalds

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

As the bootconfig is appended to the initrd it is not as easy to modify as
the kernel command line. If there's some issue with the kernel, and the
developer wants to boot a pristine kernel, it should not be needed to modify
the initrd to remove the bootconfig for a single boot.

As bootconfig is silently added (if the admin does not know where to look
they may not know it's being loaded). It should be explicitly added to the
kernel cmdline. The loading of the bootconfig is only done if "bootconfig"
is on the kernel command line. This will let admins know that the kernel
command line is extended.

Note, after adding printk()s for when the size is too great or the checksum
is wrong, exposed that the current method always looked for the boot config,
and if this size and checksum matched, it would parse it (as if either is
wrong a printk has been added to show this). It's better to only check this
if the boot config is asked to be looked for.

Link: https://lore.kernel.org/r/CAHk-=wjfjO+h6bQzrTf=YCZA53Y3EDyAs3Z4gEsT7icA3u_Psw@mail.gmail.com

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 Documentation/admin-guide/bootconfig.rst      |  2 ++
 .../admin-guide/kernel-parameters.txt         |  6 ++++
 init/main.c                                   | 28 ++++++++++++++-----
 3 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst
index 4d617693c0c8..b342a6796392 100644
--- a/Documentation/admin-guide/bootconfig.rst
+++ b/Documentation/admin-guide/bootconfig.rst
@@ -123,6 +123,8 @@ To remove the config from the image, you can use -d option as below::
 
  # tools/bootconfig/bootconfig -d /boot/initrd.img-X.Y.Z
 
+Then add "bootconfig" on the normal kernel command line to tell the
+kernel to look for the bootconfig at the end of the initrd file.
 
 Config File Limitation
 ======================
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index ade4e6ec23e0..b48c70ba9841 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -437,6 +437,12 @@
 			no delay (0).
 			Format: integer
 
+	bootconfig	[KNL]
+			Extended command line options can be added to an initrd
+			and this will cause the kernel to look for it.
+
+			See Documentation/admin-guide/bootconfig.rst
+
 	bert_disable	[ACPI]
 			Disable BERT OS support on buggy BIOSes.
 
diff --git a/init/main.c b/init/main.c
index dd7da62d99a5..f174a59d3903 100644
--- a/init/main.c
+++ b/init/main.c
@@ -336,28 +336,39 @@ u32 boot_config_checksum(unsigned char *p, u32 size)
 	return ret;
 }
 
-static void __init setup_boot_config(void)
+static void __init setup_boot_config(const char *cmdline)
 {
 	u32 size, csum;
 	char *data, *copy;
+	const char *p;
 	u32 *hdr;
 
-	if (!initrd_end)
+	p = strstr(cmdline, "bootconfig");
+	if (!p || (p != cmdline && !isspace(*(p-1))) ||
+	    (p[10] && !isspace(p[10])))
 		return;
 
+	if (!initrd_end)
+		goto not_found;
+
 	hdr = (u32 *)(initrd_end - 8);
 	size = hdr[0];
 	csum = hdr[1];
 
-	if (size >= XBC_DATA_MAX)
+	if (size >= XBC_DATA_MAX) {
+		pr_err("bootconfig size %d greater than max size %d\n",
+			size, XBC_DATA_MAX);
 		return;
+	}
 
 	data = ((void *)hdr) - size;
 	if ((unsigned long)data < initrd_start)
-		return;
+		goto not_found;
 
-	if (boot_config_checksum((unsigned char *)data, size) != csum)
+	if (boot_config_checksum((unsigned char *)data, size) != csum) {
+		pr_err("bootconfig checksum failed\n");
 		return;
+	}
 
 	copy = memblock_alloc(size + 1, SMP_CACHE_BYTES);
 	if (!copy) {
@@ -377,9 +388,12 @@ static void __init setup_boot_config(void)
 		/* Also, "init." keys are init arguments */
 		extra_init_args = xbc_make_cmdline("init");
 	}
+	return;
+not_found:
+	pr_err("'bootconfig' found on command line, but no bootconfig found\n");
 }
 #else
-#define setup_boot_config()	do { } while (0)
+#define setup_boot_config(cmdline)	do { } while (0)
 #endif
 
 /* Change NUL term back to "=", to make "param" the whole string. */
@@ -760,7 +774,7 @@ asmlinkage __visible void __init start_kernel(void)
 	pr_notice("%s", linux_banner);
 	early_security_init();
 	setup_arch(&command_line);
-	setup_boot_config();
+	setup_boot_config(command_line);
 	setup_command_line(command_line);
 	setup_nr_cpu_ids();
 	setup_per_cpu_areas();
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [for-next][PATCH 2/4] tracing: Annotate ftrace_graph_hash pointer with __rcu
  2020-02-05 10:49 [for-next][PATCH 0/4] tracing: Hopefully last update for 5.6 Steven Rostedt
  2020-02-05 10:49 ` [for-next][PATCH 1/4] bootconfig: Only load bootconfig if "bootconfig" is on the kernel cmdline Steven Rostedt
@ 2020-02-05 10:49 ` Steven Rostedt
  2020-02-05 10:49 ` [for-next][PATCH 3/4] tracing: Annotate ftrace_graph_notrace_hash " Steven Rostedt
  2020-02-05 10:49 ` [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded Steven Rostedt
  3 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2020-02-05 10:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Amol Grover

From: Amol Grover <frextrite@gmail.com>

Fix following instances of sparse error
kernel/trace/ftrace.c:5664:29: error: incompatible types in comparison
kernel/trace/ftrace.c:5785:21: error: incompatible types in comparison
kernel/trace/ftrace.c:5864:36: error: incompatible types in comparison
kernel/trace/ftrace.c:5866:25: error: incompatible types in comparison

Use rcu_dereference_protected to access the __rcu annotated pointer.

Link: http://lkml.kernel.org/r/20200201072703.17330-1-frextrite@gmail.com

Signed-off-by: Amol Grover <frextrite@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 2 +-
 kernel/trace/trace.h  | 9 ++++++---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 0e9612c30995..01d2ecd66161 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5591,7 +5591,7 @@ static const struct file_operations ftrace_notrace_fops = {
 
 static DEFINE_MUTEX(graph_lock);
 
-struct ftrace_hash *ftrace_graph_hash = EMPTY_HASH;
+struct ftrace_hash __rcu *ftrace_graph_hash = EMPTY_HASH;
 struct ftrace_hash *ftrace_graph_notrace_hash = EMPTY_HASH;
 
 enum graph_filter_type {
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index f5480a2aa334..18ceab59a5ba 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -964,22 +964,25 @@ extern void __trace_graph_return(struct trace_array *tr,
 				 unsigned long flags, int pc);
 
 #ifdef CONFIG_DYNAMIC_FTRACE
-extern struct ftrace_hash *ftrace_graph_hash;
+extern struct ftrace_hash __rcu *ftrace_graph_hash;
 extern struct ftrace_hash *ftrace_graph_notrace_hash;
 
 static inline int ftrace_graph_addr(struct ftrace_graph_ent *trace)
 {
 	unsigned long addr = trace->func;
 	int ret = 0;
+	struct ftrace_hash *hash;
 
 	preempt_disable_notrace();
 
-	if (ftrace_hash_empty(ftrace_graph_hash)) {
+	hash = rcu_dereference_protected(ftrace_graph_hash, !preemptible());
+
+	if (ftrace_hash_empty(hash)) {
 		ret = 1;
 		goto out;
 	}
 
-	if (ftrace_lookup_ip(ftrace_graph_hash, addr)) {
+	if (ftrace_lookup_ip(hash, addr)) {
 
 		/*
 		 * This needs to be cleared on the return functions
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [for-next][PATCH 3/4] tracing: Annotate ftrace_graph_notrace_hash pointer with __rcu
  2020-02-05 10:49 [for-next][PATCH 0/4] tracing: Hopefully last update for 5.6 Steven Rostedt
  2020-02-05 10:49 ` [for-next][PATCH 1/4] bootconfig: Only load bootconfig if "bootconfig" is on the kernel cmdline Steven Rostedt
  2020-02-05 10:49 ` [for-next][PATCH 2/4] tracing: Annotate ftrace_graph_hash pointer with __rcu Steven Rostedt
@ 2020-02-05 10:49 ` Steven Rostedt
  2020-02-05 10:49 ` [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded Steven Rostedt
  3 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2020-02-05 10:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Amol Grover

From: Amol Grover <frextrite@gmail.com>

Fix following instances of sparse error
kernel/trace/ftrace.c:5667:29: error: incompatible types in comparison
kernel/trace/ftrace.c:5813:21: error: incompatible types in comparison
kernel/trace/ftrace.c:5868:36: error: incompatible types in comparison
kernel/trace/ftrace.c:5870:25: error: incompatible types in comparison

Use rcu_dereference_protected to dereference the newly annotated pointer.

Link: http://lkml.kernel.org/r/20200205055701.30195-1-frextrite@gmail.com

Signed-off-by: Amol Grover <frextrite@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 2 +-
 kernel/trace/trace.h  | 8 ++++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 01d2ecd66161..481ede3eac13 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5592,7 +5592,7 @@ static const struct file_operations ftrace_notrace_fops = {
 static DEFINE_MUTEX(graph_lock);
 
 struct ftrace_hash __rcu *ftrace_graph_hash = EMPTY_HASH;
-struct ftrace_hash *ftrace_graph_notrace_hash = EMPTY_HASH;
+struct ftrace_hash __rcu *ftrace_graph_notrace_hash = EMPTY_HASH;
 
 enum graph_filter_type {
 	GRAPH_FILTER_NOTRACE	= 0,
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 18ceab59a5ba..022def96d307 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -965,7 +965,7 @@ extern void __trace_graph_return(struct trace_array *tr,
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 extern struct ftrace_hash __rcu *ftrace_graph_hash;
-extern struct ftrace_hash *ftrace_graph_notrace_hash;
+extern struct ftrace_hash __rcu *ftrace_graph_notrace_hash;
 
 static inline int ftrace_graph_addr(struct ftrace_graph_ent *trace)
 {
@@ -1018,10 +1018,14 @@ static inline void ftrace_graph_addr_finish(struct ftrace_graph_ret *trace)
 static inline int ftrace_graph_notrace_addr(unsigned long addr)
 {
 	int ret = 0;
+	struct ftrace_hash *notrace_hash;
 
 	preempt_disable_notrace();
 
-	if (ftrace_lookup_ip(ftrace_graph_notrace_hash, addr))
+	notrace_hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
+						 !preemptible());
+
+	if (ftrace_lookup_ip(notrace_hash, addr))
 		ret = 1;
 
 	preempt_enable_notrace();
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded
  2020-02-05 10:49 [for-next][PATCH 0/4] tracing: Hopefully last update for 5.6 Steven Rostedt
                   ` (2 preceding siblings ...)
  2020-02-05 10:49 ` [for-next][PATCH 3/4] tracing: Annotate ftrace_graph_notrace_hash " Steven Rostedt
@ 2020-02-05 10:49 ` Steven Rostedt
  2020-02-05 11:33   ` Steven Rostedt
  3 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2020-02-05 10:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Because the function graph tracer can execute in sections where RCU is not
"watching", the rcu_dereference_sched() for the has needs to be open coded.
This is fine because the RCU "flavor" of the ftrace hash is protected by
its own RCU handling (it does its own little synchronization on every CPU
and does not rely on RCU sched).

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 022def96d307..8c52f5de9384 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -975,6 +975,11 @@ static inline int ftrace_graph_addr(struct ftrace_graph_ent *trace)
 
 	preempt_disable_notrace();
 
+	/*
+	 * Have to open code "rcu_dereference_sched()" because the
+	 * function graph tracer can be called when RCU is not
+	 * "watching".
+	 */
 	hash = rcu_dereference_protected(ftrace_graph_hash, !preemptible());
 
 	if (ftrace_hash_empty(hash)) {
@@ -1022,6 +1027,11 @@ static inline int ftrace_graph_notrace_addr(unsigned long addr)
 
 	preempt_disable_notrace();
 
+	/*
+	 * Have to open code "rcu_dereference_sched()" because the
+	 * function graph tracer can be called when RCU is not
+	 * "watching".
+	 */
 	notrace_hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
 						 !preemptible());
 
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded
  2020-02-05 10:49 ` [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded Steven Rostedt
@ 2020-02-05 11:33   ` Steven Rostedt
  2020-02-05 14:19     ` Joel Fernandes
  2020-02-05 21:59     ` Joel Fernandes
  0 siblings, 2 replies; 13+ messages in thread
From: Steven Rostedt @ 2020-02-05 11:33 UTC (permalink / raw)
  To: Paul E. McKenney, Joel Fernandes
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Amol Grover


Paul and Joel,

Care to ack this patch (or complain about it ;-) ?

-- Steve


On Wed, 05 Feb 2020 05:49:33 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Because the function graph tracer can execute in sections where RCU is not
> "watching", the rcu_dereference_sched() for the has needs to be open coded.
> This is fine because the RCU "flavor" of the ftrace hash is protected by
> its own RCU handling (it does its own little synchronization on every CPU
> and does not rely on RCU sched).
> 
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/trace/trace.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 022def96d307..8c52f5de9384 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -975,6 +975,11 @@ static inline int ftrace_graph_addr(struct ftrace_graph_ent *trace)
>  
>  	preempt_disable_notrace();
>  
> +	/*
> +	 * Have to open code "rcu_dereference_sched()" because the
> +	 * function graph tracer can be called when RCU is not
> +	 * "watching".
> +	 */
>  	hash = rcu_dereference_protected(ftrace_graph_hash, !preemptible());
>  
>  	if (ftrace_hash_empty(hash)) {
> @@ -1022,6 +1027,11 @@ static inline int ftrace_graph_notrace_addr(unsigned long addr)
>  
>  	preempt_disable_notrace();
>  
> +	/*
> +	 * Have to open code "rcu_dereference_sched()" because the
> +	 * function graph tracer can be called when RCU is not
> +	 * "watching".
> +	 */
>  	notrace_hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
>  						 !preemptible());
>  


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded
  2020-02-05 11:33   ` Steven Rostedt
@ 2020-02-05 14:19     ` Joel Fernandes
  2020-02-05 14:28       ` Steven Rostedt
  2020-02-05 21:59     ` Joel Fernandes
  1 sibling, 1 reply; 13+ messages in thread
From: Joel Fernandes @ 2020-02-05 14:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, linux-kernel, Ingo Molnar, Andrew Morton, Amol Grover

On Wed, Feb 05, 2020 at 06:33:49AM -0500, Steven Rostedt wrote:
> 
> Paul and Joel,
> 
> Care to ack this patch (or complain about it ;-) ?
> 
> -- Steve
> 
> 
> On Wed, 05 Feb 2020 05:49:33 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > 
> > Because the function graph tracer can execute in sections where RCU is not
> > "watching", the rcu_dereference_sched() for the has needs to be open coded.
> > This is fine because the RCU "flavor" of the ftrace hash is protected by
> > its own RCU handling (it does its own little synchronization on every CPU
> > and does not rely on RCU sched).
> > 
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> > ---
> >  kernel/trace/trace.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > index 022def96d307..8c52f5de9384 100644
> > --- a/kernel/trace/trace.h
> > +++ b/kernel/trace/trace.h
> > @@ -975,6 +975,11 @@ static inline int ftrace_graph_addr(struct ftrace_graph_ent *trace)
> >  
> >  	preempt_disable_notrace();
> >  
> > +	/*
> > +	 * Have to open code "rcu_dereference_sched()" because the
> > +	 * function graph tracer can be called when RCU is not
> > +	 * "watching".
> > +	 */
> >  	hash = rcu_dereference_protected(ftrace_graph_hash, !preemptible());
> >  
> >  	if (ftrace_hash_empty(hash)) {
> > @@ -1022,6 +1027,11 @@ static inline int ftrace_graph_notrace_addr(unsigned long addr)
> >  
> >  	preempt_disable_notrace();
> >  
> > +	/*
> > +	 * Have to open code "rcu_dereference_sched()" because the
> > +	 * function graph tracer can be called when RCU is not
> > +	 * "watching".
> > +	 */
> >  	notrace_hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
> >  						 !preemptible());
> >  

Could you paste the stack here when RCU is not watching? In trace event code
IIRC we call rcu_enter_irqs_on() to have RCU temporarily watch, since that
code can be called from idle loop. Should we doing the same here as well?

thanks,

 - Joel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded
  2020-02-05 14:19     ` Joel Fernandes
@ 2020-02-05 14:28       ` Steven Rostedt
  2020-02-05 15:42         ` Joel Fernandes
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2020-02-05 14:28 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Paul E. McKenney, linux-kernel, Ingo Molnar, Andrew Morton, Amol Grover

On Wed, 5 Feb 2020 09:19:15 -0500
Joel Fernandes <joel@joelfernandes.org> wrote:

> Could you paste the stack here when RCU is not watching? In trace event code
> IIRC we call rcu_enter_irqs_on() to have RCU temporarily watch, since that
> code can be called from idle loop. Should we doing the same here as well?

Unfortunately I lost the stack trace. And the last time we tried to use
rcu_enter_irqs_on() for ftrace, we couldn't find a way to do this
properly. Ftrace is much more invasive then going into idle. The
problem is that ftrace traces RCU itself, and calling
"rcu_enter_irqs_on()" in pretty much any place in the RCU code caused
lots of bugs ;-)

This is why we have the schedule_on_each_cpu(ftrace_sync) hack.

-- Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded
  2020-02-05 14:28       ` Steven Rostedt
@ 2020-02-05 15:42         ` Joel Fernandes
  2020-02-05 15:49           ` Steven Rostedt
  0 siblings, 1 reply; 13+ messages in thread
From: Joel Fernandes @ 2020-02-05 15:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, linux-kernel, Ingo Molnar, Andrew Morton, Amol Grover

On Wed, Feb 05, 2020 at 09:28:47AM -0500, Steven Rostedt wrote:
> On Wed, 5 Feb 2020 09:19:15 -0500
> Joel Fernandes <joel@joelfernandes.org> wrote:
> 
> > Could you paste the stack here when RCU is not watching? In trace event code
> > IIRC we call rcu_enter_irqs_on() to have RCU temporarily watch, since that
> > code can be called from idle loop. Should we doing the same here as well?
> 
> Unfortunately I lost the stack trace. And the last time we tried to use
> rcu_enter_irqs_on() for ftrace, we couldn't find a way to do this
> properly. Ftrace is much more invasive then going into idle. The
> problem is that ftrace traces RCU itself, and calling
> "rcu_enter_irqs_on()" in pretty much any place in the RCU code caused
> lots of bugs ;-)
> 
> This is why we have the schedule_on_each_cpu(ftrace_sync) hack.

The "schedule a task on each CPU" trick works on !PREEMPT though right?

Because it is possible in PREEMPT=y to get preempted in the middle of a
read-side critical section, switch to the worker thread executing the
ftrace_sync() and then switch back. But RCU still has to watch that CPU since
the read-side critical section was not completed.

Or is there a subtlety here with ftrace that I missed?

thanks,

 - Joel


> 
> -- Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded
  2020-02-05 15:42         ` Joel Fernandes
@ 2020-02-05 15:49           ` Steven Rostedt
  2020-02-05 16:08             ` Joel Fernandes
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2020-02-05 15:49 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Paul E. McKenney, linux-kernel, Ingo Molnar, Andrew Morton, Amol Grover

On Wed, 5 Feb 2020 10:42:12 -0500
Joel Fernandes <joel@joelfernandes.org> wrote:

> On Wed, Feb 05, 2020 at 09:28:47AM -0500, Steven Rostedt wrote:
> > On Wed, 5 Feb 2020 09:19:15 -0500
> > Joel Fernandes <joel@joelfernandes.org> wrote:
> >   
> > > Could you paste the stack here when RCU is not watching? In trace event code
> > > IIRC we call rcu_enter_irqs_on() to have RCU temporarily watch, since that
> > > code can be called from idle loop. Should we doing the same here as well?  
> > 
> > Unfortunately I lost the stack trace. And the last time we tried to use
> > rcu_enter_irqs_on() for ftrace, we couldn't find a way to do this
> > properly. Ftrace is much more invasive then going into idle. The
> > problem is that ftrace traces RCU itself, and calling
> > "rcu_enter_irqs_on()" in pretty much any place in the RCU code caused
> > lots of bugs ;-)
> > 
> > This is why we have the schedule_on_each_cpu(ftrace_sync) hack.  
> 
> The "schedule a task on each CPU" trick works on !PREEMPT though right?

It works on both, as I care more about the PREEMPT=y case then
the !PREEMPT, and the PREEMPT_RT which is even more preemptive than
PREEMPT!

> 
> Because it is possible in PREEMPT=y to get preempted in the middle of a
> read-side critical section, switch to the worker thread executing the
> ftrace_sync() and then switch back. But RCU still has to watch that CPU since
> the read-side critical section was not completed.
> 
> Or is there a subtlety here with ftrace that I missed?
> 

Hence Amol's patch:

> +       notrace_hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
> +                                                !preemptible());

It checks to make sure preemption is off. There is no chance of being
preempted in the read side critical section.

-- Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded
  2020-02-05 15:49           ` Steven Rostedt
@ 2020-02-05 16:08             ` Joel Fernandes
  2020-02-05 21:54               ` Joel Fernandes
  0 siblings, 1 reply; 13+ messages in thread
From: Joel Fernandes @ 2020-02-05 16:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, linux-kernel, Ingo Molnar, Andrew Morton, Amol Grover

On Wed, Feb 05, 2020 at 10:49:45AM -0500, Steven Rostedt wrote:
> On Wed, 5 Feb 2020 10:42:12 -0500
> Joel Fernandes <joel@joelfernandes.org> wrote:
> 
> > On Wed, Feb 05, 2020 at 09:28:47AM -0500, Steven Rostedt wrote:
> > > On Wed, 5 Feb 2020 09:19:15 -0500
> > > Joel Fernandes <joel@joelfernandes.org> wrote:
> > >   
> > > > Could you paste the stack here when RCU is not watching? In trace event code
> > > > IIRC we call rcu_enter_irqs_on() to have RCU temporarily watch, since that
> > > > code can be called from idle loop. Should we doing the same here as well?  
> > > 
> > > Unfortunately I lost the stack trace. And the last time we tried to use
> > > rcu_enter_irqs_on() for ftrace, we couldn't find a way to do this
> > > properly. Ftrace is much more invasive then going into idle. The
> > > problem is that ftrace traces RCU itself, and calling
> > > "rcu_enter_irqs_on()" in pretty much any place in the RCU code caused
> > > lots of bugs ;-)
> > > 
> > > This is why we have the schedule_on_each_cpu(ftrace_sync) hack.  
> > 
> > The "schedule a task on each CPU" trick works on !PREEMPT though right?
> 
> It works on both, as I care more about the PREEMPT=y case then
> the !PREEMPT, and the PREEMPT_RT which is even more preemptive than
> PREEMPT!
> 
> > 
> > Because it is possible in PREEMPT=y to get preempted in the middle of a
> > read-side critical section, switch to the worker thread executing the
> > ftrace_sync() and then switch back. But RCU still has to watch that CPU since
> > the read-side critical section was not completed.
> > 
> > Or is there a subtlety here with ftrace that I missed?
> > 
> 
> Hence Amol's patch:
> 
> > +       notrace_hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
> > +                                                !preemptible());
> 
> It checks to make sure preemption is off. There is no chance of being
> preempted in the read side critical section.

Yes, this makes sense. Sorry for the noise.  For "sched" RCU cases,
scheduling on each CPU would work regardless of PREEMPT configuration.

( I guess I was confusing this case with the non-sched RCU usages (such as using
rcu_read_lock()) where scheduling a task on each CPU obviously would not work
with PREEMPT=y. )

By the way would SRCU not work instead of the ftrace_sync() technique? Or is
the concern that SRCU cannot be used from NMI?

thanks,

 - Joel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded
  2020-02-05 16:08             ` Joel Fernandes
@ 2020-02-05 21:54               ` Joel Fernandes
  0 siblings, 0 replies; 13+ messages in thread
From: Joel Fernandes @ 2020-02-05 21:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, linux-kernel, Ingo Molnar, Andrew Morton, Amol Grover

On Wed, Feb 05, 2020 at 11:08:24AM -0500, Joel Fernandes wrote:
> On Wed, Feb 05, 2020 at 10:49:45AM -0500, Steven Rostedt wrote:
> > On Wed, 5 Feb 2020 10:42:12 -0500
> > Joel Fernandes <joel@joelfernandes.org> wrote:
> > 
> > > On Wed, Feb 05, 2020 at 09:28:47AM -0500, Steven Rostedt wrote:
> > > > On Wed, 5 Feb 2020 09:19:15 -0500
> > > > Joel Fernandes <joel@joelfernandes.org> wrote:
> > > >   
> > > > > Could you paste the stack here when RCU is not watching? In trace event code
> > > > > IIRC we call rcu_enter_irqs_on() to have RCU temporarily watch, since that
> > > > > code can be called from idle loop. Should we doing the same here as well?  
> > > > 
> > > > Unfortunately I lost the stack trace. And the last time we tried to use
> > > > rcu_enter_irqs_on() for ftrace, we couldn't find a way to do this
> > > > properly. Ftrace is much more invasive then going into idle. The
> > > > problem is that ftrace traces RCU itself, and calling
> > > > "rcu_enter_irqs_on()" in pretty much any place in the RCU code caused
> > > > lots of bugs ;-)
> > > > 
> > > > This is why we have the schedule_on_each_cpu(ftrace_sync) hack.  
> > > 
> > > The "schedule a task on each CPU" trick works on !PREEMPT though right?
> > 
> > It works on both, as I care more about the PREEMPT=y case then
> > the !PREEMPT, and the PREEMPT_RT which is even more preemptive than
> > PREEMPT!
> > 
> > > 
> > > Because it is possible in PREEMPT=y to get preempted in the middle of a
> > > read-side critical section, switch to the worker thread executing the
> > > ftrace_sync() and then switch back. But RCU still has to watch that CPU since
> > > the read-side critical section was not completed.
> > > 
> > > Or is there a subtlety here with ftrace that I missed?
> > > 
> > 
> > Hence Amol's patch:
> > 
> > > +       notrace_hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
> > > +                                                !preemptible());
> > 
> > It checks to make sure preemption is off. There is no chance of being
> > preempted in the read side critical section.
> 
> Yes, this makes sense. Sorry for the noise.  For "sched" RCU cases,
> scheduling on each CPU would work regardless of PREEMPT configuration.
> 
> ( I guess I was confusing this case with the non-sched RCU usages (such as using
> rcu_read_lock()) where scheduling a task on each CPU obviously would not work
> with PREEMPT=y. )
> 
> By the way would SRCU not work instead of the ftrace_sync() technique? Or is
> the concern that SRCU cannot be used from NMI?

Answering my own question, SRCU would likely slow down ftrace_graph_addr()
unnecessarily so is probably not worth doing so in this path (especially
because ftrace_graph_addr() already starts an implict read-side critical
section anyway via preempt_disable()).

thanks,

 - Joel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded
  2020-02-05 11:33   ` Steven Rostedt
  2020-02-05 14:19     ` Joel Fernandes
@ 2020-02-05 21:59     ` Joel Fernandes
  1 sibling, 0 replies; 13+ messages in thread
From: Joel Fernandes @ 2020-02-05 21:59 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, linux-kernel, Ingo Molnar, Andrew Morton, Amol Grover

On Wed, Feb 05, 2020 at 06:33:49AM -0500, Steven Rostedt wrote:
> 
> Paul and Joel,
> 
> Care to ack this patch (or complain about it ;-) ?
> 
> -- Steve
> 
> 
> On Wed, 05 Feb 2020 05:49:33 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > 
> > Because the function graph tracer can execute in sections where RCU is not
> > "watching", the rcu_dereference_sched() for the has needs to be open coded.
> > This is fine because the RCU "flavor" of the ftrace hash is protected by
> > its own RCU handling (it does its own little synchronization on every CPU
> > and does not rely on RCU sched).
> > 
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>

thanks,

 - Joel


> > ---
> >  kernel/trace/trace.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > index 022def96d307..8c52f5de9384 100644
> > --- a/kernel/trace/trace.h
> > +++ b/kernel/trace/trace.h
> > @@ -975,6 +975,11 @@ static inline int ftrace_graph_addr(struct ftrace_graph_ent *trace)
> >  
> >  	preempt_disable_notrace();
> >  
> > +	/*
> > +	 * Have to open code "rcu_dereference_sched()" because the
> > +	 * function graph tracer can be called when RCU is not
> > +	 * "watching".
> > +	 */
> >  	hash = rcu_dereference_protected(ftrace_graph_hash, !preemptible());
> >  
> >  	if (ftrace_hash_empty(hash)) {
> > @@ -1022,6 +1027,11 @@ static inline int ftrace_graph_notrace_addr(unsigned long addr)
> >  
> >  	preempt_disable_notrace();
> >  
> > +	/*
> > +	 * Have to open code "rcu_dereference_sched()" because the
> > +	 * function graph tracer can be called when RCU is not
> > +	 * "watching".
> > +	 */
> >  	notrace_hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
> >  						 !preemptible());
> >  
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-02-05 21:59 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-05 10:49 [for-next][PATCH 0/4] tracing: Hopefully last update for 5.6 Steven Rostedt
2020-02-05 10:49 ` [for-next][PATCH 1/4] bootconfig: Only load bootconfig if "bootconfig" is on the kernel cmdline Steven Rostedt
2020-02-05 10:49 ` [for-next][PATCH 2/4] tracing: Annotate ftrace_graph_hash pointer with __rcu Steven Rostedt
2020-02-05 10:49 ` [for-next][PATCH 3/4] tracing: Annotate ftrace_graph_notrace_hash " Steven Rostedt
2020-02-05 10:49 ` [for-next][PATCH 4/4] ftrace: Add comment to why rcu_dereference_sched() is open coded Steven Rostedt
2020-02-05 11:33   ` Steven Rostedt
2020-02-05 14:19     ` Joel Fernandes
2020-02-05 14:28       ` Steven Rostedt
2020-02-05 15:42         ` Joel Fernandes
2020-02-05 15:49           ` Steven Rostedt
2020-02-05 16:08             ` Joel Fernandes
2020-02-05 21:54               ` Joel Fernandes
2020-02-05 21:59     ` Joel Fernandes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).