* [PATCH 0/2] rcu/ftrace: Add rcu_dereference_raw_notrace() and friends
@ 2013-05-28 18:38 Steven Rostedt
  2013-05-28 18:38 ` [PATCH 1/2] rcu: Add _notrace variation of rcu_dereference_raw() and hlist_for_each_entry_rcu() Steven Rostedt
  2013-05-28 18:38 ` [PATCH 2/2] ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends Steven Rostedt
  0 siblings, 2 replies; 6+ messages in thread
From: Steven Rostedt @ 2013-05-28 18:38 UTC (permalink / raw)
  To: linux-kernel; +Cc: Paul E. McKenney, Ingo Molnar, Andrew Morton

Paul,

As you suggested using the name "*_notrace", I updated my patch. There
are now two patches.

The first patch adds two new APIs to RCU:
 rcu_dereference_raw_notrace()
 hlist_for_each_entry_rcu_notrace()

I needed the second one as well, since the function tracer also uses
that iterator, and it calls rcu_dereference_raw().

Is this acceptable to you? If so, can you give me your Acked-by so that
I can push this up to Linus? Without these patches, the system locks up
when doing function tracing with too much RCU debugging enabled.
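
[ For illustration, a minimal sketch of the intended call-site pattern,
  modeled on ftrace_lookup_ip() from patch 2; struct my_entry and
  my_lookup() are hypothetical, only the *_notrace calls come from
  these patches: ]

#include <linux/rculist.h>

struct my_entry {
	unsigned long		ip;
	struct hlist_node	node;
};

/* Safe to call from the tracer itself: the _notrace walk skips the
 * rcu_read_lock_held() debug check, so nothing in this loop is traced
 * back into the tracer. */
static notrace struct my_entry *my_lookup(struct hlist_head *hhd,
					  unsigned long ip)
{
	struct my_entry *entry;

	hlist_for_each_entry_rcu_notrace(entry, hhd, node) {
		if (entry->ip == ip)
			return entry;
	}
	return NULL;
}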

Thanks,

-- Steve




* [PATCH 1/2] rcu: Add _notrace variation of rcu_dereference_raw() and hlist_for_each_entry_rcu()
  2013-05-28 18:38 [PATCH 0/2] rcu/ftrace: Add rcu_dereference_raw_notrace() and friends Steven Rostedt
@ 2013-05-28 18:38 ` Steven Rostedt
  2013-05-29  1:50   ` Paul E. McKenney
  2013-05-28 18:38 ` [PATCH 2/2] ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends Steven Rostedt
  1 sibling, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2013-05-28 18:38 UTC (permalink / raw)
  To: linux-kernel; +Cc: Paul E. McKenney, Ingo Molnar, Andrew Morton

[-- Attachment #1: add_rcu_dereference_raw_notrace.patch --]
[-- Type: text/plain, Size: 2684 bytes --]

Under RCU debug config options, rcu_dereference_raw() can add quite a
few checks, and since tracing uses rcu_dereference_raw(), these checks
run inside the function tracer. The function tracer then traces these
debug checks as well. This added overhead can livelock the system.

Add two new RCU interfaces, rcu_dereference_raw_notrace() and
hlist_for_each_entry_rcu_notrace(); the latter is needed because the
hlist iterator also uses rcu_dereference_raw() and is itself used by
the function tracer.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Index: linux-trace.git/include/linux/rculist.h
===================================================================
--- linux-trace.git.orig/include/linux/rculist.h
+++ linux-trace.git/include/linux/rculist.h
@@ -461,6 +461,26 @@ static inline void hlist_add_after_rcu(s
 			&(pos)->member)), typeof(*(pos)), member))
 
 /**
+ * hlist_for_each_entry_rcu_notrace - iterate over rcu list of given type (for tracing)
+ * @pos:	the type * to use as a loop cursor.
+ * @head:	the head for your list.
+ * @member:	the name of the hlist_node within the struct.
+ *
+ * This list-traversal primitive may safely run concurrently with
+ * the _rcu list-mutation primitives such as hlist_add_head_rcu()
+ * as long as the traversal is guarded by rcu_read_lock().
+ *
+ * This is the same as hlist_for_each_entry_rcu() except that it does
+ * not do any RCU debugging or tracing.
+ */
+#define hlist_for_each_entry_rcu_notrace(pos, head, member)			\
+	for (pos = hlist_entry_safe (rcu_dereference_raw_notrace(hlist_first_rcu(head)),\
+			typeof(*(pos)), member);			\
+		pos;							\
+		pos = hlist_entry_safe(rcu_dereference_raw_notrace(hlist_next_rcu(\
+			&(pos)->member)), typeof(*(pos)), member))
+
+/**
  * hlist_for_each_entry_rcu_bh - iterate over rcu list of given type
  * @pos:	the type * to use as a loop cursor.
  * @head:	the head for your list.
Index: linux-trace.git/include/linux/rcupdate.h
===================================================================
--- linux-trace.git.orig/include/linux/rcupdate.h
+++ linux-trace.git/include/linux/rcupdate.h
@@ -640,6 +640,15 @@ static inline void rcu_preempt_sleep_che
 
 #define rcu_dereference_raw(p) rcu_dereference_check(p, 1) /*@@@ needed? @@@*/
 
+/*
+ * The tracing infrastructure traces RCU (we want that), but unfortunately
+ * some of the RCU checks cause tracing to lock up the system.
+ *
+ * The tracing version of rcu_dereference_raw() must not call
+ * rcu_read_lock_held().
+ */
+#define rcu_dereference_raw_notrace(p) __rcu_dereference_check((p), 1, __rcu)
+
 /**
  * rcu_access_index() - fetch RCU index with no dereferencing
  * @p: The index to read
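
[ For illustration, a hypothetical userspace model of the livelock
  described above, plain C rather than kernel code: the debug check
  performed by the "checked" dereference is itself a traceable
  function, so the tracer hook re-enters itself.  The depth guard
  exists only so the model terminates; the kernel has no such guard,
  hence the livelock: ]

#include <stdio.h>

static int depth;

static void *checked_deref(void **slot);

/* Model of a function-entry tracer hook: it uses the checked
 * dereference, so the debug check below gets traced as well. */
static void trace_hook(const char *func)
{
	void *p = NULL;

	if (++depth > 5) {
		printf("livelock: tracer re-entered while tracing %s\n",
		       func);
		depth--;
		return;
	}
	(void)checked_deref(&p);
	depth--;
}

/* Model of rcu_dereference_raw() with RCU debugging enabled: the
 * rcu_read_lock_held()-style check is itself a traced function call. */
static void *checked_deref(void **slot)
{
	trace_hook("rcu_read_lock_held");
	return *slot;
}

int main(void)
{
	trace_hook("some_traced_function");
	return 0;
}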



* [PATCH 2/2] ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends
  2013-05-28 18:38 [PATCH 0/2] rcu/ftrace: Add rcu_dereference_raw_notrace() and friends Steven Rostedt
  2013-05-28 18:38 ` [PATCH 1/2] rcu: Add _notrace variation of rcu_dereference_raw() and hlist_for_each_entry_rcu() Steven Rostedt
@ 2013-05-28 18:38 ` Steven Rostedt
  2013-05-29  1:53   ` Paul E. McKenney
  1 sibling, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2013-05-28 18:38 UTC (permalink / raw)
  To: linux-kernel; +Cc: Paul E. McKenney, Ingo Molnar, Andrew Morton

[-- Attachment #1: ftrace-rcu-dereference.patch --]
[-- Type: text/plain, Size: 3072 bytes --]

Under RCU debug config options, rcu_dereference_raw() can add quite a
few checks, and since tracing uses rcu_dereference_raw(), these checks
run inside the function tracer. The function tracer then traces these
debug checks as well. This added overhead can livelock the system.

Have the function tracer use the new RCU _notrace equivalents, which
skip the RCU debug checks.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Index: linux-trace.git/kernel/trace/ftrace.c
===================================================================
--- linux-trace.git.orig/kernel/trace/ftrace.c
+++ linux-trace.git/kernel/trace/ftrace.c
@@ -120,22 +120,22 @@ static void ftrace_ops_no_ops(unsigned l
 
 /*
  * Traverse the ftrace_global_list, invoking all entries.  The reason that we
- * can use rcu_dereference_raw() is that elements removed from this list
+ * can use rcu_dereference_raw_notrace() is that elements removed from this list
  * are simply leaked, so there is no need to interact with a grace-period
- * mechanism.  The rcu_dereference_raw() calls are needed to handle
+ * mechanism.  The rcu_dereference_raw_notrace() calls are needed to handle
  * concurrent insertions into the ftrace_global_list.
  *
  * Silly Alpha and silly pointer-speculation compiler optimizations!
  */
 #define do_for_each_ftrace_op(op, list)			\
-	op = rcu_dereference_raw(list);			\
+	op = rcu_dereference_raw_notrace(list);			\
 	do
 
 /*
  * Optimized for just a single item in the list (as that is the normal case).
  */
 #define while_for_each_ftrace_op(op)				\
-	while (likely(op = rcu_dereference_raw((op)->next)) &&	\
+	while (likely(op = rcu_dereference_raw_notrace((op)->next)) &&	\
 	       unlikely((op) != &ftrace_list_end))
 
 static inline void ftrace_ops_init(struct ftrace_ops *ops)
@@ -779,7 +779,7 @@ ftrace_find_profiled_func(struct ftrace_
 	if (hlist_empty(hhd))
 		return NULL;
 
-	hlist_for_each_entry_rcu(rec, hhd, node) {
+	hlist_for_each_entry_rcu_notrace(rec, hhd, node) {
 		if (rec->ip == ip)
 			return rec;
 	}
@@ -1165,7 +1165,7 @@ ftrace_lookup_ip(struct ftrace_hash *has
 
 	hhd = &hash->buckets[key];
 
-	hlist_for_each_entry_rcu(entry, hhd, hlist) {
+	hlist_for_each_entry_rcu_notrace(entry, hhd, hlist) {
 		if (entry->ip == ip)
 			return entry;
 	}
@@ -1422,8 +1422,8 @@ ftrace_ops_test(struct ftrace_ops *ops, 
 	struct ftrace_hash *notrace_hash;
 	int ret;
 
-	filter_hash = rcu_dereference_raw(ops->filter_hash);
-	notrace_hash = rcu_dereference_raw(ops->notrace_hash);
+	filter_hash = rcu_dereference_raw_notrace(ops->filter_hash);
+	notrace_hash = rcu_dereference_raw_notrace(ops->notrace_hash);
 
 	if ((ftrace_hash_empty(filter_hash) ||
 	     ftrace_lookup_ip(filter_hash, ip)) &&
@@ -2920,7 +2920,7 @@ static void function_trace_probe_call(un
 	 * on the hash. rcu_read_lock is too dangerous here.
 	 */
 	preempt_disable_notrace();
-	hlist_for_each_entry_rcu(entry, hhd, node) {
+	hlist_for_each_entry_rcu_notrace(entry, hhd, node) {
 		if (entry->ip == ip)
 			entry->ops->func(ip, parent_ip, &entry->data);
 	}
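
[ For illustration, a condensed sketch of the pattern the last hunk
  produces in function_trace_probe_call(); struct probe_entry stands in
  for the real, file-local entry type.  Preemption rather than
  rcu_read_lock() protects the walk: rcu_read_lock() is itself
  traceable and would recurse back into the tracer, while
  preempt_disable_notrace() is not traced: ]

#include <linux/preempt.h>
#include <linux/rculist.h>

struct probe_entry {
	unsigned long		ip;
	void			(*func)(unsigned long ip,
					unsigned long parent_ip,
					void *data);
	void			*data;
	struct hlist_node	node;
};

static notrace void probe_call_pattern(struct hlist_head *hhd,
				       unsigned long ip,
				       unsigned long parent_ip)
{
	struct probe_entry *entry;

	/* Holding off preemption holds off an RCU grace period, which
	 * is what syncs this walk with the freeing of hash entries. */
	preempt_disable_notrace();
	hlist_for_each_entry_rcu_notrace(entry, hhd, node) {
		if (entry->ip == ip)
			entry->func(ip, parent_ip, entry->data);
	}
	preempt_enable_notrace();
}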



* Re: [PATCH 1/2] rcu: Add _notrace variation of rcu_dereference_raw() and hlist_for_each_entry_rcu()
  2013-05-28 18:38 ` [PATCH 1/2] rcu: Add _notrace variation of rcu_dereference_raw() and hlist_for_each_entry_rcu() Steven Rostedt
@ 2013-05-29  1:50   ` Paul E. McKenney
  0 siblings, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2013-05-29  1:50 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Tue, May 28, 2013 at 02:38:42PM -0400, Steven Rostedt wrote:
> Under RCU debug config options, rcu_dereference_raw() can add quite a
> few checks, and since tracing uses rcu_dereference_raw(), these checks
> run inside the function tracer. The function tracer then traces these
> debug checks as well. This added overhead can livelock the system.
> 
> Add two new RCU interfaces, rcu_dereference_raw_notrace() and
> hlist_for_each_entry_rcu_notrace(); the latter is needed because the
> hlist iterator also uses rcu_dereference_raw() and is itself used by
> the function tracer.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Much nicer!

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Index: linux-trace.git/include/linux/rculist.h
> ===================================================================
> --- linux-trace.git.orig/include/linux/rculist.h
> +++ linux-trace.git/include/linux/rculist.h
> @@ -461,6 +461,26 @@ static inline void hlist_add_after_rcu(s
>  			&(pos)->member)), typeof(*(pos)), member))
> 
>  /**
> + * hlist_for_each_entry_rcu_notrace - iterate over rcu list of given type (for tracing)
> + * @pos:	the type * to use as a loop cursor.
> + * @head:	the head for your list.
> + * @member:	the name of the hlist_node within the struct.
> + *
> + * This list-traversal primitive may safely run concurrently with
> + * the _rcu list-mutation primitives such as hlist_add_head_rcu()
> + * as long as the traversal is guarded by rcu_read_lock().
> + *
> + * This is the same as hlist_for_each_entry_rcu() except that it does
> + * not do any RCU debugging or tracing.
> + */
> +#define hlist_for_each_entry_rcu_notrace(pos, head, member)			\
> +	for (pos = hlist_entry_safe (rcu_dereference_raw_notrace(hlist_first_rcu(head)),\
> +			typeof(*(pos)), member);			\
> +		pos;							\
> +		pos = hlist_entry_safe(rcu_dereference_raw_notrace(hlist_next_rcu(\
> +			&(pos)->member)), typeof(*(pos)), member))
> +
> +/**
>   * hlist_for_each_entry_rcu_bh - iterate over rcu list of given type
>   * @pos:	the type * to use as a loop cursor.
>   * @head:	the head for your list.
> Index: linux-trace.git/include/linux/rcupdate.h
> ===================================================================
> --- linux-trace.git.orig/include/linux/rcupdate.h
> +++ linux-trace.git/include/linux/rcupdate.h
> @@ -640,6 +640,15 @@ static inline void rcu_preempt_sleep_che
> 
>  #define rcu_dereference_raw(p) rcu_dereference_check(p, 1) /*@@@ needed? @@@*/
> 
> +/*
> + * The tracing infrastructure traces RCU (we want that), but unfortunately
> + * some of the RCU checks cause tracing to lock up the system.
> + *
> + * The tracing version of rcu_dereference_raw() must not call
> + * rcu_read_lock_held().
> + */
> +#define rcu_dereference_raw_notrace(p) __rcu_dereference_check((p), 1, __rcu)
> +
>  /**
>   * rcu_access_index() - fetch RCU index with no dereferencing
>   * @p: The index to read
> 



* Re: [PATCH 2/2] ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends
  2013-05-28 18:38 ` [PATCH 2/2] ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends Steven Rostedt
@ 2013-05-29  1:53   ` Paul E. McKenney
  0 siblings, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2013-05-29  1:53 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Tue, May 28, 2013 at 02:38:43PM -0400, Steven Rostedt wrote:
> Under RCU debug config options, rcu_dereference_raw() can add quite a
> few checks, and since tracing uses rcu_dereference_raw(), these checks
> run inside the function tracer. The function tracer then traces these
> debug checks as well. This added overhead can livelock the system.
> 
> Have the function tracer use the new RCU _notrace equivalents, which
> skip the RCU debug checks.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Looks good to me!

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Index: linux-trace.git/kernel/trace/ftrace.c
> ===================================================================
> --- linux-trace.git.orig/kernel/trace/ftrace.c
> +++ linux-trace.git/kernel/trace/ftrace.c
> @@ -120,22 +120,22 @@ static void ftrace_ops_no_ops(unsigned l
> 
>  /*
>   * Traverse the ftrace_global_list, invoking all entries.  The reason that we
> - * can use rcu_dereference_raw() is that elements removed from this list
> + * can use rcu_dereference_raw_notrace() is that elements removed from this list
>   * are simply leaked, so there is no need to interact with a grace-period
> - * mechanism.  The rcu_dereference_raw() calls are needed to handle
> + * mechanism.  The rcu_dereference_raw_notrace() calls are needed to handle
>   * concurrent insertions into the ftrace_global_list.
>   *
>   * Silly Alpha and silly pointer-speculation compiler optimizations!
>   */
>  #define do_for_each_ftrace_op(op, list)			\
> -	op = rcu_dereference_raw(list);			\
> +	op = rcu_dereference_raw_notrace(list);			\
>  	do
> 
>  /*
>   * Optimized for just a single item in the list (as that is the normal case).
>   */
>  #define while_for_each_ftrace_op(op)				\
> -	while (likely(op = rcu_dereference_raw((op)->next)) &&	\
> +	while (likely(op = rcu_dereference_raw_notrace((op)->next)) &&	\
>  	       unlikely((op) != &ftrace_list_end))
> 
>  static inline void ftrace_ops_init(struct ftrace_ops *ops)
> @@ -779,7 +779,7 @@ ftrace_find_profiled_func(struct ftrace_
>  	if (hlist_empty(hhd))
>  		return NULL;
> 
> -	hlist_for_each_entry_rcu(rec, hhd, node) {
> +	hlist_for_each_entry_rcu_notrace(rec, hhd, node) {
>  		if (rec->ip == ip)
>  			return rec;
>  	}
> @@ -1165,7 +1165,7 @@ ftrace_lookup_ip(struct ftrace_hash *has
> 
>  	hhd = &hash->buckets[key];
> 
> -	hlist_for_each_entry_rcu(entry, hhd, hlist) {
> +	hlist_for_each_entry_rcu_notrace(entry, hhd, hlist) {
>  		if (entry->ip == ip)
>  			return entry;
>  	}
> @@ -1422,8 +1422,8 @@ ftrace_ops_test(struct ftrace_ops *ops, 
>  	struct ftrace_hash *notrace_hash;
>  	int ret;
> 
> -	filter_hash = rcu_dereference_raw(ops->filter_hash);
> -	notrace_hash = rcu_dereference_raw(ops->notrace_hash);
> +	filter_hash = rcu_dereference_raw_notrace(ops->filter_hash);
> +	notrace_hash = rcu_dereference_raw_notrace(ops->notrace_hash);
> 
>  	if ((ftrace_hash_empty(filter_hash) ||
>  	     ftrace_lookup_ip(filter_hash, ip)) &&
> @@ -2920,7 +2920,7 @@ static void function_trace_probe_call(un
>  	 * on the hash. rcu_read_lock is too dangerous here.
>  	 */
>  	preempt_disable_notrace();
> -	hlist_for_each_entry_rcu(entry, hhd, node) {
> +	hlist_for_each_entry_rcu_notrace(entry, hhd, node) {
>  		if (entry->ip == ip)
>  			entry->ops->func(ip, parent_ip, &entry->data);
>  	}
> 



* [PATCH 2/2] ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends
  2013-05-29 18:53 [PATCH 0/2] [GIT PULL] rcu/ftrace: Fix livelock from overhead of RCU debugging Steven Rostedt
@ 2013-05-29 18:53 ` Steven Rostedt
  0 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2013-05-29 18:53 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney

[-- Attachment #1: Type: text/plain, Size: 3538 bytes --]

From: Steven Rostedt <rostedt@goodmis.org>

Under RCU debug config options, rcu_dereference_raw() can add quite a
few checks, and since tracing uses rcu_dereference_raw(), these checks
run inside the function tracer. The function tracer then traces these
debug checks as well. This added overhead can livelock the system.

Have the function tracer use the new RCU _notrace equivalents, which
skip the RCU debug checks.

Link: http://lkml.kernel.org/r/20130528184209.467603904@goodmis.org

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c |   18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index b549b0f..6c508ff 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -120,22 +120,22 @@ static void ftrace_ops_no_ops(unsigned long ip, unsigned long parent_ip);
 
 /*
  * Traverse the ftrace_global_list, invoking all entries.  The reason that we
- * can use rcu_dereference_raw() is that elements removed from this list
+ * can use rcu_dereference_raw_notrace() is that elements removed from this list
  * are simply leaked, so there is no need to interact with a grace-period
- * mechanism.  The rcu_dereference_raw() calls are needed to handle
+ * mechanism.  The rcu_dereference_raw_notrace() calls are needed to handle
  * concurrent insertions into the ftrace_global_list.
  *
  * Silly Alpha and silly pointer-speculation compiler optimizations!
  */
 #define do_for_each_ftrace_op(op, list)			\
-	op = rcu_dereference_raw(list);			\
+	op = rcu_dereference_raw_notrace(list);			\
 	do
 
 /*
  * Optimized for just a single item in the list (as that is the normal case).
  */
 #define while_for_each_ftrace_op(op)				\
-	while (likely(op = rcu_dereference_raw((op)->next)) &&	\
+	while (likely(op = rcu_dereference_raw_notrace((op)->next)) &&	\
 	       unlikely((op) != &ftrace_list_end))
 
 static inline void ftrace_ops_init(struct ftrace_ops *ops)
@@ -779,7 +779,7 @@ ftrace_find_profiled_func(struct ftrace_profile_stat *stat, unsigned long ip)
 	if (hlist_empty(hhd))
 		return NULL;
 
-	hlist_for_each_entry_rcu(rec, hhd, node) {
+	hlist_for_each_entry_rcu_notrace(rec, hhd, node) {
 		if (rec->ip == ip)
 			return rec;
 	}
@@ -1165,7 +1165,7 @@ ftrace_lookup_ip(struct ftrace_hash *hash, unsigned long ip)
 
 	hhd = &hash->buckets[key];
 
-	hlist_for_each_entry_rcu(entry, hhd, hlist) {
+	hlist_for_each_entry_rcu_notrace(entry, hhd, hlist) {
 		if (entry->ip == ip)
 			return entry;
 	}
@@ -1422,8 +1422,8 @@ ftrace_ops_test(struct ftrace_ops *ops, unsigned long ip)
 	struct ftrace_hash *notrace_hash;
 	int ret;
 
-	filter_hash = rcu_dereference_raw(ops->filter_hash);
-	notrace_hash = rcu_dereference_raw(ops->notrace_hash);
+	filter_hash = rcu_dereference_raw_notrace(ops->filter_hash);
+	notrace_hash = rcu_dereference_raw_notrace(ops->notrace_hash);
 
 	if ((ftrace_hash_empty(filter_hash) ||
 	     ftrace_lookup_ip(filter_hash, ip)) &&
@@ -2920,7 +2920,7 @@ static void function_trace_probe_call(unsigned long ip, unsigned long parent_ip,
 	 * on the hash. rcu_read_lock is too dangerous here.
 	 */
 	preempt_disable_notrace();
-	hlist_for_each_entry_rcu(entry, hhd, node) {
+	hlist_for_each_entry_rcu_notrace(entry, hhd, node) {
 		if (entry->ip == ip)
 			entry->ops->func(ip, parent_ip, &entry->data);
 	}
-- 
1.7.10.4





end of thread

Thread overview: 6 messages
2013-05-28 18:38 [PATCH 0/2] rcu/ftrace: Add rcu_dereference_raw_notrace() and friends Steven Rostedt
2013-05-28 18:38 ` [PATCH 1/2] rcu: Add _notrace variation of rcu_dereference_raw() and hlist_for_each_entry_rcu() Steven Rostedt
2013-05-29  1:50   ` Paul E. McKenney
2013-05-28 18:38 ` [PATCH 2/2] ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends Steven Rostedt
2013-05-29  1:53   ` Paul E. McKenney
2013-05-29 18:53 [PATCH 0/2] [GIT PULL] rcu/ftrace: Fix livelock from overhead of RCU debugging Steven Rostedt
2013-05-29 18:53 ` [PATCH 2/2] ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends Steven Rostedt
