linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/3] tracing/stat: sort in ascending order
@ 2009-05-25  8:46 Li Zefan
  2009-05-25  8:46 ` [PATCH 2/3] tracing/stat: simplify rbtree freeing code Li Zefan
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Li Zefan @ 2009-05-25  8:46 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Frederic Weisbecker, Steven Rostedt, LKML

Currently the output of trace_stat/workqueues is totally reversed:

 # cat /debug/tracing/trace_stat/workqueues
    ...
    1       17       17      210       37   `-blk_unplug_work+0x0/0x57
    1     3779     3779      181       11   |-cfq_kick_queue+0x0/0x2f
    1     3796     3796                     kblockd/1:120
    ...

The correct output should be:

    1     3796     3796                     kblockd/1:120
    1     3779     3779      181       11   |-cfq_kick_queue+0x0/0x2f
    1       17       17      210       37   `-blk_unplug_work+0x0/0x57

It's caused by "tracing/stat: replace linked list by an rbtree for sorting"
(53059c9b67a62a3dc8c80204d3da42b9267ea5a0).

Though we can simply change dummy_cmp() to return -1 instead of 1, IMO
it's better to always do ascending sorting in trace_stat.c, and leave each
stat tracer to decide whether to sort in descending or ascending order.
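
The same split applies to userspace sorting with qsort(3): the sort core is
direction-neutral, and the comparator alone decides the order. A minimal
sketch (userspace analogy only, not the kernel code):

```c
#include <assert.h>
#include <stdlib.h>

/* Ascending comparator: negative when a < b, positive when a > b,
 * matching the convention trace_stat.c expects after this patch. */
static int cmp_asc(const void *p1, const void *p2)
{
	int a = *(const int *)p1;
	int b = *(const int *)p2;

	return (a > b) - (a < b);
}

/* A tracer that wants "highest first" simply flips the result,
 * the way function_stat_cmp() does below. */
static int cmp_desc(const void *p1, const void *p2)
{
	return -cmp_asc(p1, p2);
}
```

With cmp_desc(), qsort() produces descending output; the sorting core
itself never needs to know which direction was intended.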

[ Impact: fix the output of trace_stat/workqueue ]

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
 kernel/trace/ftrace.c       |   12 ++++++------
 kernel/trace/trace_branch.c |    5 +++--
 kernel/trace/trace_stat.c   |    6 +-----
 3 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 140699a..3dd16bd 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -315,29 +315,29 @@ static void *function_stat_start(struct tracer_stat *trace)
 }
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-/* function graph compares on total time */
+/* function graph compares on total time in reverse order */
 static int function_stat_cmp(void *p1, void *p2)
 {
 	struct ftrace_profile *a = p1;
 	struct ftrace_profile *b = p2;
 
-	if (a->time < b->time)
-		return -1;
 	if (a->time > b->time)
+		return -1;
+	if (a->time < b->time)
 		return 1;
 	else
 		return 0;
 }
 #else
-/* not function graph compares against hits */
+/* not function graph compares against hits in reverse order */
 static int function_stat_cmp(void *p1, void *p2)
 {
 	struct ftrace_profile *a = p1;
 	struct ftrace_profile *b = p2;
 
-	if (a->counter < b->counter)
-		return -1;
 	if (a->counter > b->counter)
+		return -1;
+	if (a->counter < b->counter)
 		return 1;
 	else
 		return 0;
diff --git a/kernel/trace/trace_branch.c b/kernel/trace/trace_branch.c
index 7a7a9fd..df58411 100644
--- a/kernel/trace/trace_branch.c
+++ b/kernel/trace/trace_branch.c
@@ -301,9 +301,10 @@ static int annotated_branch_stat_cmp(void *p1, void *p2)
 	percent_a = get_incorrect_percent(a);
 	percent_b = get_incorrect_percent(b);
 
-	if (percent_a < percent_b)
-		return -1;
+	/* sort in descending order */
 	if (percent_a > percent_b)
+		return -1;
+	if (percent_a < percent_b)
 		return 1;
 	else
 		return 0;
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index 2e849b5..6efbcb4 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -98,10 +98,6 @@ insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
 {
 	struct rb_node **new = &(root->rb_node), *parent = NULL;
 
-	/*
-	 * Figure out where to put new node
-	 * This is a descendent sorting
-	 */
 	while (*new) {
 		struct stat_node *this;
 		int result;
@@ -110,7 +106,7 @@ insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
 		result = cmp(data->stat, this->stat);
 
 		parent = *new;
-		if (result >= 0)
+		if (result < 0)
 			new = &((*new)->rb_left);
 		else
 			new = &((*new)->rb_right);
-- 
1.5.4.rc3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/3] tracing/stat: simplify rbtree freeing code
  2009-05-25  8:46 [PATCH 1/3] tracing/stat: sort in ascending order Li Zefan
@ 2009-05-25  8:46 ` Li Zefan
  2009-05-25 15:59   ` Frederic Weisbecker
  2009-05-25  8:46 ` [PATCH 3/3] tracing/stat: do some cleanups Li Zefan
  2009-05-25 15:42 ` [PATCH 1/3] tracing/stat: sort in ascending order Frederic Weisbecker
  2 siblings, 1 reply; 9+ messages in thread
From: Li Zefan @ 2009-05-25  8:46 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Frederic Weisbecker, Steven Rostedt, LKML

When closing a trace_stat file, we destroy the rbtree constructed during
file open, but there is a memory leak: the root node is never freed:

static struct rb_node *release_next(struct rb_node *node)
{
	...
	else {
		if (!parent)		<-- here we leak @node
			return NULL;
	...
}

This patch keeps removing the root node until the tree is empty. We regress
from O(n) to O(n log n), but since both open() and read() are O(n log n) and
it's a slow path, this change won't affect scalability.
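
For illustration, the post-order walk the old release_next() performed can
be written in userspace with the root leak fixed (a plain binary tree with
hypothetical names stands in for the kernel rbtree; the rbtree version is
what the patch actually touches):

```c
#include <assert.h>
#include <stdlib.h>

struct node {
	struct node *left, *right, *parent;
};

/* Post-order walk: free `node` once it is a leaf and return the next
 * node to visit; returns NULL only after the root itself is freed. */
static struct node *release_next(struct node *node, int *nfreed)
{
	struct node *parent = node->parent;

	if (node->left)
		return node->left;
	if (node->right)
		return node->right;

	/* Leaf: detach it from the parent, if any, then free it.
	 * Freeing the parentless root here is exactly the leak fix. */
	if (parent) {
		if (parent->left == node)
			parent->left = NULL;
		else
			parent->right = NULL;
	}
	free(node);
	(*nfreed)++;
	return parent;
}

/* Returns the number of nodes freed. */
static int destroy_tree(struct node *root)
{
	struct node *node = root;
	int nfreed = 0;

	while (node)
		node = release_next(node, &nfreed);
	return nfreed;
}

static struct node *make_node(struct node *parent)
{
	struct node *n = calloc(1, sizeof(*n));

	assert(n);
	n->parent = parent;
	return n;
}
```

This variant keeps the teardown O(n); the patch instead trades that for
the simpler rb_erase() loop.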

[ Impact: fix memory leak when closing a trace_stat file ]

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
 kernel/trace/trace_stat.c |   39 +++++----------------------------------
 1 files changed, 5 insertions(+), 34 deletions(-)

diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index 6efbcb4..ed18701 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -42,47 +42,18 @@ static DEFINE_MUTEX(all_stat_sessions_mutex);
 /* The root directory for all stat files */
 static struct dentry		*stat_dir;
 
-/*
- * Iterate through the rbtree using a post order traversal path
- * to release the next node.
- * It won't necessary release one at each iteration
- * but it will at least advance closer to the next one
- * to be released.
- */
-static struct rb_node *release_next(struct rb_node *node)
+static void reset_stat_session(struct stat_session *session)
 {
 	struct stat_node *snode;
-	struct rb_node *parent = rb_parent(node);
-
-	if (node->rb_left)
-		return node->rb_left;
-	else if (node->rb_right)
-		return node->rb_right;
-	else {
-		if (!parent)
-			return NULL;
-		if (parent->rb_left == node)
-			parent->rb_left = NULL;
-		else
-			parent->rb_right = NULL;
+	struct rb_root *sroot = &session->stat_root;
 
-		snode = container_of(node, struct stat_node, node);
+	while (!RB_EMPTY_ROOT(sroot)) {
+		snode = rb_entry(sroot->rb_node, struct stat_node, node);
+		rb_erase(&snode->node, sroot);
 		kfree(snode);
-
-		return parent;
 	}
 }
 
-static void reset_stat_session(struct stat_session *session)
-{
-	struct rb_node *node = session->stat_root.rb_node;
-
-	while (node)
-		node = release_next(node);
-
-	session->stat_root = RB_ROOT;
-}
-
 static void destroy_session(struct stat_session *session)
 {
 	debugfs_remove(session->file);
-- 
1.5.4.rc3



* [PATCH 3/3] tracing/stat: do some cleanups
  2009-05-25  8:46 [PATCH 1/3] tracing/stat: sort in ascending order Li Zefan
  2009-05-25  8:46 ` [PATCH 2/3] tracing/stat: simplify rbtree freeing code Li Zefan
@ 2009-05-25  8:46 ` Li Zefan
  2009-05-25 15:42 ` [PATCH 1/3] tracing/stat: sort in ascending order Frederic Weisbecker
  2 siblings, 0 replies; 9+ messages in thread
From: Li Zefan @ 2009-05-25  8:46 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Frederic Weisbecker, Steven Rostedt, LKML

- remove duplicate code in stat_seq_init()
- update comments to reflect the change from stat list to stat rbtree
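
The duplicate-code removal follows a common pattern: fold the allocation
into the insert helper so every caller shares one error path. A userspace
sketch of the shape of the change (hypothetical names, a linked list
standing in for the rbtree):

```c
#include <errno.h>
#include <stdlib.h>

struct stat_node {
	void *stat;
	struct stat_node *next;
};

/* Allocation folded into insertion: each caller now has a single
 * "ret = insert_stat(...); if (ret) goto ..." error path instead of
 * duplicating its own allocate-and-check block. */
static int insert_stat(struct stat_node **head, void *stat)
{
	struct stat_node *data = calloc(1, sizeof(*data));

	if (!data)
		return -ENOMEM;
	data->stat = stat;
	data->next = *head;
	*head = data;
	return 0;
}
```

stat_seq_init() then reduces to a plain loop of insert_stat() calls, which
is what the patch implements for the rbtree.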

[ Impact: clean up ]

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
 kernel/trace/trace_stat.c |   56 +++++++++++++++++---------------------------
 1 files changed, 22 insertions(+), 34 deletions(-)

diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index ed18701..abf2ba1 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -8,7 +8,6 @@
  *
  */
 
-
 #include <linux/list.h>
 #include <linux/rbtree.h>
 #include <linux/debugfs.h>
@@ -64,10 +63,15 @@ static void destroy_session(struct stat_session *session)
 
 typedef int (*cmp_stat_t)(void *, void *);
 
-static void
-insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
+static int insert_stat(struct rb_root *root, void *stat, cmp_stat_t cmp)
 {
 	struct rb_node **new = &(root->rb_node), *parent = NULL;
+	struct stat_node *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+	data->stat = stat;
 
 	while (*new) {
 		struct stat_node *this;
@@ -85,12 +89,13 @@ insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
 
 	rb_link_node(&data->node, parent, new);
 	rb_insert_color(&data->node, root);
+	return 0;
 }
 
 /*
  * For tracers that don't provide a stat_cmp callback.
- * This one will force an immediate insertion on tail of
- * the list.
+ * This one will force an insertion on rightmost node
+ * of the tree.
  */
 static int dummy_cmp(void *p1, void *p2)
 {
@@ -98,15 +103,14 @@ static int dummy_cmp(void *p1, void *p2)
 }
 
 /*
- * Initialize the stat list at each trace_stat file opening.
+ * Initialize the stat rbtree at each trace_stat file opening.
  * All of these copies and sorting are required on all opening
  * since the stats could have changed between two file sessions.
  */
 static int stat_seq_init(struct stat_session *session)
 {
 	struct tracer_stat *ts = session->ts;
-	struct stat_node *new_entry;
-	struct rb_root *root;
+	struct rb_root *root = &session->stat_root;
 	void *stat;
 	int ret = 0;
 	int i;
@@ -121,23 +125,13 @@ static int stat_seq_init(struct stat_session *session)
 	if (!stat)
 		goto exit;
 
-	/*
-	 * The first entry. Actually this is the second, but the first
-	 * one (the stat_list head) is pointless.
-	 */
-	new_entry = kzalloc(sizeof(*new_entry), GFP_KERNEL);
-	if (!new_entry) {
-		ret = -ENOMEM;
+	ret = insert_stat(root, stat, ts->stat_cmp);
+	if (ret)
 		goto exit;
-	}
-	root = &session->stat_root;
-	insert_stat(root, new_entry, dummy_cmp);
-
-	new_entry->stat = stat;
 
 	/*
-	 * Iterate over the tracer stat entries and store them in a sorted
-	 * list.
+	 * Iterate over the tracer stat entries and store them
+	 * in an rbtree.
 	 */
 	for (i = 1; ; i++) {
 		stat = ts->stat_next(stat, i);
@@ -146,22 +140,16 @@ static int stat_seq_init(struct stat_session *session)
 		if (!stat)
 			break;
 
-		new_entry = kzalloc(sizeof(*new_entry), GFP_KERNEL);
-		if (!new_entry) {
-			ret = -ENOMEM;
-			goto exit_free_list;
-		}
-
-		new_entry->stat = stat;
-
-		insert_stat(root, new_entry, ts->stat_cmp);
+		ret = insert_stat(root, stat, ts->stat_cmp);
+		if (ret)
+			goto exit_free_rbtree;
 	}
 
 exit:
 	mutex_unlock(&session->stat_mutex);
 	return ret;
 
-exit_free_list:
+exit_free_rbtree:
 	reset_stat_session(session);
 	mutex_unlock(&session->stat_mutex);
 	return ret;
@@ -174,7 +162,7 @@ static void *stat_seq_start(struct seq_file *s, loff_t *pos)
 	struct rb_node *node;
 	int i;
 
-	/* Prevent from tracer switch or stat_list modification */
+	/* Prevent from tracer switch or rbtree modification */
 	mutex_lock(&session->stat_mutex);
 
 	/* If we are in the beginning of the file, print the headers */
@@ -252,7 +240,7 @@ static int tracing_stat_open(struct inode *inode, struct file *file)
 }
 
 /*
- * Avoid consuming memory with our now useless list.
+ * Avoid consuming memory with our now useless rbtree.
  */
 static int tracing_stat_release(struct inode *i, struct file *f)
 {
-- 
1.5.4.rc3



* Re: [PATCH 1/3] tracing/stat: sort in ascending order
  2009-05-25  8:46 [PATCH 1/3] tracing/stat: sort in ascending order Li Zefan
  2009-05-25  8:46 ` [PATCH 2/3] tracing/stat: simplify rbtree freeing code Li Zefan
  2009-05-25  8:46 ` [PATCH 3/3] tracing/stat: do some cleanups Li Zefan
@ 2009-05-25 15:42 ` Frederic Weisbecker
  2009-05-26  1:09   ` Li Zefan
  2 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2009-05-25 15:42 UTC (permalink / raw)
  To: Li Zefan; +Cc: Ingo Molnar, Steven Rostedt, LKML

On Mon, May 25, 2009 at 04:46:09PM +0800, Li Zefan wrote:
> Currently the output of trace_stat/workqueues is totally reversed:
> 
>  # cat /debug/tracing/trace_stat/workqueues
>     ...
>     1       17       17      210       37   `-blk_unplug_work+0x0/0x57
>     1     3779     3779      181       11   |-cfq_kick_queue+0x0/0x2f
>     1     3796     3796                     kblockd/1:120
>     ...
> 
> The correct output should be:
> 
>     1     3796     3796                     kblockd/1:120
>     1     3779     3779      181       11   |-cfq_kick_queue+0x0/0x2f
>     1       17       17      210       37   `-blk_unplug_work+0x0/0x57
> 
> It's caused by "tracing/stat: replace linked list by an rbtree for sorting"
> (53059c9b67a62a3dc8c80204d3da42b9267ea5a0).
> 
> Though we can simply change dummy_cmp() to return -1 instead of 1, IMO
> it's better to always do ascending sorting in trace_stat.c, and leave each
> stat tracer to decide whether to sort in descending or ascending order.
> 
> [ Impact: fix the output of trace_stat/workqueue ]
> 
> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>


For now in stat tracing, descending sorting is the most relevant,
especially because we always want to see the biggest problems first.

-1 (or < 0) usually means lower and 1 (or > 0) means higher.

I wonder what would most confuse the developers of stat tracers:

- reversing these common sort values (-1 turns into "higher")
- keeping the default ascending sorting, which is not natural because the
  default is often descending.

I don't know.

Anyone else, do you have a preference?

Thanks,

Frederic.


> ---
>  kernel/trace/ftrace.c       |   12 ++++++------
>  kernel/trace/trace_branch.c |    5 +++--
>  kernel/trace/trace_stat.c   |    6 +-----
>  3 files changed, 10 insertions(+), 13 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 140699a..3dd16bd 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -315,29 +315,29 @@ static void *function_stat_start(struct tracer_stat *trace)
>  }
>  
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -/* function graph compares on total time */
> +/* function graph compares on total time in reverse order */
>  static int function_stat_cmp(void *p1, void *p2)
>  {
>  	struct ftrace_profile *a = p1;
>  	struct ftrace_profile *b = p2;
>  
> -	if (a->time < b->time)
> -		return -1;
>  	if (a->time > b->time)
> +		return -1;
> +	if (a->time < b->time)
>  		return 1;
>  	else
>  		return 0;
>  }
>  #else
> -/* not function graph compares against hits */
> +/* not function graph compares against hits in reverse order */
>  static int function_stat_cmp(void *p1, void *p2)
>  {
>  	struct ftrace_profile *a = p1;
>  	struct ftrace_profile *b = p2;
>  
> -	if (a->counter < b->counter)
> -		return -1;
>  	if (a->counter > b->counter)
> +		return -1;
> +	if (a->counter < b->counter)
>  		return 1;
>  	else
>  		return 0;
> diff --git a/kernel/trace/trace_branch.c b/kernel/trace/trace_branch.c
> index 7a7a9fd..df58411 100644
> --- a/kernel/trace/trace_branch.c
> +++ b/kernel/trace/trace_branch.c
> @@ -301,9 +301,10 @@ static int annotated_branch_stat_cmp(void *p1, void *p2)
>  	percent_a = get_incorrect_percent(a);
>  	percent_b = get_incorrect_percent(b);
>  
> -	if (percent_a < percent_b)
> -		return -1;
> +	/* sort in descending order */
>  	if (percent_a > percent_b)
> +		return -1;
> +	if (percent_a < percent_b)
>  		return 1;
>  	else
>  		return 0;
> diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
> index 2e849b5..6efbcb4 100644
> --- a/kernel/trace/trace_stat.c
> +++ b/kernel/trace/trace_stat.c
> @@ -98,10 +98,6 @@ insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
>  {
>  	struct rb_node **new = &(root->rb_node), *parent = NULL;
>  
> -	/*
> -	 * Figure out where to put new node
> -	 * This is a descendent sorting
> -	 */
>  	while (*new) {
>  		struct stat_node *this;
>  		int result;
> @@ -110,7 +106,7 @@ insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
>  		result = cmp(data->stat, this->stat);
>  
>  		parent = *new;
> -		if (result >= 0)
> +		if (result < 0)
>  			new = &((*new)->rb_left);
>  		else
>  			new = &((*new)->rb_right);
> -- 
> 1.5.4.rc3
> 



* Re: [PATCH 2/3] tracing/stat: simplify rbtree freeing code
  2009-05-25  8:46 ` [PATCH 2/3] tracing/stat: simplify rbtree freeing code Li Zefan
@ 2009-05-25 15:59   ` Frederic Weisbecker
  2009-05-26  1:16     ` Li Zefan
  0 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2009-05-25 15:59 UTC (permalink / raw)
  To: Li Zefan; +Cc: Ingo Molnar, Steven Rostedt, LKML

On Mon, May 25, 2009 at 04:46:29PM +0800, Li Zefan wrote:
> When closing a trace_stat file, we destroy the rbtree constructed during
> file open, but there is memory leak that the root node is not freed:
> 
> static struct rb_node *release_next(struct rb_node *node)
> {
> 	...
> 	else {
> 		if (!parent)		<-- here we leak @node
> 			return NULL;
> 	...
> }
> 
> This patch keeps removing root node until the tree is empty. We regress
> from O(n) to O(nlogn), but since both open() and read() are O(nlogn) and
> it's a slow path, this change won't affect scalibility.
> 
> [ Impact: fix memory leak when closing a trace_stat file ]
> 
> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
> ---
>  kernel/trace/trace_stat.c |   39 +++++----------------------------------
>  1 files changed, 5 insertions(+), 34 deletions(-)
> 
> diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
> index 6efbcb4..ed18701 100644
> --- a/kernel/trace/trace_stat.c
> +++ b/kernel/trace/trace_stat.c
> @@ -42,47 +42,18 @@ static DEFINE_MUTEX(all_stat_sessions_mutex);
>  /* The root directory for all stat files */
>  static struct dentry		*stat_dir;
>  
> -/*
> - * Iterate through the rbtree using a post order traversal path
> - * to release the next node.
> - * It won't necessary release one at each iteration
> - * but it will at least advance closer to the next one
> - * to be released.
> - */
> -static struct rb_node *release_next(struct rb_node *node)
> +static void reset_stat_session(struct stat_session *session)
>  {
>  	struct stat_node *snode;
> -	struct rb_node *parent = rb_parent(node);
> -
> -	if (node->rb_left)
> -		return node->rb_left;
> -	else if (node->rb_right)
> -		return node->rb_right;
> -	else {
> -		if (!parent)
> -			return NULL;
> -		if (parent->rb_left == node)
> -			parent->rb_left = NULL;
> -		else
> -			parent->rb_right = NULL;
> +	struct rb_root *sroot = &session->stat_root;
>  
> -		snode = container_of(node, struct stat_node, node);
> +	while (!RB_EMPTY_ROOT(sroot)) {
> +		snode = rb_entry(sroot->rb_node, struct stat_node, node);
> +		rb_erase(&snode->node, sroot);



Why not just keep the previous version, slightly
modified:


	while (node)
		node = release_next(node);

	if (!RB_EMPTY_ROOT(root)) {
		node = rb_entry(...)
		kfree(....)
		root = RB_ROOT
	}

Because the problem with rb_erase() is the wasteful color-based rebalancing
it performs, whereas here we just need to walk the tree and free
the nodes.

Hm?

Frederic.


>  		kfree(snode);
> -
> -		return parent;
>  	}
>  }
>  
> -static void reset_stat_session(struct stat_session *session)
> -{
> -	struct rb_node *node = session->stat_root.rb_node;
> -
> -	while (node)
> -		node = release_next(node);
> -
> -	session->stat_root = RB_ROOT;
> -}
> -
>  static void destroy_session(struct stat_session *session)
>  {
>  	debugfs_remove(session->file);
> -- 
> 1.5.4.rc3
> 



* Re: [PATCH 1/3] tracing/stat: sort in ascending order
  2009-05-25 15:42 ` [PATCH 1/3] tracing/stat: sort in ascending order Frederic Weisbecker
@ 2009-05-26  1:09   ` Li Zefan
  2009-05-26 20:46     ` Frederic Weisbecker
  0 siblings, 1 reply; 9+ messages in thread
From: Li Zefan @ 2009-05-26  1:09 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Ingo Molnar, Steven Rostedt, LKML

Frederic Weisbecker wrote:
> On Mon, May 25, 2009 at 04:46:09PM +0800, Li Zefan wrote:
>> Currently the output of trace_stat/workqueues is totally reversed:
>>
>>  # cat /debug/tracing/trace_stat/workqueues
>>     ...
>>     1       17       17      210       37   `-blk_unplug_work+0x0/0x57
>>     1     3779     3779      181       11   |-cfq_kick_queue+0x0/0x2f
>>     1     3796     3796                     kblockd/1:120
>>     ...
>>
>> The correct output should be:
>>
>>     1     3796     3796                     kblockd/1:120
>>     1     3779     3779      181       11   |-cfq_kick_queue+0x0/0x2f
>>     1       17       17      210       37   `-blk_unplug_work+0x0/0x57
>>
>> It's caused by "tracing/stat: replace linked list by an rbtree for sorting"
>> (53059c9b67a62a3dc8c80204d3da42b9267ea5a0).
>>
>> Though we can simply change dummy_cmp() to return -1 instead of 1, IMO
>> it's better to always do ascending sorting in trace_stat.c, and leave each
>> stat tracer to decide whether to sort in descending or ascending order.
>>
>> [ Impact: fix the output of trace_stat/workqueue ]
>>
>> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
> 
> 
> For now in stat tracing, the ascendent sorting is the most relevant.
> Especially because we always want to see the highest problems first.
> 

Yeah, I saw this.

> -1 (or < 0) usually means lower and 1 ( > 0) is higher.
> 
> I wonder what would most confuse the developers of stat tracers:
> 
> - to reverse these common sort values (-1 turn into "higher")
> - keep the default ascendent sorting, which is not natural because the default
>   is often descendent.
> 
> I don't know. 
> 
> Anyone else. Do you have a preference?
> 

When I looked into this bug, I was confused about why the sorting was
descending, though I then found out the answer.

Imagine a new stat tracer that wants to sort in ascending order: it has to
define a cmp which compares in reverse order. That seems odd and confusing.

But to be honest, I'm also not sure which is better, and it doesn't seem to
be a big issue, so I think I'll make concessions.




* Re: [PATCH 2/3] tracing/stat: simplify rbtree freeing code
  2009-05-25 15:59   ` Frederic Weisbecker
@ 2009-05-26  1:16     ` Li Zefan
  2009-05-26 19:33       ` Frederic Weisbecker
  0 siblings, 1 reply; 9+ messages in thread
From: Li Zefan @ 2009-05-26  1:16 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Ingo Molnar, Steven Rostedt, LKML

Frederic Weisbecker wrote:
> On Mon, May 25, 2009 at 04:46:29PM +0800, Li Zefan wrote:
>> When closing a trace_stat file, we destroy the rbtree constructed during
>> file open, but there is memory leak that the root node is not freed:
>>
>> static struct rb_node *release_next(struct rb_node *node)
>> {
>> 	...
>> 	else {
>> 		if (!parent)		<-- here we leak @node
>> 			return NULL;
>> 	...
>> }
>>
>> This patch keeps removing root node until the tree is empty. We regress
>> from O(n) to O(nlogn), but since both open() and read() are O(nlogn) and
>> it's a slow path, this change won't affect scalibility.
>>
>> [ Impact: fix memory leak when closing a trace_stat file ]
>>
>> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
>> ---
>>  kernel/trace/trace_stat.c |   39 +++++----------------------------------
>>  1 files changed, 5 insertions(+), 34 deletions(-)
>>
>> diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
>> index 6efbcb4..ed18701 100644
>> --- a/kernel/trace/trace_stat.c
>> +++ b/kernel/trace/trace_stat.c
>> @@ -42,47 +42,18 @@ static DEFINE_MUTEX(all_stat_sessions_mutex);
>>  /* The root directory for all stat files */
>>  static struct dentry		*stat_dir;
>>  
>> -/*
>> - * Iterate through the rbtree using a post order traversal path
>> - * to release the next node.
>> - * It won't necessary release one at each iteration
>> - * but it will at least advance closer to the next one
>> - * to be released.
>> - */
>> -static struct rb_node *release_next(struct rb_node *node)
>> +static void reset_stat_session(struct stat_session *session)
>>  {
>>  	struct stat_node *snode;
>> -	struct rb_node *parent = rb_parent(node);
>> -
>> -	if (node->rb_left)
>> -		return node->rb_left;
>> -	else if (node->rb_right)
>> -		return node->rb_right;
>> -	else {
>> -		if (!parent)
>> -			return NULL;
>> -		if (parent->rb_left == node)
>> -			parent->rb_left = NULL;
>> -		else
>> -			parent->rb_right = NULL;
>> +	struct rb_root *sroot = &session->stat_root;
>>  
>> -		snode = container_of(node, struct stat_node, node);
>> +	while (!RB_EMPTY_ROOT(sroot)) {
>> +		snode = rb_entry(sroot->rb_node, struct stat_node, node);
>> +		rb_erase(&snode->node, sroot);
> 
> 
> 
> Why not just keep the previous version but sligthly
> modified:
> 
> 
> 	while (node)
> 		node = release_next(node);
> 
> 	if (!RB_EMPTY_ROOT(root)) {
> 		node = rb_entry(...)
> 		kfree(....)
> 		root = RB_ROOT
> 	}
> 

The easiest fix:

 		if (!parent)
-			return NULL;
-		if (parent->rb_left == node)
+			;
+		else if (parent->rb_left == node)

;)

> Because the problem with rb_erase() is the wasteful color based rebalancing
> performed whereas here we just need to walk into the tree and free
> the nodes.
> 
> Hm?
> 

Less code, fewer bugs.

But actually I don't know how costly rb_erase() is; if it's really better to
avoid it, I'll send another fix.

> Frederic.
> 


* Re: [PATCH 2/3] tracing/stat: simplify rbtree freeing code
  2009-05-26  1:16     ` Li Zefan
@ 2009-05-26 19:33       ` Frederic Weisbecker
  0 siblings, 0 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2009-05-26 19:33 UTC (permalink / raw)
  To: Li Zefan; +Cc: Ingo Molnar, Steven Rostedt, LKML

On Tue, May 26, 2009 at 09:16:08AM +0800, Li Zefan wrote:
> Frederic Weisbecker wrote:
> > On Mon, May 25, 2009 at 04:46:29PM +0800, Li Zefan wrote:
> >> When closing a trace_stat file, we destroy the rbtree constructed during
> >> file open, but there is memory leak that the root node is not freed:
> >>
> >> static struct rb_node *release_next(struct rb_node *node)
> >> {
> >> 	...
> >> 	else {
> >> 		if (!parent)		<-- here we leak @node
> >> 			return NULL;
> >> 	...
> >> }
> >>
> >> This patch keeps removing root node until the tree is empty. We regress
> >> from O(n) to O(nlogn), but since both open() and read() are O(nlogn) and
> >> it's a slow path, this change won't affect scalibility.
> >>
> >> [ Impact: fix memory leak when closing a trace_stat file ]
> >>
> >> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
> >> ---
> >>  kernel/trace/trace_stat.c |   39 +++++----------------------------------
> >>  1 files changed, 5 insertions(+), 34 deletions(-)
> >>
> >> diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
> >> index 6efbcb4..ed18701 100644
> >> --- a/kernel/trace/trace_stat.c
> >> +++ b/kernel/trace/trace_stat.c
> >> @@ -42,47 +42,18 @@ static DEFINE_MUTEX(all_stat_sessions_mutex);
> >>  /* The root directory for all stat files */
> >>  static struct dentry		*stat_dir;
> >>  
> >> -/*
> >> - * Iterate through the rbtree using a post order traversal path
> >> - * to release the next node.
> >> - * It won't necessary release one at each iteration
> >> - * but it will at least advance closer to the next one
> >> - * to be released.
> >> - */
> >> -static struct rb_node *release_next(struct rb_node *node)
> >> +static void reset_stat_session(struct stat_session *session)
> >>  {
> >>  	struct stat_node *snode;
> >> -	struct rb_node *parent = rb_parent(node);
> >> -
> >> -	if (node->rb_left)
> >> -		return node->rb_left;
> >> -	else if (node->rb_right)
> >> -		return node->rb_right;
> >> -	else {
> >> -		if (!parent)
> >> -			return NULL;
> >> -		if (parent->rb_left == node)
> >> -			parent->rb_left = NULL;
> >> -		else
> >> -			parent->rb_right = NULL;
> >> +	struct rb_root *sroot = &session->stat_root;
> >>  
> >> -		snode = container_of(node, struct stat_node, node);
> >> +	while (!RB_EMPTY_ROOT(sroot)) {
> >> +		snode = rb_entry(sroot->rb_node, struct stat_node, node);
> >> +		rb_erase(&snode->node, sroot);
> > 
> > 
> > 
> > Why not just keep the previous version but sligthly
> > modified:
> > 
> > 
> > 	while (node)
> > 		node = release_next(node);
> > 
> > 	if (!RB_EMPTY_ROOT(root)) {
> > 		node = rb_entry(...)
> > 		kfree(....)
> > 		root = RB_ROOT
> > 	}
> > 
> 
> The easiest fix:
> 
>  		if (!parent)
> -			return NULL;
> -		if (parent->rb_left == node)
> +			;
> +		else if (parent->rb_left == node)
> 
> ;)



Ah, yeah :)


> 
> > Because the problem with rb_erase() is the wasteful color based rebalancing
> > performed whereas here we just need to walk into the tree and free
> > the nodes.
> > 
> > Hm?
> > 
> 
> Less code, less bugs.
> 
> But actually I don't know how costly rb_erase() is, if it's really better to
> avoid rb_erase(), I'll send another fix.



It is more costly because the tree is rebalanced after each node is erased.
I don't think the difference would really be visible, though. Not
for now at least, but it could be if a future tracer has a huge number of
entries.

 
> > Frederic.
> > 



* Re: [PATCH 1/3] tracing/stat: sort in ascending order
  2009-05-26  1:09   ` Li Zefan
@ 2009-05-26 20:46     ` Frederic Weisbecker
  0 siblings, 0 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2009-05-26 20:46 UTC (permalink / raw)
  To: Li Zefan; +Cc: Ingo Molnar, Steven Rostedt, LKML

On Tue, May 26, 2009 at 09:09:24AM +0800, Li Zefan wrote:
> Frederic Weisbecker wrote:
> > On Mon, May 25, 2009 at 04:46:09PM +0800, Li Zefan wrote:
> >> Currently the output of trace_stat/workqueues is totally reversed:
> >>
> >>  # cat /debug/tracing/trace_stat/workqueues
> >>     ...
> >>     1       17       17      210       37   `-blk_unplug_work+0x0/0x57
> >>     1     3779     3779      181       11   |-cfq_kick_queue+0x0/0x2f
> >>     1     3796     3796                     kblockd/1:120
> >>     ...
> >>
> >> The correct output should be:
> >>
> >>     1     3796     3796                     kblockd/1:120
> >>     1     3779     3779      181       11   |-cfq_kick_queue+0x0/0x2f
> >>     1       17       17      210       37   `-blk_unplug_work+0x0/0x57
> >>
> >> It's caused by "tracing/stat: replace linked list by an rbtree for sorting"
> >> (53059c9b67a62a3dc8c80204d3da42b9267ea5a0).
> >>
> >> Though we can simply change dummy_cmp() to return -1 instead of 1, IMO
> >> it's better to always do ascending sorting in trace_stat.c, and leave each
> >> stat tracer to decide whether to sort in descending or ascending order.
> >>
> >> [ Impact: fix the output of trace_stat/workqueue ]
> >>
> >> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
> > 
> > 
> > For now in stat tracing, the ascendent sorting is the most relevant.
> > Especially because we always want to see the highest problems first.
> > 
> 
> Yeah, I saw this.
> 
> > -1 (or < 0) usually means lower and 1 ( > 0) is higher.
> > 
> > I wonder what would most confuse the developers of stat tracers:
> > 
> > - to reverse these common sort values (-1 turn into "higher")
> > - keep the default ascendent sorting, which is not natural because the default
> >   is often descendent.
> > 
> > I don't know. 
> > 
> > Anyone else. Do you have a preference?
> > 
> 
> When I looked into this bug, I was confused why it's descending sorting, though
> then I found out the answer.
> 
> Imagine a new stat tracer wants to sort in ascending order, but it has to define
> a cmp which compares in reverse order. This seems to be odd and confusing.
> 
> But to be honest, I'm also not sure which is better, And it doesn't seem to be
> a big issue. So I think I'll make concessions.


Ok, so let's keep it as is and we'll see if developers complain :)



end of thread, other threads:[~2009-05-26 20:47 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-25  8:46 [PATCH 1/3] tracing/stat: sort in ascending order Li Zefan
2009-05-25  8:46 ` [PATCH 2/3] tracing/stat: simplify rbtree freeing code Li Zefan
2009-05-25 15:59   ` Frederic Weisbecker
2009-05-26  1:16     ` Li Zefan
2009-05-26 19:33       ` Frederic Weisbecker
2009-05-25  8:46 ` [PATCH 3/3] tracing/stat: do some cleanups Li Zefan
2009-05-25 15:42 ` [PATCH 1/3] tracing/stat: sort in ascending order Frederic Weisbecker
2009-05-26  1:09   ` Li Zefan
2009-05-26 20:46     ` Frederic Weisbecker
