* [PATCH 1/3] tracing/stat: sort in ascending order
@ 2009-05-25 8:46 Li Zefan
2009-05-25 8:46 ` [PATCH 2/3] tracing/stat: simplify rbtree freeing code Li Zefan
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Li Zefan @ 2009-05-25 8:46 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Frederic Weisbecker, Steven Rostedt, LKML
Currently the output of trace_stat/workqueues is totally reversed:
# cat /debug/tracing/trace_stat/workqueues
...
1 17 17 210 37 `-blk_unplug_work+0x0/0x57
1 3779 3779 181 11 |-cfq_kick_queue+0x0/0x2f
1 3796 3796 kblockd/1:120
...
The correct output should be:
1 3796 3796 kblockd/1:120
1 3779 3779 181 11 |-cfq_kick_queue+0x0/0x2f
1 17 17 210 37 `-blk_unplug_work+0x0/0x57
It's caused by "tracing/stat: replace linked list by an rbtree for sorting"
(53059c9b67a62a3dc8c80204d3da42b9267ea5a0).
Though we could simply change dummy_cmp() to return -1 instead of 1, IMO
it's better to always sort in ascending order in trace_stat.c, and let each
stat tracer decide whether to sort in descending or ascending order.
[ Impact: fix the output of trace_stat/workqueue ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
kernel/trace/ftrace.c | 12 ++++++------
kernel/trace/trace_branch.c | 5 +++--
kernel/trace/trace_stat.c | 6 +-----
3 files changed, 10 insertions(+), 13 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 140699a..3dd16bd 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -315,29 +315,29 @@ static void *function_stat_start(struct tracer_stat *trace)
}
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-/* function graph compares on total time */
+/* function graph compares on total time in reverse order */
static int function_stat_cmp(void *p1, void *p2)
{
struct ftrace_profile *a = p1;
struct ftrace_profile *b = p2;
- if (a->time < b->time)
- return -1;
if (a->time > b->time)
+ return -1;
+ if (a->time < b->time)
return 1;
else
return 0;
}
#else
-/* not function graph compares against hits */
+/* not function graph compares against hits in reverse order */
static int function_stat_cmp(void *p1, void *p2)
{
struct ftrace_profile *a = p1;
struct ftrace_profile *b = p2;
- if (a->counter < b->counter)
- return -1;
if (a->counter > b->counter)
+ return -1;
+ if (a->counter < b->counter)
return 1;
else
return 0;
diff --git a/kernel/trace/trace_branch.c b/kernel/trace/trace_branch.c
index 7a7a9fd..df58411 100644
--- a/kernel/trace/trace_branch.c
+++ b/kernel/trace/trace_branch.c
@@ -301,9 +301,10 @@ static int annotated_branch_stat_cmp(void *p1, void *p2)
percent_a = get_incorrect_percent(a);
percent_b = get_incorrect_percent(b);
- if (percent_a < percent_b)
- return -1;
+ /* sort in descending order */
if (percent_a > percent_b)
+ return -1;
+ if (percent_a < percent_b)
return 1;
else
return 0;
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index 2e849b5..6efbcb4 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -98,10 +98,6 @@ insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
{
struct rb_node **new = &(root->rb_node), *parent = NULL;
- /*
- * Figure out where to put new node
- * This is a descendent sorting
- */
while (*new) {
struct stat_node *this;
int result;
@@ -110,7 +106,7 @@ insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
result = cmp(data->stat, this->stat);
parent = *new;
- if (result >= 0)
+ if (result < 0)
new = &((*new)->rb_left);
else
new = &((*new)->rb_right);
--
1.5.4.rc3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/3] tracing/stat: simplify rbtree freeing code
2009-05-25 8:46 [PATCH 1/3] tracing/stat: sort in ascending order Li Zefan
@ 2009-05-25 8:46 ` Li Zefan
2009-05-25 15:59 ` Frederic Weisbecker
2009-05-25 8:46 ` [PATCH 3/3] tracing/stat: do some cleanups Li Zefan
2009-05-25 15:42 ` [PATCH 1/3] tracing/stat: sort in ascending order Frederic Weisbecker
2 siblings, 1 reply; 9+ messages in thread
From: Li Zefan @ 2009-05-25 8:46 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Frederic Weisbecker, Steven Rostedt, LKML
When closing a trace_stat file, we destroy the rbtree constructed during
file open, but there is a memory leak: the root node is never freed:
static struct rb_node *release_next(struct rb_node *node)
{
...
else {
if (!parent) <-- here we leak @node
return NULL;
...
}
This patch keeps removing the root node until the tree is empty. We regress
from O(n) to O(n log n), but since both open() and read() are O(n log n) and
this is a slow path, the change won't affect scalability.
[ Impact: fix memory leak when closing a trace_stat file ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
kernel/trace/trace_stat.c | 39 +++++----------------------------------
1 files changed, 5 insertions(+), 34 deletions(-)
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index 6efbcb4..ed18701 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -42,47 +42,18 @@ static DEFINE_MUTEX(all_stat_sessions_mutex);
/* The root directory for all stat files */
static struct dentry *stat_dir;
-/*
- * Iterate through the rbtree using a post order traversal path
- * to release the next node.
- * It won't necessary release one at each iteration
- * but it will at least advance closer to the next one
- * to be released.
- */
-static struct rb_node *release_next(struct rb_node *node)
+static void reset_stat_session(struct stat_session *session)
{
struct stat_node *snode;
- struct rb_node *parent = rb_parent(node);
-
- if (node->rb_left)
- return node->rb_left;
- else if (node->rb_right)
- return node->rb_right;
- else {
- if (!parent)
- return NULL;
- if (parent->rb_left == node)
- parent->rb_left = NULL;
- else
- parent->rb_right = NULL;
+ struct rb_root *sroot = &session->stat_root;
- snode = container_of(node, struct stat_node, node);
+ while (!RB_EMPTY_ROOT(sroot)) {
+ snode = rb_entry(sroot->rb_node, struct stat_node, node);
+ rb_erase(&snode->node, sroot);
kfree(snode);
-
- return parent;
}
}
-static void reset_stat_session(struct stat_session *session)
-{
- struct rb_node *node = session->stat_root.rb_node;
-
- while (node)
- node = release_next(node);
-
- session->stat_root = RB_ROOT;
-}
-
static void destroy_session(struct stat_session *session)
{
debugfs_remove(session->file);
--
1.5.4.rc3
* [PATCH 3/3] tracing/stat: do some cleanups
2009-05-25 8:46 [PATCH 1/3] tracing/stat: sort in ascending order Li Zefan
2009-05-25 8:46 ` [PATCH 2/3] tracing/stat: simplify rbtree freeing code Li Zefan
@ 2009-05-25 8:46 ` Li Zefan
2009-05-25 15:42 ` [PATCH 1/3] tracing/stat: sort in ascending order Frederic Weisbecker
2 siblings, 0 replies; 9+ messages in thread
From: Li Zefan @ 2009-05-25 8:46 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Frederic Weisbecker, Steven Rostedt, LKML
- remove duplicate code in stat_seq_init()
- update comments to reflect the change from stat list to stat rbtree
[ Impact: clean up ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
kernel/trace/trace_stat.c | 56 +++++++++++++++++---------------------------
1 files changed, 22 insertions(+), 34 deletions(-)
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index ed18701..abf2ba1 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -8,7 +8,6 @@
*
*/
-
#include <linux/list.h>
#include <linux/rbtree.h>
#include <linux/debugfs.h>
@@ -64,10 +63,15 @@ static void destroy_session(struct stat_session *session)
typedef int (*cmp_stat_t)(void *, void *);
-static void
-insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
+static int insert_stat(struct rb_root *root, void *stat, cmp_stat_t cmp)
{
struct rb_node **new = &(root->rb_node), *parent = NULL;
+ struct stat_node *data;
+
+ data = kzalloc(sizeof(*data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+ data->stat = stat;
while (*new) {
struct stat_node *this;
@@ -85,12 +89,13 @@ insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
rb_link_node(&data->node, parent, new);
rb_insert_color(&data->node, root);
+ return 0;
}
/*
* For tracers that don't provide a stat_cmp callback.
- * This one will force an immediate insertion on tail of
- * the list.
+ * This one will force an insertion on rightmost node
+ * of the tree.
*/
static int dummy_cmp(void *p1, void *p2)
{
@@ -98,15 +103,14 @@ static int dummy_cmp(void *p1, void *p2)
}
/*
- * Initialize the stat list at each trace_stat file opening.
+ * Initialize the stat rbtree at each trace_stat file opening.
* All of these copies and sorting are required on all opening
* since the stats could have changed between two file sessions.
*/
static int stat_seq_init(struct stat_session *session)
{
struct tracer_stat *ts = session->ts;
- struct stat_node *new_entry;
- struct rb_root *root;
+ struct rb_root *root = &session->stat_root;
void *stat;
int ret = 0;
int i;
@@ -121,23 +125,13 @@ static int stat_seq_init(struct stat_session *session)
if (!stat)
goto exit;
- /*
- * The first entry. Actually this is the second, but the first
- * one (the stat_list head) is pointless.
- */
- new_entry = kzalloc(sizeof(*new_entry), GFP_KERNEL);
- if (!new_entry) {
- ret = -ENOMEM;
+ ret = insert_stat(root, stat, ts->stat_cmp);
+ if (ret)
goto exit;
- }
- root = &session->stat_root;
- insert_stat(root, new_entry, dummy_cmp);
-
- new_entry->stat = stat;
/*
- * Iterate over the tracer stat entries and store them in a sorted
- * list.
+ * Iterate over the tracer stat entries and store them
+ * in an rbtree.
*/
for (i = 1; ; i++) {
stat = ts->stat_next(stat, i);
@@ -146,22 +140,16 @@ static int stat_seq_init(struct stat_session *session)
if (!stat)
break;
- new_entry = kzalloc(sizeof(*new_entry), GFP_KERNEL);
- if (!new_entry) {
- ret = -ENOMEM;
- goto exit_free_list;
- }
-
- new_entry->stat = stat;
-
- insert_stat(root, new_entry, ts->stat_cmp);
+ ret = insert_stat(root, stat, ts->stat_cmp);
+ if (ret)
+ goto exit_free_rbtree;
}
exit:
mutex_unlock(&session->stat_mutex);
return ret;
-exit_free_list:
+exit_free_rbtree:
reset_stat_session(session);
mutex_unlock(&session->stat_mutex);
return ret;
@@ -174,7 +162,7 @@ static void *stat_seq_start(struct seq_file *s, loff_t *pos)
struct rb_node *node;
int i;
- /* Prevent from tracer switch or stat_list modification */
+ /* Prevent from tracer switch or rbtree modification */
mutex_lock(&session->stat_mutex);
/* If we are in the beginning of the file, print the headers */
@@ -252,7 +240,7 @@ static int tracing_stat_open(struct inode *inode, struct file *file)
}
/*
- * Avoid consuming memory with our now useless list.
+ * Avoid consuming memory with our now useless rbtree.
*/
static int tracing_stat_release(struct inode *i, struct file *f)
{
--
1.5.4.rc3
* Re: [PATCH 1/3] tracing/stat: sort in ascending order
2009-05-25 8:46 [PATCH 1/3] tracing/stat: sort in ascending order Li Zefan
2009-05-25 8:46 ` [PATCH 2/3] tracing/stat: simplify rbtree freeing code Li Zefan
2009-05-25 8:46 ` [PATCH 3/3] tracing/stat: do some cleanups Li Zefan
@ 2009-05-25 15:42 ` Frederic Weisbecker
2009-05-26 1:09 ` Li Zefan
2 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2009-05-25 15:42 UTC (permalink / raw)
To: Li Zefan; +Cc: Ingo Molnar, Steven Rostedt, LKML
On Mon, May 25, 2009 at 04:46:09PM +0800, Li Zefan wrote:
> Currently the output of trace_stat/workqueues is totally reversed:
>
> # cat /debug/tracing/trace_stat/workqueues
> ...
> 1 17 17 210 37 `-blk_unplug_work+0x0/0x57
> 1 3779 3779 181 11 |-cfq_kick_queue+0x0/0x2f
> 1 3796 3796 kblockd/1:120
> ...
>
> The correct output should be:
>
> 1 3796 3796 kblockd/1:120
> 1 3779 3779 181 11 |-cfq_kick_queue+0x0/0x2f
> 1 17 17 210 37 `-blk_unplug_work+0x0/0x57
>
> It's caused by "tracing/stat: replace linked list by an rbtree for sorting"
> (53059c9b67a62a3dc8c80204d3da42b9267ea5a0).
>
> Though we can simply change dummy_cmp() to return -1 instead of 1, IMO
> it's better to always do ascending sorting in trace_stat.c, and leave each
> stat tracer to decide whether to sort in descending or ascending order.
>
> [ Impact: fix the output of trace_stat/workqueue ]
>
> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
For now in stat tracing, the descending sort is the most relevant,
especially because we always want to see the biggest problems first.
-1 (or < 0) usually means lower and 1 (or > 0) means higher.
I wonder what would most confuse the developers of stat tracers:
- reversing these common sort values (-1 turns into "higher")
- keeping the default ascending sort, which is not natural because the default
is often descending.
I don't know.
Anyone else, do you have a preference?
Thanks,
Frederic.
> ---
> kernel/trace/ftrace.c | 12 ++++++------
> kernel/trace/trace_branch.c | 5 +++--
> kernel/trace/trace_stat.c | 6 +-----
> 3 files changed, 10 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 140699a..3dd16bd 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -315,29 +315,29 @@ static void *function_stat_start(struct tracer_stat *trace)
> }
>
> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -/* function graph compares on total time */
> +/* function graph compares on total time in reverse order */
> static int function_stat_cmp(void *p1, void *p2)
> {
> struct ftrace_profile *a = p1;
> struct ftrace_profile *b = p2;
>
> - if (a->time < b->time)
> - return -1;
> if (a->time > b->time)
> + return -1;
> + if (a->time < b->time)
> return 1;
> else
> return 0;
> }
> #else
> -/* not function graph compares against hits */
> +/* not function graph compares against hits in reverse order */
> static int function_stat_cmp(void *p1, void *p2)
> {
> struct ftrace_profile *a = p1;
> struct ftrace_profile *b = p2;
>
> - if (a->counter < b->counter)
> - return -1;
> if (a->counter > b->counter)
> + return -1;
> + if (a->counter < b->counter)
> return 1;
> else
> return 0;
> diff --git a/kernel/trace/trace_branch.c b/kernel/trace/trace_branch.c
> index 7a7a9fd..df58411 100644
> --- a/kernel/trace/trace_branch.c
> +++ b/kernel/trace/trace_branch.c
> @@ -301,9 +301,10 @@ static int annotated_branch_stat_cmp(void *p1, void *p2)
> percent_a = get_incorrect_percent(a);
> percent_b = get_incorrect_percent(b);
>
> - if (percent_a < percent_b)
> - return -1;
> + /* sort in descending order */
> if (percent_a > percent_b)
> + return -1;
> + if (percent_a < percent_b)
> return 1;
> else
> return 0;
> diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
> index 2e849b5..6efbcb4 100644
> --- a/kernel/trace/trace_stat.c
> +++ b/kernel/trace/trace_stat.c
> @@ -98,10 +98,6 @@ insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
> {
> struct rb_node **new = &(root->rb_node), *parent = NULL;
>
> - /*
> - * Figure out where to put new node
> - * This is a descendent sorting
> - */
> while (*new) {
> struct stat_node *this;
> int result;
> @@ -110,7 +106,7 @@ insert_stat(struct rb_root *root, struct stat_node *data, cmp_stat_t cmp)
> result = cmp(data->stat, this->stat);
>
> parent = *new;
> - if (result >= 0)
> + if (result < 0)
> new = &((*new)->rb_left);
> else
> new = &((*new)->rb_right);
> --
> 1.5.4.rc3
>
* Re: [PATCH 2/3] tracing/stat: simplify rbtree freeing code
2009-05-25 8:46 ` [PATCH 2/3] tracing/stat: simplify rbtree freeing code Li Zefan
@ 2009-05-25 15:59 ` Frederic Weisbecker
2009-05-26 1:16 ` Li Zefan
0 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2009-05-25 15:59 UTC (permalink / raw)
To: Li Zefan; +Cc: Ingo Molnar, Steven Rostedt, LKML
On Mon, May 25, 2009 at 04:46:29PM +0800, Li Zefan wrote:
> When closing a trace_stat file, we destroy the rbtree constructed during
> file open, but there is a memory leak: the root node is never freed:
>
> static struct rb_node *release_next(struct rb_node *node)
> {
> ...
> else {
> if (!parent) <-- here we leak @node
> return NULL;
> ...
> }
>
> This patch keeps removing the root node until the tree is empty. We regress
> from O(n) to O(n log n), but since both open() and read() are O(n log n) and
> it's a slow path, this change won't affect scalability.
>
> [ Impact: fix memory leak when closing a trace_stat file ]
>
> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
> ---
> kernel/trace/trace_stat.c | 39 +++++----------------------------------
> 1 files changed, 5 insertions(+), 34 deletions(-)
>
> diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
> index 6efbcb4..ed18701 100644
> --- a/kernel/trace/trace_stat.c
> +++ b/kernel/trace/trace_stat.c
> @@ -42,47 +42,18 @@ static DEFINE_MUTEX(all_stat_sessions_mutex);
> /* The root directory for all stat files */
> static struct dentry *stat_dir;
>
> -/*
> - * Iterate through the rbtree using a post order traversal path
> - * to release the next node.
> - * It won't necessary release one at each iteration
> - * but it will at least advance closer to the next one
> - * to be released.
> - */
> -static struct rb_node *release_next(struct rb_node *node)
> +static void reset_stat_session(struct stat_session *session)
> {
> struct stat_node *snode;
> - struct rb_node *parent = rb_parent(node);
> -
> - if (node->rb_left)
> - return node->rb_left;
> - else if (node->rb_right)
> - return node->rb_right;
> - else {
> - if (!parent)
> - return NULL;
> - if (parent->rb_left == node)
> - parent->rb_left = NULL;
> - else
> - parent->rb_right = NULL;
> + struct rb_root *sroot = &session->stat_root;
>
> - snode = container_of(node, struct stat_node, node);
> + while (!RB_EMPTY_ROOT(sroot)) {
> + snode = rb_entry(sroot->rb_node, struct stat_node, node);
> + rb_erase(&snode->node, sroot);
Why not just keep the previous version, but slightly
modified:
while (node)
node = release_next(node);
if (!RB_EMPTY_ROOT(root)) {
node = rb_entry(...)
kfree(....)
root = RB_ROOT
}
Because the problem with rb_erase() is the wasteful color-based rebalancing
it performs, whereas here we just need to walk the tree and free
the nodes.
Hm?
Frederic.
> kfree(snode);
> -
> - return parent;
> }
> }
>
> -static void reset_stat_session(struct stat_session *session)
> -{
> - struct rb_node *node = session->stat_root.rb_node;
> -
> - while (node)
> - node = release_next(node);
> -
> - session->stat_root = RB_ROOT;
> -}
> -
> static void destroy_session(struct stat_session *session)
> {
> debugfs_remove(session->file);
> --
> 1.5.4.rc3
>
* Re: [PATCH 1/3] tracing/stat: sort in ascending order
2009-05-25 15:42 ` [PATCH 1/3] tracing/stat: sort in ascending order Frederic Weisbecker
@ 2009-05-26 1:09 ` Li Zefan
2009-05-26 20:46 ` Frederic Weisbecker
0 siblings, 1 reply; 9+ messages in thread
From: Li Zefan @ 2009-05-26 1:09 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Ingo Molnar, Steven Rostedt, LKML
Frederic Weisbecker wrote:
> On Mon, May 25, 2009 at 04:46:09PM +0800, Li Zefan wrote:
>> Currently the output of trace_stat/workqueues is totally reversed:
>>
>> # cat /debug/tracing/trace_stat/workqueues
>> ...
>> 1 17 17 210 37 `-blk_unplug_work+0x0/0x57
>> 1 3779 3779 181 11 |-cfq_kick_queue+0x0/0x2f
>> 1 3796 3796 kblockd/1:120
>> ...
>>
>> The correct output should be:
>>
>> 1 3796 3796 kblockd/1:120
>> 1 3779 3779 181 11 |-cfq_kick_queue+0x0/0x2f
>> 1 17 17 210 37 `-blk_unplug_work+0x0/0x57
>>
>> It's caused by "tracing/stat: replace linked list by an rbtree for sorting"
>> (53059c9b67a62a3dc8c80204d3da42b9267ea5a0).
>>
>> Though we can simply change dummy_cmp() to return -1 instead of 1, IMO
>> it's better to always do ascending sorting in trace_stat.c, and leave each
>> stat tracer to decide whether to sort in descending or ascending order.
>>
>> [ Impact: fix the output of trace_stat/workqueue ]
>>
>> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
>
>
> For now in stat tracing, the descending sort is the most relevant,
> especially because we always want to see the biggest problems first.
>
Yeah, I saw this.
> -1 (or < 0) usually means lower and 1 ( > 0) is higher.
>
> I wonder what would most confuse the developers of stat tracers:
>
> - reversing these common sort values (-1 turns into "higher")
> - keeping the default ascending sort, which is not natural because the default
> is often descending.
>
> I don't know.
>
> Anyone else. Do you have a preference?
>
When I looked into this bug, I was confused about why the sorting was descending,
though I then found out the answer.
Imagine a new stat tracer that wants to sort in ascending order: it would have to
define a cmp callback that compares in reverse order. That seems odd and confusing.
But to be honest, I'm also not sure which is better, and it doesn't seem to be
a big issue, so I'm willing to make concessions.
* Re: [PATCH 2/3] tracing/stat: simplify rbtree freeing code
2009-05-25 15:59 ` Frederic Weisbecker
@ 2009-05-26 1:16 ` Li Zefan
2009-05-26 19:33 ` Frederic Weisbecker
0 siblings, 1 reply; 9+ messages in thread
From: Li Zefan @ 2009-05-26 1:16 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Ingo Molnar, Steven Rostedt, LKML
Frederic Weisbecker wrote:
> On Mon, May 25, 2009 at 04:46:29PM +0800, Li Zefan wrote:
>> When closing a trace_stat file, we destroy the rbtree constructed during
>> file open, but there is a memory leak: the root node is never freed:
>>
>> static struct rb_node *release_next(struct rb_node *node)
>> {
>> ...
>> else {
>> if (!parent) <-- here we leak @node
>> return NULL;
>> ...
>> }
>>
>> This patch keeps removing the root node until the tree is empty. We regress
>> from O(n) to O(n log n), but since both open() and read() are O(n log n) and
>> it's a slow path, this change won't affect scalability.
>>
>> [ Impact: fix memory leak when closing a trace_stat file ]
>>
>> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
>> ---
>> kernel/trace/trace_stat.c | 39 +++++----------------------------------
>> 1 files changed, 5 insertions(+), 34 deletions(-)
>>
>> diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
>> index 6efbcb4..ed18701 100644
>> --- a/kernel/trace/trace_stat.c
>> +++ b/kernel/trace/trace_stat.c
>> @@ -42,47 +42,18 @@ static DEFINE_MUTEX(all_stat_sessions_mutex);
>> /* The root directory for all stat files */
>> static struct dentry *stat_dir;
>>
>> -/*
>> - * Iterate through the rbtree using a post order traversal path
>> - * to release the next node.
>> - * It won't necessary release one at each iteration
>> - * but it will at least advance closer to the next one
>> - * to be released.
>> - */
>> -static struct rb_node *release_next(struct rb_node *node)
>> +static void reset_stat_session(struct stat_session *session)
>> {
>> struct stat_node *snode;
>> - struct rb_node *parent = rb_parent(node);
>> -
>> - if (node->rb_left)
>> - return node->rb_left;
>> - else if (node->rb_right)
>> - return node->rb_right;
>> - else {
>> - if (!parent)
>> - return NULL;
>> - if (parent->rb_left == node)
>> - parent->rb_left = NULL;
>> - else
>> - parent->rb_right = NULL;
>> + struct rb_root *sroot = &session->stat_root;
>>
>> - snode = container_of(node, struct stat_node, node);
>> + while (!RB_EMPTY_ROOT(sroot)) {
>> + snode = rb_entry(sroot->rb_node, struct stat_node, node);
>> + rb_erase(&snode->node, sroot);
>
>
>
> Why not just keep the previous version but sligthly
> modified:
>
>
> while (node)
> node = release_next(node);
>
> if (!RB_EMPTY_ROOT(root)) {
> node = rb_entry(...)
> kfree(....)
> root = RB_ROOT
> }
>
The easiest fix:
if (!parent)
- return NULL;
- if (parent->rb_left == node)
+ ;
+ else if (parent->rb_left == node)
;)
> Because the problem with rb_erase() is the wasteful color-based rebalancing
> it performs, whereas here we just need to walk the tree and free
> the nodes.
>
> Hm?
>
Less code, fewer bugs.
But actually I don't know how costly rb_erase() is; if it's really better to
avoid rb_erase(), I'll send another fix.
> Frederic.
>
* Re: [PATCH 2/3] tracing/stat: simplify rbtree freeing code
2009-05-26 1:16 ` Li Zefan
@ 2009-05-26 19:33 ` Frederic Weisbecker
0 siblings, 0 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2009-05-26 19:33 UTC (permalink / raw)
To: Li Zefan; +Cc: Ingo Molnar, Steven Rostedt, LKML
On Tue, May 26, 2009 at 09:16:08AM +0800, Li Zefan wrote:
> Frederic Weisbecker wrote:
> > On Mon, May 25, 2009 at 04:46:29PM +0800, Li Zefan wrote:
> >> When closing a trace_stat file, we destroy the rbtree constructed during
> >> file open, but there is a memory leak: the root node is never freed:
> >>
> >> static struct rb_node *release_next(struct rb_node *node)
> >> {
> >> ...
> >> else {
> >> if (!parent) <-- here we leak @node
> >> return NULL;
> >> ...
> >> }
> >>
> >> This patch keeps removing the root node until the tree is empty. We regress
> >> from O(n) to O(n log n), but since both open() and read() are O(n log n) and
> >> it's a slow path, this change won't affect scalability.
> >>
> >> [ Impact: fix memory leak when closing a trace_stat file ]
> >>
> >> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
> >> ---
> >> kernel/trace/trace_stat.c | 39 +++++----------------------------------
> >> 1 files changed, 5 insertions(+), 34 deletions(-)
> >>
> >> diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
> >> index 6efbcb4..ed18701 100644
> >> --- a/kernel/trace/trace_stat.c
> >> +++ b/kernel/trace/trace_stat.c
> >> @@ -42,47 +42,18 @@ static DEFINE_MUTEX(all_stat_sessions_mutex);
> >> /* The root directory for all stat files */
> >> static struct dentry *stat_dir;
> >>
> >> -/*
> >> - * Iterate through the rbtree using a post order traversal path
> >> - * to release the next node.
> >> - * It won't necessary release one at each iteration
> >> - * but it will at least advance closer to the next one
> >> - * to be released.
> >> - */
> >> -static struct rb_node *release_next(struct rb_node *node)
> >> +static void reset_stat_session(struct stat_session *session)
> >> {
> >> struct stat_node *snode;
> >> - struct rb_node *parent = rb_parent(node);
> >> -
> >> - if (node->rb_left)
> >> - return node->rb_left;
> >> - else if (node->rb_right)
> >> - return node->rb_right;
> >> - else {
> >> - if (!parent)
> >> - return NULL;
> >> - if (parent->rb_left == node)
> >> - parent->rb_left = NULL;
> >> - else
> >> - parent->rb_right = NULL;
> >> + struct rb_root *sroot = &session->stat_root;
> >>
> >> - snode = container_of(node, struct stat_node, node);
> >> + while (!RB_EMPTY_ROOT(sroot)) {
> >> + snode = rb_entry(sroot->rb_node, struct stat_node, node);
> >> + rb_erase(&snode->node, sroot);
> >
> >
> >
> > Why not just keep the previous version but sligthly
> > modified:
> >
> >
> > while (node)
> > node = release_next(node);
> >
> > if (!RB_EMPTY_ROOT(root)) {
> > node = rb_entry(...)
> > kfree(....)
> > root = RB_ROOT
> > }
> >
>
> The easiest fix:
>
> if (!parent)
> - return NULL;
> - if (parent->rb_left == node)
> + ;
> + else if (parent->rb_left == node)
>
> ;)
Ah, yeah :)
>
> > Because the problem with rb_erase() is the wasteful color-based rebalancing
> > it performs, whereas here we just need to walk the tree and free
> > the nodes.
> >
> > Hm?
> >
>
> Less code, fewer bugs.
>
> But actually I don't know how costly rb_erase() is; if it's really better to
> avoid rb_erase(), I'll send another fix.
It is more costly because the tree is rebalanced after each node
is erased.
I don't think the difference would really be noticeable though, not
for now at least, but it could be if a future tracer has a huge mass of
entries.
> > Frederic.
> >
* Re: [PATCH 1/3] tracing/stat: sort in ascending order
2009-05-26 1:09 ` Li Zefan
@ 2009-05-26 20:46 ` Frederic Weisbecker
0 siblings, 0 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2009-05-26 20:46 UTC (permalink / raw)
To: Li Zefan; +Cc: Ingo Molnar, Steven Rostedt, LKML
On Tue, May 26, 2009 at 09:09:24AM +0800, Li Zefan wrote:
> Frederic Weisbecker wrote:
> > On Mon, May 25, 2009 at 04:46:09PM +0800, Li Zefan wrote:
> >> Currently the output of trace_stat/workqueues is totally reversed:
> >>
> >> # cat /debug/tracing/trace_stat/workqueues
> >> ...
> >> 1 17 17 210 37 `-blk_unplug_work+0x0/0x57
> >> 1 3779 3779 181 11 |-cfq_kick_queue+0x0/0x2f
> >> 1 3796 3796 kblockd/1:120
> >> ...
> >>
> >> The correct output should be:
> >>
> >> 1 3796 3796 kblockd/1:120
> >> 1 3779 3779 181 11 |-cfq_kick_queue+0x0/0x2f
> >> 1 17 17 210 37 `-blk_unplug_work+0x0/0x57
> >>
> >> It's caused by "tracing/stat: replace linked list by an rbtree for sorting"
> >> (53059c9b67a62a3dc8c80204d3da42b9267ea5a0).
> >>
> >> Though we can simply change dummy_cmp() to return -1 instead of 1, IMO
> >> it's better to always do ascending sorting in trace_stat.c, and leave each
> >> stat tracer to decide whether to sort in descending or ascending order.
> >>
> >> [ Impact: fix the output of trace_stat/workqueue ]
> >>
> >> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
> >
> >
> > For now in stat tracing, the descending sort is the most relevant,
> > especially because we always want to see the biggest problems first.
> >
>
> Yeah, I saw this.
>
> > -1 (or < 0) usually means lower and 1 ( > 0) is higher.
> >
> > I wonder what would most confuse the developers of stat tracers:
> >
> > - reversing these common sort values (-1 turns into "higher")
> > - keeping the default ascending sort, which is not natural because the default
> > is often descending.
> >
> > I don't know.
> >
> > Anyone else. Do you have a preference?
> >
>
> When I looked into this bug, I was confused about why the sorting was descending,
> though I then found out the answer.
>
> Imagine a new stat tracer that wants to sort in ascending order: it would have to
> define a cmp callback that compares in reverse order. That seems odd and confusing.
>
> But to be honest, I'm also not sure which is better, and it doesn't seem to be
> a big issue, so I'm willing to make concessions.
Ok, so let's keep it as is and we'll see if developers complain :)
Thread overview: 9+ messages
2009-05-25 8:46 [PATCH 1/3] tracing/stat: sort in ascending order Li Zefan
2009-05-25 8:46 ` [PATCH 2/3] tracing/stat: simplify rbtree freeing code Li Zefan
2009-05-25 15:59 ` Frederic Weisbecker
2009-05-26 1:16 ` Li Zefan
2009-05-26 19:33 ` Frederic Weisbecker
2009-05-25 8:46 ` [PATCH 3/3] tracing/stat: do some cleanups Li Zefan
2009-05-25 15:42 ` [PATCH 1/3] tracing/stat: sort in ascending order Frederic Weisbecker
2009-05-26 1:09 ` Li Zefan
2009-05-26 20:46 ` Frederic Weisbecker