All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] perf/sched: fix for getting task's execute time
@ 2009-12-06 10:57 Xiao Guangrong
  2009-12-06 11:05 ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-06 10:57 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, Frederic Weisbecker, Paul Mackerras, LKML

In current code, we get task's execute time by reading
"/proc/<pid>/sched" file, it's wrong if the task is created
by pthread_create(), because every thread task has same pid.
So, the correct way is reading "/proc/<ppid>/task/<tid>/sched"
file.

This patch also remove redundant include files since <sys/types.h>
is included in "perf.h"

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 tools/perf/builtin-sched.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 26b782f..b2e910e 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -13,7 +13,6 @@
 #include "util/debug.h"
 #include "util/data_map.h"
 
-#include <sys/types.h>
 #include <sys/prctl.h>
 
 #include <semaphore.h>
@@ -416,7 +415,7 @@ static u64 get_cpu_usage_nsec_parent(void)
 
 static u64 get_cpu_usage_nsec_self(void)
 {
-	char filename [] = "/proc/1234567890/sched";
+	char filename [] = "/proc/1234567890/task/1234567890/sched";
 	unsigned long msecs, nsecs;
 	char *line = NULL;
 	u64 total = 0;
@@ -425,7 +424,9 @@ static u64 get_cpu_usage_nsec_self(void)
 	FILE *file;
 	int ret;
 
-	sprintf(filename, "/proc/%d/sched", getpid());
+	sprintf(filename, "/proc/%d/task/%d/sched", getpid(),
+		(pid_t)syscall(SYS_gettid));
+
 	file = fopen(filename, "r");
 	BUG_ON(!file);
 
-- 
1.6.1.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] perf/sched: fix for getting task's execute time
  2009-12-06 10:57 [PATCH] perf/sched: fix for getting task's execute time Xiao Guangrong
@ 2009-12-06 11:05 ` Peter Zijlstra
  2009-12-06 11:06   ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2009-12-06 11:05 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Ingo Molnar, Frederic Weisbecker, Paul Mackerras, LKML

On Sun, 2009-12-06 at 18:57 +0800, Xiao Guangrong wrote:
> In current code, we get task's execute time by reading
> "/proc/<pid>/sched" file, it's wrong if the task is created
> by pthread_create(), because every thread task has same pid.
> So, the correct way is reading "/proc/<ppid>/task/<tid>/sched"
> file.
> 
> This patch also remove redundant include files since <sys/types.h>
> is included in "perf.h"

We really should not be using these proc files but instead make sure
this information gets transferred through a tracepoint or similar.

Reading these proc files is too prone to races.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] perf/sched: fix for getting task's execute time
  2009-12-06 11:05 ` Peter Zijlstra
@ 2009-12-06 11:06   ` Peter Zijlstra
  2009-12-06 11:50     ` Ingo Molnar
  2009-12-06 17:10     ` Xiao Guangrong
  0 siblings, 2 replies; 13+ messages in thread
From: Peter Zijlstra @ 2009-12-06 11:06 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Ingo Molnar, Frederic Weisbecker, Paul Mackerras, LKML

On Sun, 2009-12-06 at 12:05 +0100, Peter Zijlstra wrote:
> On Sun, 2009-12-06 at 18:57 +0800, Xiao Guangrong wrote:
> > In current code, we get task's execute time by reading
> > "/proc/<pid>/sched" file, it's wrong if the task is created
> > by pthread_create(), because every thread task has same pid.
> > So, the correct way is reading "/proc/<ppid>/task/<tid>/sched"
> > file.
> > 
> > This patch also remove redundant include files since <sys/types.h>
> > is included in "perf.h"
> 
> We really should not be using these proc files but instead make sure
> this information gets transferred through a tracepoint or similar.
> 
> Reading these proc files is too prone to races.

We can probably get the runtime by grouping a task-clock swcounter with
an appropriate other event.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] perf/sched: fix for getting task's execute time
  2009-12-06 11:06   ` Peter Zijlstra
@ 2009-12-06 11:50     ` Ingo Molnar
  2009-12-06 17:10     ` Xiao Guangrong
  1 sibling, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2009-12-06 11:50 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Xiao Guangrong, Frederic Weisbecker, Paul Mackerras, LKML


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Sun, 2009-12-06 at 12:05 +0100, Peter Zijlstra wrote:
> > On Sun, 2009-12-06 at 18:57 +0800, Xiao Guangrong wrote:
> > > In current code, we get task's execute time by reading
> > > "/proc/<pid>/sched" file, it's wrong if the task is created
> > > by pthread_create(), because every thread task has same pid.
> > > So, the correct way is reading "/proc/<ppid>/task/<tid>/sched"
> > > file.
> > > 
> > > This patch also remove redundant include files since <sys/types.h>
> > > is included in "perf.h"
> > 
> > We really should not be using these proc files but instead make sure
> > this information gets transferred through a tracepoint or similar.
> > 
> > Reading these proc files is too prone to races.

yeah. Ideally we'd want all valuable information that is available via 
/proc to be available via perf events as well. In the future it should 
be possible to run perf even without /proc mounted for example.

Furthermore it's good for consistency and simplicity as well, plus it's 
faster too to get the information from the perf syscall and mmap-ed 
ringbuffer than to read things via /proc. No need for ASCII conversion, 
fixed record formats, fast streaming and buffering, no read() overhead, 
etc.

> We can probably get the runtime by grouping a task-clock swcounter 
> with an appropriate other event.

Would be lovely.

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] perf/sched: fix for getting task's execute time
  2009-12-06 11:06   ` Peter Zijlstra
  2009-12-06 11:50     ` Ingo Molnar
@ 2009-12-06 17:10     ` Xiao Guangrong
  2009-12-07  7:20       ` [PATCH v2] perf/sched: fix for getting task's execution time Xiao Guangrong
  1 sibling, 1 reply; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-06 17:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Xiao Guangrong, Ingo Molnar, Frederic Weisbecker, Paul Mackerras, LKML

Peter Zijlstra wrote:

>> We really should not be using these proc files but instead make sure
>> this information gets transferred through a tracepoint or similar.
>>
>> Reading these proc files is too prone to races.
> 
> We can probably get the runtime by grouping a task-clock swcounter with
> an appropriate other event.
> 

Hi Peter,

Thanks your suggestion.

Actually, we can call getrusage(RUSAGE_THREAD, ru) to get current task's
execute time, and I think this is a simpler way.

I'll send v2 patch later.

Thanks,
Xiao

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2] perf/sched: fix for getting task's execution time
  2009-12-06 17:10     ` Xiao Guangrong
@ 2009-12-07  7:20       ` Xiao Guangrong
  2009-12-07  7:30         ` Ingo Molnar
  0 siblings, 1 reply; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-07  7:20 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Peter Zijlstra, Ingo Molnar, Frederic Weisbecker, Paul Mackerras,
	Török Edwin, LKML

In current code, task's execute time is got by reading
'/proc/<pid>/sched' file, it's wrong if the task is created
by pthread_create(), because every thread task has same pid.

This way also has two demerits:

1: 'perf sched replay' can't work if the kernel not compile with
   'CONFIG_SCHED_DEBUG' option
2: perf tool should depend on proc file system

So, this patch call getrusage() to get task's execution time instead
of reading /proc file

Reported-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 tools/perf/builtin-sched.c |   46 ++++++++++++++-----------------------------
 1 files changed, 15 insertions(+), 31 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 19f43fa..bd57994 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -13,7 +13,6 @@
 #include "util/debug.h"
 #include "util/data_map.h"
 
-#include <sys/types.h>
 #include <sys/prctl.h>
 
 #include <semaphore.h>
@@ -399,49 +398,34 @@ process_sched_event(struct task_desc *this_task __used, struct sched_atom *atom)
 	}
 }
 
+static u64 get_sum_time(struct rusage *ru)
+{
+	u64 sum;
+
+	sum =  ru->ru_utime.tv_sec*1e9 + ru->ru_utime.tv_usec*1e3;
+	sum += ru->ru_stime.tv_sec*1e9 + ru->ru_stime.tv_usec*1e3;
+	return sum;
+}
+
 static u64 get_cpu_usage_nsec_parent(void)
 {
 	struct rusage ru;
-	u64 sum;
 	int err;
 
 	err = getrusage(RUSAGE_SELF, &ru);
 	BUG_ON(err);
 
-	sum =  ru.ru_utime.tv_sec*1e9 + ru.ru_utime.tv_usec*1e3;
-	sum += ru.ru_stime.tv_sec*1e9 + ru.ru_stime.tv_usec*1e3;
-
-	return sum;
+	return get_sum_time(&ru);
 }
 
 static u64 get_cpu_usage_nsec_self(void)
 {
-	char filename [] = "/proc/1234567890/sched";
-	unsigned long msecs, nsecs;
-	char *line = NULL;
-	u64 total = 0;
-	size_t len = 0;
-	ssize_t chars;
-	FILE *file;
-	int ret;
-
-	sprintf(filename, "/proc/%d/sched", getpid());
-	file = fopen(filename, "r");
-	BUG_ON(!file);
-
-	while ((chars = getline(&line, &len, file)) != -1) {
-		ret = sscanf(line, "se.sum_exec_runtime : %ld.%06ld\n",
-			&msecs, &nsecs);
-		if (ret == 2) {
-			total = msecs*1e6 + nsecs;
-			break;
-		}
-	}
-	if (line)
-		free(line);
-	fclose(file);
+	struct rusage ru;
+	int err;
 
-	return total;
+	err = getrusage(RUSAGE_THREAD, &ru);
+	BUG_ON(err);
+	return get_sum_time(&ru);
 }
 
 static void *thread_func(void *ctx)
-- 
1.6.1.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf/sched: fix for getting task's execution time
  2009-12-07  7:20       ` [PATCH v2] perf/sched: fix for getting task's execution time Xiao Guangrong
@ 2009-12-07  7:30         ` Ingo Molnar
  2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
  0 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2009-12-07  7:30 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML


* Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> wrote:

> In current code, task's execute time is got by reading
> '/proc/<pid>/sched' file, it's wrong if the task is created
> by pthread_create(), because every thread task has same pid.
> 
> This way also has two demerits:
> 
> 1: 'perf sched replay' can't work if the kernel not compile with
>    'CONFIG_SCHED_DEBUG' option
> 2: perf tool should depend on proc file system
> 
> So, this patch call getrusage() to get task's execution time instead
> of reading /proc file

ok, that's better than /proc, but how about using 
PERF_COUNT_SW_TASK_CLOCK instead of rusage, to recover the CPU time 
used? It would be more precise, and it would use the perf API. (Some 
helper functions in tools/perf/lib/ would be nice to make it as easy to 
use as rusage (or even easier if possible).)

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3] perf/sched: fix for getting task's execution time
  2009-12-07  7:30         ` Ingo Molnar
@ 2009-12-09  9:51           ` Xiao Guangrong
  2009-12-09  9:54             ` Xiao Guangrong
                               ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-09  9:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML

In current code, task's execute time is got by reading
'/proc/<pid>/sched' file, it's wrong if the task is created
by pthread_create(), because every thread task has same pid.

This way also has two demerits:

1: 'perf sched replay' can't work if the kernel not compile
    with 'CONFIG_SCHED_DEBUG' option
2: perf tool should depend on proc file system

So, this patch use PERF_COUNT_SW_TASK_CLOCK to get task's
execution time instead of reading /proc file

Changelog v2 -> v3:
use PERF_COUNT_SW_TASK_CLOCK instead of rusage() as Ingo's suggestion

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 tools/perf/builtin-sched.c |   55 +++++++++++++++++++++----------------------
 1 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 19f43fa..b12b23a 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -13,7 +13,6 @@
 #include "util/debug.h"
 #include "util/data_map.h"
 
-#include <sys/types.h>
 #include <sys/prctl.h>
 
 #include <semaphore.h>
@@ -414,34 +413,33 @@ static u64 get_cpu_usage_nsec_parent(void)
 	return sum;
 }
 
-static u64 get_cpu_usage_nsec_self(void)
+static int self_open_counters(void)
 {
-	char filename [] = "/proc/1234567890/sched";
-	unsigned long msecs, nsecs;
-	char *line = NULL;
-	u64 total = 0;
-	size_t len = 0;
-	ssize_t chars;
-	FILE *file;
-	int ret;
+	struct perf_event_attr attr;
+	int fd;
 
-	sprintf(filename, "/proc/%d/sched", getpid());
-	file = fopen(filename, "r");
-	BUG_ON(!file);
+	memset(&attr, 0, sizeof(attr));
 
-	while ((chars = getline(&line, &len, file)) != -1) {
-		ret = sscanf(line, "se.sum_exec_runtime : %ld.%06ld\n",
-			&msecs, &nsecs);
-		if (ret == 2) {
-			total = msecs*1e6 + nsecs;
-			break;
-		}
-	}
-	if (line)
-		free(line);
-	fclose(file);
+	attr.type = PERF_TYPE_SOFTWARE;
+	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	return total;
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+
+	if (fd < 0)
+		die("Error: sys_perf_event_open() syscall returned"
+		    "with %d (%s)\n", fd, strerror(errno));
+	return fd;
+}
+
+static u64 get_cpu_usage_nsec_self(int fd)
+{
+	u64 runtime;
+	int ret;
+
+	ret = read(fd, &runtime, sizeof(runtime));
+	BUG_ON(ret != sizeof(runtime));
+
+	return runtime;
 }
 
 static void *thread_func(void *ctx)
@@ -450,9 +448,11 @@ static void *thread_func(void *ctx)
 	u64 cpu_usage_0, cpu_usage_1;
 	unsigned long i, ret;
 	char comm2[22];
+	int fd;
 
 	sprintf(comm2, ":%s", this_task->comm);
 	prctl(PR_SET_NAME, comm2);
+	fd = self_open_counters();
 
 again:
 	ret = sem_post(&this_task->ready_for_work);
@@ -462,16 +462,15 @@ again:
 	ret = pthread_mutex_unlock(&start_work_mutex);
 	BUG_ON(ret);
 
-	cpu_usage_0 = get_cpu_usage_nsec_self();
+	cpu_usage_0 = get_cpu_usage_nsec_self(fd);
 
 	for (i = 0; i < this_task->nr_events; i++) {
 		this_task->curr_event = i;
 		process_sched_event(this_task, this_task->atoms[i]);
 	}
 
-	cpu_usage_1 = get_cpu_usage_nsec_self();
+	cpu_usage_1 = get_cpu_usage_nsec_self(fd);
 	this_task->cpu_usage = cpu_usage_1 - cpu_usage_0;
-
 	ret = sem_post(&this_task->work_done_sem);
 	BUG_ON(ret);
 
-- 
1.6.1.2






^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] perf/sched: fix for getting task's execution time
  2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
@ 2009-12-09  9:54             ` Xiao Guangrong
  2009-12-09  9:59               ` Ingo Molnar
  2009-12-09  9:57             ` Xiao Guangrong
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-09  9:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML

Sorry, miss Reported-by, please ignore this mail, i'll resend it

Xiao Guangrong wrote:
> In current code, task's execute time is got by reading
> '/proc/<pid>/sched' file, it's wrong if the task is created
> by pthread_create(), because every thread task has same pid.
> 
> This way also has two demerits:
> 
> 1: 'perf sched replay' can't work if the kernel not compile
>     with 'CONFIG_SCHED_DEBUG' option
> 2: perf tool should depend on proc file system
> 
> So, this patch use PERF_COUNT_SW_TASK_CLOCK to get task's
> execution time instead of reading /proc file
> 
> Changelog v2 -> v3:
> use PERF_COUNT_SW_TASK_CLOCK instead of rusage() as Ingo's suggestion
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3] perf/sched: fix for getting task's execution time
  2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
  2009-12-09  9:54             ` Xiao Guangrong
@ 2009-12-09  9:57             ` Xiao Guangrong
  2009-12-09  9:57             ` Ingo Molnar
  2009-12-09 10:03             ` [tip:perf/urgent] perf sched: Fix " tip-bot for Xiao Guangrong
  3 siblings, 0 replies; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-09  9:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML

In current code, task's execute time is got by reading
'/proc/<pid>/sched' file, it's wrong if the task is created
by pthread_create(), because every thread task has same pid.

This way also has two demerits:

1: 'perf sched replay' can't work if the kernel not compile
    with 'CONFIG_SCHED_DEBUG' option
2: perf tool should depend on proc file system

So, this patch use PERF_COUNT_SW_TASK_CLOCK to get task's
execution time instead of reading /proc file

Changelog v2 -> v3:
use PERF_COUNT_SW_TASK_CLOCK instead of rusage() as Ingo's suggestion

Reported-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 tools/perf/builtin-sched.c |   55 +++++++++++++++++++++----------------------
 1 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 19f43fa..b12b23a 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -13,7 +13,6 @@
 #include "util/debug.h"
 #include "util/data_map.h"
 
-#include <sys/types.h>
 #include <sys/prctl.h>
 
 #include <semaphore.h>
@@ -414,34 +413,33 @@ static u64 get_cpu_usage_nsec_parent(void)
 	return sum;
 }
 
-static u64 get_cpu_usage_nsec_self(void)
+static int self_open_counters(void)
 {
-	char filename [] = "/proc/1234567890/sched";
-	unsigned long msecs, nsecs;
-	char *line = NULL;
-	u64 total = 0;
-	size_t len = 0;
-	ssize_t chars;
-	FILE *file;
-	int ret;
+	struct perf_event_attr attr;
+	int fd;
 
-	sprintf(filename, "/proc/%d/sched", getpid());
-	file = fopen(filename, "r");
-	BUG_ON(!file);
+	memset(&attr, 0, sizeof(attr));
 
-	while ((chars = getline(&line, &len, file)) != -1) {
-		ret = sscanf(line, "se.sum_exec_runtime : %ld.%06ld\n",
-			&msecs, &nsecs);
-		if (ret == 2) {
-			total = msecs*1e6 + nsecs;
-			break;
-		}
-	}
-	if (line)
-		free(line);
-	fclose(file);
+	attr.type = PERF_TYPE_SOFTWARE;
+	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	return total;
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+
+	if (fd < 0)
+		die("Error: sys_perf_event_open() syscall returned"
+		    "with %d (%s)\n", fd, strerror(errno));
+	return fd;
+}
+
+static u64 get_cpu_usage_nsec_self(int fd)
+{
+	u64 runtime;
+	int ret;
+
+	ret = read(fd, &runtime, sizeof(runtime));
+	BUG_ON(ret != sizeof(runtime));
+
+	return runtime;
 }
 
 static void *thread_func(void *ctx)
@@ -450,9 +448,11 @@ static void *thread_func(void *ctx)
 	u64 cpu_usage_0, cpu_usage_1;
 	unsigned long i, ret;
 	char comm2[22];
+	int fd;
 
 	sprintf(comm2, ":%s", this_task->comm);
 	prctl(PR_SET_NAME, comm2);
+	fd = self_open_counters();
 
 again:
 	ret = sem_post(&this_task->ready_for_work);
@@ -462,16 +462,15 @@ again:
 	ret = pthread_mutex_unlock(&start_work_mutex);
 	BUG_ON(ret);
 
-	cpu_usage_0 = get_cpu_usage_nsec_self();
+	cpu_usage_0 = get_cpu_usage_nsec_self(fd);
 
 	for (i = 0; i < this_task->nr_events; i++) {
 		this_task->curr_event = i;
 		process_sched_event(this_task, this_task->atoms[i]);
 	}
 
-	cpu_usage_1 = get_cpu_usage_nsec_self();
+	cpu_usage_1 = get_cpu_usage_nsec_self(fd);
 	this_task->cpu_usage = cpu_usage_1 - cpu_usage_0;
-
 	ret = sem_post(&this_task->work_done_sem);
 	BUG_ON(ret);
 
-- 
1.6.1.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] perf/sched: fix for getting task's execution time
  2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
  2009-12-09  9:54             ` Xiao Guangrong
  2009-12-09  9:57             ` Xiao Guangrong
@ 2009-12-09  9:57             ` Ingo Molnar
  2009-12-09 10:03             ` [tip:perf/urgent] perf sched: Fix " tip-bot for Xiao Guangrong
  3 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2009-12-09  9:57 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML


* Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> wrote:

> In current code, task's execute time is got by reading
> '/proc/<pid>/sched' file, it's wrong if the task is created
> by pthread_create(), because every thread task has same pid.
> 
> This way also has two demerits:
> 
> 1: 'perf sched replay' can't work if the kernel not compile
>     with 'CONFIG_SCHED_DEBUG' option
> 2: perf tool should depend on proc file system
> 
> So, this patch use PERF_COUNT_SW_TASK_CLOCK to get task's
> execution time instead of reading /proc file
> 
> Changelog v2 -> v3:
> use PERF_COUNT_SW_TASK_CLOCK instead of rusage() as Ingo's suggestion
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
> ---
>  tools/perf/builtin-sched.c |   55 +++++++++++++++++++++----------------------
>  1 files changed, 27 insertions(+), 28 deletions(-)
> 
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 19f43fa..b12b23a 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -13,7 +13,6 @@
>  #include "util/debug.h"
>  #include "util/data_map.h"
>  
> -#include <sys/types.h>
>  #include <sys/prctl.h>
>  
>  #include <semaphore.h>
> @@ -414,34 +413,33 @@ static u64 get_cpu_usage_nsec_parent(void)
>  	return sum;
>  }
>  
> -static u64 get_cpu_usage_nsec_self(void)
> +static int self_open_counters(void)
>  {
> -	char filename [] = "/proc/1234567890/sched";
> -	unsigned long msecs, nsecs;
> -	char *line = NULL;
> -	u64 total = 0;
> -	size_t len = 0;
> -	ssize_t chars;
> -	FILE *file;
> -	int ret;
> +	struct perf_event_attr attr;
> +	int fd;
>  
> -	sprintf(filename, "/proc/%d/sched", getpid());
> -	file = fopen(filename, "r");
> -	BUG_ON(!file);
> +	memset(&attr, 0, sizeof(attr));
>  
> -	while ((chars = getline(&line, &len, file)) != -1) {
> -		ret = sscanf(line, "se.sum_exec_runtime : %ld.%06ld\n",
> -			&msecs, &nsecs);
> -		if (ret == 2) {
> -			total = msecs*1e6 + nsecs;
> -			break;
> -		}
> -	}
> -	if (line)
> -		free(line);
> -	fclose(file);
> +	attr.type = PERF_TYPE_SOFTWARE;
> +	attr.config = PERF_COUNT_SW_TASK_CLOCK;
>  
> -	return total;
> +	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> +
> +	if (fd < 0)
> +		die("Error: sys_perf_event_open() syscall returned"
> +		    "with %d (%s)\n", fd, strerror(errno));
> +	return fd;
> +}
> +
> +static u64 get_cpu_usage_nsec_self(int fd)
> +{
> +	u64 runtime;
> +	int ret;
> +
> +	ret = read(fd, &runtime, sizeof(runtime));
> +	BUG_ON(ret != sizeof(runtime));
> +
> +	return runtime;
>  }
>  
>  static void *thread_func(void *ctx)
> @@ -450,9 +448,11 @@ static void *thread_func(void *ctx)
>  	u64 cpu_usage_0, cpu_usage_1;
>  	unsigned long i, ret;
>  	char comm2[22];
> +	int fd;
>  
>  	sprintf(comm2, ":%s", this_task->comm);
>  	prctl(PR_SET_NAME, comm2);
> +	fd = self_open_counters();
>  
>  again:
>  	ret = sem_post(&this_task->ready_for_work);
> @@ -462,16 +462,15 @@ again:
>  	ret = pthread_mutex_unlock(&start_work_mutex);
>  	BUG_ON(ret);
>  
> -	cpu_usage_0 = get_cpu_usage_nsec_self();
> +	cpu_usage_0 = get_cpu_usage_nsec_self(fd);
>  
>  	for (i = 0; i < this_task->nr_events; i++) {
>  		this_task->curr_event = i;
>  		process_sched_event(this_task, this_task->atoms[i]);
>  	}
>  
> -	cpu_usage_1 = get_cpu_usage_nsec_self();
> +	cpu_usage_1 = get_cpu_usage_nsec_self(fd);
>  	this_task->cpu_usage = cpu_usage_1 - cpu_usage_0;
> -
>  	ret = sem_post(&this_task->work_done_sem);
>  	BUG_ON(ret);

Very nice - and the code even got shorter a tiny bit.

Applied, thanks!

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] perf/sched: fix for getting task's execution time
  2009-12-09  9:54             ` Xiao Guangrong
@ 2009-12-09  9:59               ` Ingo Molnar
  0 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2009-12-09  9:59 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML


* Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> wrote:

> Sorry, miss Reported-by, please ignore this mail, i'll resend it

I've added Edwin's Reported-by, no need to resend.

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tip:perf/urgent] perf sched: Fix for getting task's execution time
  2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
                               ` (2 preceding siblings ...)
  2009-12-09  9:57             ` Ingo Molnar
@ 2009-12-09 10:03             ` tip-bot for Xiao Guangrong
  3 siblings, 0 replies; 13+ messages in thread
From: tip-bot for Xiao Guangrong @ 2009-12-09 10:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, paulus, hpa, mingo, peterz, edwintorok,
	xiaoguangrong, ericxiao.gr, fweisbec, tglx, mingo

Commit-ID:  c0c9e72150c07b4a6766cd24a6f685ec2dc9c343
Gitweb:     http://git.kernel.org/tip/c0c9e72150c07b4a6766cd24a6f685ec2dc9c343
Author:     Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
AuthorDate: Wed, 9 Dec 2009 17:51:30 +0800
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Wed, 9 Dec 2009 10:59:12 +0100

perf sched: Fix for getting task's execution time

In current code, task's execute time is got by reading
'/proc/<pid>/sched' file, it's wrong if the task is created
by pthread_create(), because every thread task has same pid.

This way also has two demerits:

 1: 'perf sched replay' can't work if the kernel is not
    compiled with the 'CONFIG_SCHED_DEBUG' option

 2: perf tool should depend on proc file system

So, this patch uses PERF_COUNT_SW_TASK_CLOCK to get task's
execution time instead of reading /proc file.

Changelog v2 -> v3:
use PERF_COUNT_SW_TASK_CLOCK instead of rusage() as Ingo's
suggestion

Reported-by: Torok Edwin <edwintorok@gmail.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Xiao Guangrong <ericxiao.gr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <4B1F7322.80103@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 tools/perf/builtin-sched.c |   55 +++++++++++++++++++++----------------------
 1 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 19f43fa..b12b23a 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -13,7 +13,6 @@
 #include "util/debug.h"
 #include "util/data_map.h"
 
-#include <sys/types.h>
 #include <sys/prctl.h>
 
 #include <semaphore.h>
@@ -414,34 +413,33 @@ static u64 get_cpu_usage_nsec_parent(void)
 	return sum;
 }
 
-static u64 get_cpu_usage_nsec_self(void)
+static int self_open_counters(void)
 {
-	char filename [] = "/proc/1234567890/sched";
-	unsigned long msecs, nsecs;
-	char *line = NULL;
-	u64 total = 0;
-	size_t len = 0;
-	ssize_t chars;
-	FILE *file;
-	int ret;
+	struct perf_event_attr attr;
+	int fd;
 
-	sprintf(filename, "/proc/%d/sched", getpid());
-	file = fopen(filename, "r");
-	BUG_ON(!file);
+	memset(&attr, 0, sizeof(attr));
 
-	while ((chars = getline(&line, &len, file)) != -1) {
-		ret = sscanf(line, "se.sum_exec_runtime : %ld.%06ld\n",
-			&msecs, &nsecs);
-		if (ret == 2) {
-			total = msecs*1e6 + nsecs;
-			break;
-		}
-	}
-	if (line)
-		free(line);
-	fclose(file);
+	attr.type = PERF_TYPE_SOFTWARE;
+	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	return total;
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+
+	if (fd < 0)
+		die("Error: sys_perf_event_open() syscall returned"
+		    "with %d (%s)\n", fd, strerror(errno));
+	return fd;
+}
+
+static u64 get_cpu_usage_nsec_self(int fd)
+{
+	u64 runtime;
+	int ret;
+
+	ret = read(fd, &runtime, sizeof(runtime));
+	BUG_ON(ret != sizeof(runtime));
+
+	return runtime;
 }
 
 static void *thread_func(void *ctx)
@@ -450,9 +448,11 @@ static void *thread_func(void *ctx)
 	u64 cpu_usage_0, cpu_usage_1;
 	unsigned long i, ret;
 	char comm2[22];
+	int fd;
 
 	sprintf(comm2, ":%s", this_task->comm);
 	prctl(PR_SET_NAME, comm2);
+	fd = self_open_counters();
 
 again:
 	ret = sem_post(&this_task->ready_for_work);
@@ -462,16 +462,15 @@ again:
 	ret = pthread_mutex_unlock(&start_work_mutex);
 	BUG_ON(ret);
 
-	cpu_usage_0 = get_cpu_usage_nsec_self();
+	cpu_usage_0 = get_cpu_usage_nsec_self(fd);
 
 	for (i = 0; i < this_task->nr_events; i++) {
 		this_task->curr_event = i;
 		process_sched_event(this_task, this_task->atoms[i]);
 	}
 
-	cpu_usage_1 = get_cpu_usage_nsec_self();
+	cpu_usage_1 = get_cpu_usage_nsec_self(fd);
 	this_task->cpu_usage = cpu_usage_1 - cpu_usage_0;
-
 	ret = sem_post(&this_task->work_done_sem);
 	BUG_ON(ret);
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2009-12-09 10:04 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-12-06 10:57 [PATCH] perf/sched: fix for getting task's execute time Xiao Guangrong
2009-12-06 11:05 ` Peter Zijlstra
2009-12-06 11:06   ` Peter Zijlstra
2009-12-06 11:50     ` Ingo Molnar
2009-12-06 17:10     ` Xiao Guangrong
2009-12-07  7:20       ` [PATCH v2] perf/sched: fix for getting task's execution time Xiao Guangrong
2009-12-07  7:30         ` Ingo Molnar
2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
2009-12-09  9:54             ` Xiao Guangrong
2009-12-09  9:59               ` Ingo Molnar
2009-12-09  9:57             ` Xiao Guangrong
2009-12-09  9:57             ` Ingo Molnar
2009-12-09 10:03             ` [tip:perf/urgent] perf sched: Fix " tip-bot for Xiao Guangrong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.