linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/3] perf tools: Introduce perf_evlist__start_workload_ex()
@ 2015-05-20  2:48 Namhyung Kim
  2015-05-20  2:48 ` [PATCH 2/3] perf trace: Create struct thread for command line workload Namhyung Kim
  2015-05-20  2:48 ` [PATCH 3/3] perf record: Synthesize COMM event for a " Namhyung Kim
  0 siblings, 2 replies; 4+ messages in thread
From: Namhyung Kim @ 2015-05-20  2:48 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern

The perf_evlist__start_workload_ex() does same as __start_work() but
also invokes callback which does additional work for each command.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/evlist.c | 11 +++++++++++
 tools/perf/util/evlist.h |  4 ++++
 2 files changed, 15 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index dc1dc2c181ef..bfe455ec673b 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1505,6 +1505,17 @@ int perf_evlist__start_workload(struct perf_evlist *evlist)
 	return 0;
 }
 
+int perf_evlist__start_workload_ex(struct perf_evlist *evlist,
+				   workload_callback_t callback, void *arg)
+{
+	int ret = callback(evlist, arg);
+
+	if (ret == 0)
+		ret = perf_evlist__start_workload(evlist);
+
+	return ret;
+}
+
 int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *event,
 			      struct perf_sample *sample)
 {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 955bf31b7dd3..6212f036c134 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -123,6 +123,10 @@ int perf_evlist__prepare_workload(struct perf_evlist *evlist,
 						     void *ucontext));
 int perf_evlist__start_workload(struct perf_evlist *evlist);
 
+typedef int (*workload_callback_t)(struct perf_evlist *evlist, void *arg);
+int perf_evlist__start_workload_ex(struct perf_evlist *evlist,
+				   workload_callback_t callback, void *arg);
+
 struct option;
 
 int __perf_evlist__parse_mmap_pages(unsigned int *mmap_pages, const char *str);
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/3] perf trace: Create struct thread for command line workload
  2015-05-20  2:48 [PATCH 1/3] perf tools: Introduce perf_evlist__start_workload_ex() Namhyung Kim
@ 2015-05-20  2:48 ` Namhyung Kim
  2015-05-20  2:48 ` [PATCH 3/3] perf record: Synthesize COMM event for a " Namhyung Kim
  1 sibling, 0 replies; 4+ messages in thread
From: Namhyung Kim @ 2015-05-20  2:48 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern

When perf creates a new child to profile, the events are enabled on
exec().  And in this case, it doesn't synthesize any event for the
child since they'll be generated during exec().  But there's an window
between the enabling and the event generation.  This leads to some
early event not having a comm but pid (like ':15328').

Fix it by using perf_evlist__start_workload_ex() to create
corresponding thread at the same time.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-trace.c |  4 +++-
 tools/perf/util/evlist.c   | 14 ++++++++++++++
 tools/perf/util/evlist.h   |  2 ++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index a05490d06374..e30d0ed1e60f 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2282,7 +2282,9 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 		perf_evlist__enable(evlist);
 
 	if (forks)
-		perf_evlist__start_workload(evlist);
+		perf_evlist__start_workload_ex(evlist,
+					       perf_evlist__create_workload_thread,
+					       trace->host);
 
 	trace->multiple_threads = evlist->threads->map[0] == -1 ||
 				  evlist->threads->nr > 1 ||
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index bfe455ec673b..f84305915e47 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1516,6 +1516,20 @@ int perf_evlist__start_workload_ex(struct perf_evlist *evlist,
 	return ret;
 }
 
+int perf_evlist__create_workload_thread(struct perf_evlist *evlist, void *arg)
+{
+	struct machine *machine = arg;
+	int pid = evlist->workload.pid;
+	struct thread *thread;
+
+	thread = machine__findnew_thread(machine, pid, pid);
+	if (thread == NULL)
+		return -1;
+
+	thread__set_comm(thread, program_invocation_short_name, 0);
+	return 0;
+}
+
 int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *event,
 			      struct perf_sample *sample)
 {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 6212f036c134..a02b0f4ba4f9 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -126,6 +126,8 @@ int perf_evlist__start_workload(struct perf_evlist *evlist);
 typedef int (*workload_callback_t)(struct perf_evlist *evlist, void *arg);
 int perf_evlist__start_workload_ex(struct perf_evlist *evlist,
 				   workload_callback_t callback, void *arg);
+/* workload_callbacks */
+int perf_evlist__create_workload_thread(struct perf_evlist *evlist, void *arg);
 
 struct option;
 
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 3/3] perf record: Synthesize COMM event for a command line workload
  2015-05-20  2:48 [PATCH 1/3] perf tools: Introduce perf_evlist__start_workload_ex() Namhyung Kim
  2015-05-20  2:48 ` [PATCH 2/3] perf trace: Create struct thread for command line workload Namhyung Kim
@ 2015-05-20  2:48 ` Namhyung Kim
  2015-05-20  7:08   ` Jiri Olsa
  1 sibling, 1 reply; 4+ messages in thread
From: Namhyung Kim @ 2015-05-20  2:48 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern

When perf creates a new child to profile, the events are enabled on
exec().  And in this case, it doesn't synthesize any event for the
child since they'll be generated during exec().  But there's an window
between the enabling and the event generation.

It used to be overcome since samples are only in kernel (so we always
have the map) and the comm is overridden by a later COMM event.
However it won't work if events are processed and displayed before the
COMM event overrides like in 'perf script'.  This leads to those early
samples (like native_write_msr_safe) not having a comm but pid (like
':15328').

So it needs to synthesize COMM event for the child explicitly before
enabling so that it can have a correct comm.  But at this time, the
comm will be "perf" since it's not exec-ed yet.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c | 41 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5dfe91395617..a0821ee92e27 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -469,6 +469,43 @@ static void workload_exec_failed_signal(int signo __maybe_unused,
 	child_finished = 1;
 }
 
+static int synthesize_workload_comm_event(struct perf_evlist *evlist, void *arg)
+{
+	union perf_event *event;
+	struct record *rec = arg;
+	struct machine *machine = &rec->session->machines.host;
+	int pid = evlist->workload.pid;
+	const char *comm_str = program_invocation_short_name;
+	size_t comm_size, total_size;
+	int ret;
+
+	comm_size = PERF_ALIGN(strlen(comm_str) + 1, sizeof(u64));
+	total_size = sizeof(event->comm) + machine->id_hdr_size;
+	/*
+	 * (aligned) comm size might be smaller than expected size
+	 * (i.e.  size of event->comm.comm[]), in that case it needs
+	 * to shrink the total size.
+	 */
+	if (comm_size < sizeof(event->comm.comm))
+		total_size -= sizeof(event->comm.comm) - comm_size;
+
+	event = zalloc(total_size);
+	if (event == NULL)
+		return -ENOMEM;
+
+	event->comm.header.type = PERF_RECORD_COMM;
+	event->comm.header.size = total_size;
+
+	event->comm.pid = pid;
+	event->comm.tid = pid;
+	strncpy(event->comm.comm, comm_str, comm_size);
+
+	ret = record__write(rec, event, total_size);
+
+	free(event);
+	return ret;
+}
+
 static void snapshot_sig_handler(int sig);
 
 static int __cmd_record(struct record *rec, int argc, const char **argv)
@@ -627,7 +664,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	 * Let the child rip
 	 */
 	if (forks)
-		perf_evlist__start_workload(rec->evlist);
+		perf_evlist__start_workload_ex(rec->evlist,
+					       synthesize_workload_comm_event,
+					       rec);
 
 	if (opts->initial_delay) {
 		usleep(opts->initial_delay * 1000);
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 3/3] perf record: Synthesize COMM event for a command line workload
  2015-05-20  2:48 ` [PATCH 3/3] perf record: Synthesize COMM event for a " Namhyung Kim
@ 2015-05-20  7:08   ` Jiri Olsa
  0 siblings, 0 replies; 4+ messages in thread
From: Jiri Olsa @ 2015-05-20  7:08 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML, David Ahern

On Wed, May 20, 2015 at 11:48:35AM +0900, Namhyung Kim wrote:
> When perf creates a new child to profile, the events are enabled on
> exec().  And in this case, it doesn't synthesize any event for the
> child since they'll be generated during exec().  But there's an window
> between the enabling and the event generation.
> 
> It used to be overcome since samples are only in kernel (so we always
> have the map) and the comm is overridden by a later COMM event.
> However it won't work if events are processed and displayed before the
> COMM event overrides like in 'perf script'.  This leads to those early
> samples (like native_write_msr_safe) not having a comm but pid (like
> ':15328').
> 
> So it needs to synthesize COMM event for the child explicitly before
> enabling so that it can have a correct comm.  But at this time, the
> comm will be "perf" since it's not exec-ed yet.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/builtin-record.c | 41 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 40 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 5dfe91395617..a0821ee92e27 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -469,6 +469,43 @@ static void workload_exec_failed_signal(int signo __maybe_unused,
>  	child_finished = 1;
>  }
>  
> +static int synthesize_workload_comm_event(struct perf_evlist *evlist, void *arg)
> +{
> +	union perf_event *event;
> +	struct record *rec = arg;
> +	struct machine *machine = &rec->session->machines.host;
> +	int pid = evlist->workload.pid;
> +	const char *comm_str = program_invocation_short_name;

never heard of program_invocation_short_name ;-) nice..

for the patchset:
Acked-by: Jiri Olsa <jolsa@kernel.org>

I tested the record/script use case.. works ok

thanks,
jirka

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-05-20  7:08 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-20  2:48 [PATCH 1/3] perf tools: Introduce perf_evlist__start_workload_ex() Namhyung Kim
2015-05-20  2:48 ` [PATCH 2/3] perf trace: Create struct thread for command line workload Namhyung Kim
2015-05-20  2:48 ` [PATCH 3/3] perf record: Synthesize COMM event for a " Namhyung Kim
2015-05-20  7:08   ` Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).