All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] perf tools: Minor improvements in event synthesis
@ 2021-01-29  5:48 Namhyung Kim
  2021-01-29  5:48 ` [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis Namhyung Kim
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Namhyung Kim @ 2021-01-29  5:48 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, Mark Rutland, Alexander Shishkin,
	LKML, Stephane Eranian, Andi Kleen, Ian Rogers

Hello,

This is to optimize the event synthesis during perf record.

The first patch is to reduce memory usage when many threads are used.
The second is to avoid unncessary syscalls for kernel threads.  And
the last one is to reduce the number of threads to iterate when new
threads are being created at the same time.

Unfortunately there's no dramatic improvement here but I can see ~5%
gain in the 'perf bench internals synthesize' on a big machine.
(The numbers are not stable though)


Before:
  # perf bench internals synthesize --mt -M1 -I 100
  # Running 'internals/synthesize' benchmark:
  Computing performance of multi threaded perf event synthesis by
  synthesizing events on CPU 0:
    Number of synthesis threads: 1
      Average synthesis took: 68831.480 usec (+- 101.450 usec)
      Average num. events: 9982.000 (+- 0.000)
      Average time per event 6.896 usec


After:
  # perf bench internals synthesize --mt -M1 -I 100
  # Running 'internals/synthesize' benchmark:
  Computing performance of multi threaded perf event synthesis by
  synthesizing events on CPU 0:
    Number of synthesis threads: 1
      Average synthesis took: 65036.370 usec (+- 158.121 usec)
      Average num. events: 9982.000 (+- 0.000)
      Average time per event 6.515 usec


Thanks,
Namhyung


Namhyung Kim (3):
  perf tools: Use /proc/<PID>/task/<TID>/status for synthesis
  perf tools: Skip MMAP record synthesis for kernel threads
  perf tools: Use scandir() to iterate threads

 tools/perf/util/synthetic-events.c | 88 ++++++++++++++++++++----------
 1 file changed, 58 insertions(+), 30 deletions(-)

-- 
2.30.0.365.g02bc693789-goog


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis
  2021-01-29  5:48 [PATCH v2 0/3] perf tools: Minor improvements in event synthesis Namhyung Kim
@ 2021-01-29  5:48 ` Namhyung Kim
  2021-01-31 23:00   ` Jiri Olsa
  2021-01-29  5:49 ` [PATCH 2/3] perf tools: Skip MMAP record synthesis for kernel threads Namhyung Kim
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Namhyung Kim @ 2021-01-29  5:48 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, Mark Rutland, Alexander Shishkin,
	LKML, Stephane Eranian, Andi Kleen, Ian Rogers

To save memory usage, it needs to reduce number of entries in the proc
filesystem.  It's using /proc/<PID>/task directory to traverse threads
in the process and then kernel creates /proc/<PID>/task/<TID> entries.

After that it checks the thread info using the /proc/<TID>/status file
rather than /proc/<PID>/task/<TID>/status.  As far as I can see, they
are the same and contain all the info we need.

Using the latter eliminates the unnecessary /proc/<TID> entry.  This
can be useful especially a large number of threads are used in the
system.  In my experiment around 1KB of memory on average was saved
for each thread (which is not a thread group leader).

To do this, pass both pid and tid to perf_event_prepare_comm() if it
knows them.  In case it doesn't know, passing 0 as pid will do the old
way.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/synthetic-events.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 3a898520f05c..800522591dde 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -69,7 +69,7 @@ int perf_tool__process_synth_event(struct perf_tool *tool,
  * Assumes that the first 4095 bytes of /proc/pid/stat contains
  * the comm, tgid and ppid.
  */
-static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
+static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len,
 				    pid_t *tgid, pid_t *ppid)
 {
 	char bf[4096];
@@ -81,7 +81,10 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
 	*tgid = -1;
 	*ppid = -1;
 
-	snprintf(bf, sizeof(bf), "/proc/%d/status", pid);
+	if (pid)
+		snprintf(bf, sizeof(bf), "/proc/%d/task/%d/status", pid, tid);
+	else
+		snprintf(bf, sizeof(bf), "/proc/%d/status", tid);
 
 	fd = open(bf, O_RDONLY);
 	if (fd < 0) {
@@ -93,7 +96,7 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
 	close(fd);
 	if (n <= 0) {
 		pr_warning("Couldn't get COMM, tigd and ppid for pid %d\n",
-			   pid);
+			   tid);
 		return -1;
 	}
 	bf[n] = '\0';
@@ -116,27 +119,32 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
 		memcpy(comm, name, size);
 		comm[size] = '\0';
 	} else {
-		pr_debug("Name: string not found for pid %d\n", pid);
+		pr_debug("Name: string not found for pid %d\n", tid);
 	}
 
 	if (tgids) {
 		tgids += 5;  /* strlen("Tgid:") */
 		*tgid = atoi(tgids);
+
+		if (pid && pid != *tgid) {
+			pr_debug("Tgid: not match to given pid: %d vs %d\n",
+				 pid, *tgid);
+		}
 	} else {
-		pr_debug("Tgid: string not found for pid %d\n", pid);
+		pr_debug("Tgid: string not found for pid %d\n", tid);
 	}
 
 	if (ppids) {
 		ppids += 5;  /* strlen("PPid:") */
 		*ppid = atoi(ppids);
 	} else {
-		pr_debug("PPid: string not found for pid %d\n", pid);
+		pr_debug("PPid: string not found for pid %d\n", tid);
 	}
 
 	return 0;
 }
 
-static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
+static int perf_event__prepare_comm(union perf_event *event, pid_t pid, pid_t tid,
 				    struct machine *machine,
 				    pid_t *tgid, pid_t *ppid)
 {
@@ -147,7 +155,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
 	memset(&event->comm, 0, sizeof(event->comm));
 
 	if (machine__is_host(machine)) {
-		if (perf_event__get_comm_ids(pid, event->comm.comm,
+		if (perf_event__get_comm_ids(pid, tid, event->comm.comm,
 					     sizeof(event->comm.comm),
 					     tgid, ppid) != 0) {
 			return -1;
@@ -168,7 +176,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
 	event->comm.header.size = (sizeof(event->comm) -
 				(sizeof(event->comm.comm) - size) +
 				machine->id_hdr_size);
-	event->comm.tid = pid;
+	event->comm.tid = tid;
 
 	return 0;
 }
@@ -180,7 +188,7 @@ pid_t perf_event__synthesize_comm(struct perf_tool *tool,
 {
 	pid_t tgid, ppid;
 
-	if (perf_event__prepare_comm(event, pid, machine, &tgid, &ppid) != 0)
+	if (perf_event__prepare_comm(event, 0, pid, machine, &tgid, &ppid) != 0)
 		return -1;
 
 	if (perf_tool__process_synth_event(tool, event, machine, process) != 0)
@@ -746,7 +754,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 			continue;
 
 		rc = -1;
-		if (perf_event__prepare_comm(comm_event, _pid, machine,
+		if (perf_event__prepare_comm(comm_event, pid, _pid, machine,
 					     &tgid, &ppid) != 0)
 			break;
 
-- 
2.30.0.365.g02bc693789-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/3] perf tools: Skip MMAP record synthesis for kernel threads
  2021-01-29  5:48 [PATCH v2 0/3] perf tools: Minor improvements in event synthesis Namhyung Kim
  2021-01-29  5:48 ` [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis Namhyung Kim
@ 2021-01-29  5:49 ` Namhyung Kim
  2021-01-31 22:48   ` Jiri Olsa
  2021-01-29  5:49 ` [PATCH 3/3] perf tools: Use scandir() to iterate threads Namhyung Kim
  2021-01-31 22:35 ` [PATCH v2 0/3] perf tools: Minor improvements in event synthesis Jiri Olsa
  3 siblings, 1 reply; 14+ messages in thread
From: Namhyung Kim @ 2021-01-29  5:49 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, Mark Rutland, Alexander Shishkin,
	LKML, Stephane Eranian, Andi Kleen, Ian Rogers

To synthesize information to resolve sample IPs, it needs to scan task
and mmap info from the /proc filesystem.  For each process, it
opens (and reads) status and maps file respectively.  But as kernel
threads don't have memory maps so we can skip the maps file.

To find kernel threads, check "VmPeak:" line in /proc/<PID>/status
file.  It's about the peak virtual memory usage so only user-level
tasks have that.  Also check "Threads:" line (which follows the VmPeak
line whether or not it exists) to be sure it's read enough data - just
in case of deeply nested pid namespaces or large number of
supplementary groups are involved.

This is for user process:

  $ head -40 /proc/1/status
  Name:	systemd
  Umask:	0000
  State:	S (sleeping)
  Tgid:	1
  Ngid:	0
  Pid:	1
  PPid:	0
  TracerPid:	0
  Uid:	0	0	0	0
  Gid:	0	0	0	0
  FDSize:	256
  Groups:
  NStgid:	1
  NSpid:	1
  NSpgid:	1
  NSsid:	1
  VmPeak:	  234192 kB           <-- here
  VmSize:	  169964 kB
  VmLck:	       0 kB
  VmPin:	       0 kB
  VmHWM:	   29528 kB
  VmRSS:	    6104 kB
  RssAnon:	    2756 kB
  RssFile:	    3348 kB
  RssShmem:	       0 kB
  VmData:	   19776 kB
  VmStk:	    1036 kB
  VmExe:	     784 kB
  VmLib:	    9532 kB
  VmPTE:	     116 kB
  VmSwap:	    2400 kB
  HugetlbPages:	       0 kB
  CoreDumping:	0
  THP_enabled:	1
  Threads:	1                     <-- and here
  SigQ:	1/62808
  SigPnd:	0000000000000000
  ShdPnd:	0000000000000000
  SigBlk:	7be3c0fe28014a03
  SigIgn:	0000000000001000

And this is for kernel thread:

  $ head -20 /proc/2/status
  Name:	kthreadd
  Umask:	0000
  State:	S (sleeping)
  Tgid:	2
  Ngid:	0
  Pid:	2
  PPid:	0
  TracerPid:	0
  Uid:	0	0	0	0
  Gid:	0	0	0	0
  FDSize:	64
  Groups:
  NStgid:	2
  NSpid:	2
  NSpgid:	0
  NSsid:	0
  Threads:	1                     <-- here
  SigQ:	1/62808
  SigPnd:	0000000000000000
  ShdPnd:	0000000000000000

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/synthetic-events.c | 32 +++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 800522591dde..8b38228c83d8 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -70,13 +70,13 @@ int perf_tool__process_synth_event(struct perf_tool *tool,
  * the comm, tgid and ppid.
  */
 static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len,
-				    pid_t *tgid, pid_t *ppid)
+				    pid_t *tgid, pid_t *ppid, bool *kernel)
 {
 	char bf[4096];
 	int fd;
 	size_t size = 0;
 	ssize_t n;
-	char *name, *tgids, *ppids;
+	char *name, *tgids, *ppids, *vmpeak, *threads;
 
 	*tgid = -1;
 	*ppid = -1;
@@ -102,8 +102,14 @@ static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len
 	bf[n] = '\0';
 
 	name = strstr(bf, "Name:");
-	tgids = strstr(bf, "Tgid:");
-	ppids = strstr(bf, "PPid:");
+	tgids = strstr(name ?: bf, "Tgid:");
+	ppids = strstr(tgids ?: bf, "PPid:");
+	vmpeak = strstr(ppids ?: bf, "VmPeak:");
+
+	if (vmpeak)
+		threads = NULL;
+	else
+		threads = strstr(ppids ?: bf, "Threads:");
 
 	if (name) {
 		char *nl;
@@ -141,12 +147,17 @@ static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len
 		pr_debug("PPid: string not found for pid %d\n", tid);
 	}
 
+	if (!vmpeak && threads)
+		*kernel = true;
+	else
+		*kernel = false;
+
 	return 0;
 }
 
 static int perf_event__prepare_comm(union perf_event *event, pid_t pid, pid_t tid,
 				    struct machine *machine,
-				    pid_t *tgid, pid_t *ppid)
+				    pid_t *tgid, pid_t *ppid, bool *kernel)
 {
 	size_t size;
 
@@ -157,7 +168,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid, pid_t ti
 	if (machine__is_host(machine)) {
 		if (perf_event__get_comm_ids(pid, tid, event->comm.comm,
 					     sizeof(event->comm.comm),
-					     tgid, ppid) != 0) {
+					     tgid, ppid, kernel) != 0) {
 			return -1;
 		}
 	} else {
@@ -187,8 +198,10 @@ pid_t perf_event__synthesize_comm(struct perf_tool *tool,
 					 struct machine *machine)
 {
 	pid_t tgid, ppid;
+	bool kernel_thread;
 
-	if (perf_event__prepare_comm(event, 0, pid, machine, &tgid, &ppid) != 0)
+	if (perf_event__prepare_comm(event, 0, pid, machine, &tgid, &ppid,
+				     &kernel_thread) != 0)
 		return -1;
 
 	if (perf_tool__process_synth_event(tool, event, machine, process) != 0)
@@ -748,6 +761,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 	while ((dirent = readdir(tasks)) != NULL) {
 		char *end;
 		pid_t _pid;
+		bool kernel_thread;
 
 		_pid = strtol(dirent->d_name, &end, 10);
 		if (*end)
@@ -755,7 +769,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 
 		rc = -1;
 		if (perf_event__prepare_comm(comm_event, pid, _pid, machine,
-					     &tgid, &ppid) != 0)
+					     &tgid, &ppid, &kernel_thread) != 0)
 			break;
 
 		if (perf_event__synthesize_fork(tool, fork_event, _pid, tgid,
@@ -773,7 +787,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 			break;
 
 		rc = 0;
-		if (_pid == pid) {
+		if (_pid == pid && !kernel_thread) {
 			/* process the parent's maps too */
 			rc = perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
 						process, machine, mmap_data);
-- 
2.30.0.365.g02bc693789-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/3] perf tools: Use scandir() to iterate threads
  2021-01-29  5:48 [PATCH v2 0/3] perf tools: Minor improvements in event synthesis Namhyung Kim
  2021-01-29  5:48 ` [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis Namhyung Kim
  2021-01-29  5:49 ` [PATCH 2/3] perf tools: Skip MMAP record synthesis for kernel threads Namhyung Kim
@ 2021-01-29  5:49 ` Namhyung Kim
  2021-01-31 22:35 ` [PATCH v2 0/3] perf tools: Minor improvements in event synthesis Jiri Olsa
  3 siblings, 0 replies; 14+ messages in thread
From: Namhyung Kim @ 2021-01-29  5:49 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, Mark Rutland, Alexander Shishkin,
	LKML, Stephane Eranian, Andi Kleen, Ian Rogers

Like in __event__synthesize_thread(), I think it's better to use
scandir() instead of the readdir() loop.  In case some malicious task
continues to create new threads, the readdir() loop will run over and
over to collect tids.  The scandir() also has the problem but the
window is much smaller since it doesn't do much work during the
iteration.

Also add filter_task() function as we only care the tasks.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/synthetic-events.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 8b38228c83d8..334e577b8ae4 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -709,6 +709,11 @@ int perf_event__synthesize_modules(struct perf_tool *tool, perf_event__handler_t
 	return rc;
 }
 
+static int filter_task(const struct dirent *dirent)
+{
+	return isdigit(dirent->d_name[0]);
+}
+
 static int __event__synthesize_thread(union perf_event *comm_event,
 				      union perf_event *mmap_event,
 				      union perf_event *fork_event,
@@ -717,10 +722,10 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 				      struct perf_tool *tool, struct machine *machine, bool mmap_data)
 {
 	char filename[PATH_MAX];
-	DIR *tasks;
-	struct dirent *dirent;
+	struct dirent **dirent;
 	pid_t tgid, ppid;
 	int rc = 0;
+	int i, n;
 
 	/* special case: only send one comm event using passed in pid */
 	if (!full) {
@@ -752,18 +757,16 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 	snprintf(filename, sizeof(filename), "%s/proc/%d/task",
 		 machine->root_dir, pid);
 
-	tasks = opendir(filename);
-	if (tasks == NULL) {
-		pr_debug("couldn't open %s\n", filename);
-		return 0;
-	}
+	n = scandir(filename, &dirent, filter_task, alphasort);
+	if (n < 0)
+		return n;
 
-	while ((dirent = readdir(tasks)) != NULL) {
+	for (i = 0; i < n; i++) {
 		char *end;
 		pid_t _pid;
 		bool kernel_thread;
 
-		_pid = strtol(dirent->d_name, &end, 10);
+		_pid = strtol(dirent[i]->d_name, &end, 10);
 		if (*end)
 			continue;
 
@@ -796,7 +799,10 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 		}
 	}
 
-	closedir(tasks);
+	for (i = 0; i < n; i++)
+		zfree(&dirent[i]);
+	free(dirent);
+
 	return rc;
 }
 
@@ -981,7 +987,7 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
 		return 0;
 
 	snprintf(proc_path, sizeof(proc_path), "%s/proc", machine->root_dir);
-	n = scandir(proc_path, &dirent, 0, alphasort);
+	n = scandir(proc_path, &dirent, filter_task, alphasort);
 	if (n < 0)
 		return err;
 
-- 
2.30.0.365.g02bc693789-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/3] perf tools: Minor improvements in event synthesis
  2021-01-29  5:48 [PATCH v2 0/3] perf tools: Minor improvements in event synthesis Namhyung Kim
                   ` (2 preceding siblings ...)
  2021-01-29  5:49 ` [PATCH 3/3] perf tools: Use scandir() to iterate threads Namhyung Kim
@ 2021-01-31 22:35 ` Jiri Olsa
  2021-02-01  4:35   ` Namhyung Kim
  3 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2021-01-31 22:35 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers

On Fri, Jan 29, 2021 at 02:48:58PM +0900, Namhyung Kim wrote:
> Hello,
> 
> This is to optimize the event synthesis during perf record.
> 
> The first patch is to reduce memory usage when many threads are used.
> The second is to avoid unncessary syscalls for kernel threads.  And
> the last one is to reduce the number of threads to iterate when new
> threads are being created at the same time.
> 
> Unfortunately there's no dramatic improvement here but I can see ~5%
> gain in the 'perf bench internals synthesize' on a big machine.
> (The numbers are not stable though)
> 
> 
> Before:
>   # perf bench internals synthesize --mt -M1 -I 100
>   # Running 'internals/synthesize' benchmark:
>   Computing performance of multi threaded perf event synthesis by
>   synthesizing events on CPU 0:
>     Number of synthesis threads: 1
>       Average synthesis took: 68831.480 usec (+- 101.450 usec)
>       Average num. events: 9982.000 (+- 0.000)
>       Average time per event 6.896 usec
> 
> 
> After:
>   # perf bench internals synthesize --mt -M1 -I 100
>   # Running 'internals/synthesize' benchmark:
>   Computing performance of multi threaded perf event synthesis by
>   synthesizing events on CPU 0:
>     Number of synthesis threads: 1
>       Average synthesis took: 65036.370 usec (+- 158.121 usec)
>       Average num. events: 9982.000 (+- 0.000)
>       Average time per event 6.515 usec
> 
> 
> Thanks,
> Namhyung
> 
> 
> Namhyung Kim (3):
>   perf tools: Use /proc/<PID>/task/<TID>/status for synthesis
>   perf tools: Skip MMAP record synthesis for kernel threads
>   perf tools: Use scandir() to iterate threads

heya,
is there any change to previous version?

jirka

> 
>  tools/perf/util/synthetic-events.c | 88 ++++++++++++++++++++----------
>  1 file changed, 58 insertions(+), 30 deletions(-)
> 
> -- 
> 2.30.0.365.g02bc693789-goog
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] perf tools: Skip MMAP record synthesis for kernel threads
  2021-01-29  5:49 ` [PATCH 2/3] perf tools: Skip MMAP record synthesis for kernel threads Namhyung Kim
@ 2021-01-31 22:48   ` Jiri Olsa
  2021-02-01  4:54     ` Namhyung Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2021-01-31 22:48 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers

On Fri, Jan 29, 2021 at 02:49:00PM +0900, Namhyung Kim wrote:
> To synthesize information to resolve sample IPs, it needs to scan task
> and mmap info from the /proc filesystem.  For each process, it
> opens (and reads) status and maps file respectively.  But as kernel
> threads don't have memory maps so we can skip the maps file.
> 
> To find kernel threads, check "VmPeak:" line in /proc/<PID>/status
> file.  It's about the peak virtual memory usage so only user-level
> tasks have that.  Also check "Threads:" line (which follows the VmPeak
> line whether or not it exists) to be sure it's read enough data - just
> in case of deeply nested pid namespaces or large number of
> supplementary groups are involved.

I don't understand this Threads: check, could you please
show example where it's useful for the check?

thanks,
jirka

> 
> This is for user process:
> 
>   $ head -40 /proc/1/status
>   Name:	systemd
>   Umask:	0000
>   State:	S (sleeping)
>   Tgid:	1
>   Ngid:	0
>   Pid:	1
>   PPid:	0
>   TracerPid:	0
>   Uid:	0	0	0	0
>   Gid:	0	0	0	0
>   FDSize:	256
>   Groups:
>   NStgid:	1
>   NSpid:	1
>   NSpgid:	1
>   NSsid:	1
>   VmPeak:	  234192 kB           <-- here
>   VmSize:	  169964 kB
>   VmLck:	       0 kB
>   VmPin:	       0 kB
>   VmHWM:	   29528 kB
>   VmRSS:	    6104 kB
>   RssAnon:	    2756 kB
>   RssFile:	    3348 kB
>   RssShmem:	       0 kB
>   VmData:	   19776 kB
>   VmStk:	    1036 kB
>   VmExe:	     784 kB
>   VmLib:	    9532 kB
>   VmPTE:	     116 kB
>   VmSwap:	    2400 kB
>   HugetlbPages:	       0 kB
>   CoreDumping:	0
>   THP_enabled:	1
>   Threads:	1                     <-- and here
>   SigQ:	1/62808
>   SigPnd:	0000000000000000
>   ShdPnd:	0000000000000000
>   SigBlk:	7be3c0fe28014a03
>   SigIgn:	0000000000001000
> 
> And this is for kernel thread:
> 
>   $ head -20 /proc/2/status
>   Name:	kthreadd
>   Umask:	0000
>   State:	S (sleeping)
>   Tgid:	2
>   Ngid:	0
>   Pid:	2
>   PPid:	0
>   TracerPid:	0
>   Uid:	0	0	0	0
>   Gid:	0	0	0	0
>   FDSize:	64
>   Groups:
>   NStgid:	2
>   NSpid:	2
>   NSpgid:	0
>   NSsid:	0
>   Threads:	1                     <-- here
>   SigQ:	1/62808
>   SigPnd:	0000000000000000
>   ShdPnd:	0000000000000000
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/synthetic-events.c | 32 +++++++++++++++++++++---------
>  1 file changed, 23 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> index 800522591dde..8b38228c83d8 100644
> --- a/tools/perf/util/synthetic-events.c
> +++ b/tools/perf/util/synthetic-events.c
> @@ -70,13 +70,13 @@ int perf_tool__process_synth_event(struct perf_tool *tool,
>   * the comm, tgid and ppid.
>   */
>  static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len,
> -				    pid_t *tgid, pid_t *ppid)
> +				    pid_t *tgid, pid_t *ppid, bool *kernel)
>  {
>  	char bf[4096];
>  	int fd;
>  	size_t size = 0;
>  	ssize_t n;
> -	char *name, *tgids, *ppids;
> +	char *name, *tgids, *ppids, *vmpeak, *threads;
>  
>  	*tgid = -1;
>  	*ppid = -1;
> @@ -102,8 +102,14 @@ static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len
>  	bf[n] = '\0';
>  
>  	name = strstr(bf, "Name:");
> -	tgids = strstr(bf, "Tgid:");
> -	ppids = strstr(bf, "PPid:");
> +	tgids = strstr(name ?: bf, "Tgid:");
> +	ppids = strstr(tgids ?: bf, "PPid:");
> +	vmpeak = strstr(ppids ?: bf, "VmPeak:");
> +
> +	if (vmpeak)
> +		threads = NULL;
> +	else
> +		threads = strstr(ppids ?: bf, "Threads:");
>  
>  	if (name) {
>  		char *nl;
> @@ -141,12 +147,17 @@ static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len
>  		pr_debug("PPid: string not found for pid %d\n", tid);
>  	}
>  
> +	if (!vmpeak && threads)
> +		*kernel = true;
> +	else
> +		*kernel = false;
> +
>  	return 0;
>  }
>  
>  static int perf_event__prepare_comm(union perf_event *event, pid_t pid, pid_t tid,
>  				    struct machine *machine,
> -				    pid_t *tgid, pid_t *ppid)
> +				    pid_t *tgid, pid_t *ppid, bool *kernel)
>  {
>  	size_t size;
>  
> @@ -157,7 +168,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid, pid_t ti
>  	if (machine__is_host(machine)) {
>  		if (perf_event__get_comm_ids(pid, tid, event->comm.comm,
>  					     sizeof(event->comm.comm),
> -					     tgid, ppid) != 0) {
> +					     tgid, ppid, kernel) != 0) {
>  			return -1;
>  		}
>  	} else {
> @@ -187,8 +198,10 @@ pid_t perf_event__synthesize_comm(struct perf_tool *tool,
>  					 struct machine *machine)
>  {
>  	pid_t tgid, ppid;
> +	bool kernel_thread;
>  
> -	if (perf_event__prepare_comm(event, 0, pid, machine, &tgid, &ppid) != 0)
> +	if (perf_event__prepare_comm(event, 0, pid, machine, &tgid, &ppid,
> +				     &kernel_thread) != 0)
>  		return -1;
>  
>  	if (perf_tool__process_synth_event(tool, event, machine, process) != 0)
> @@ -748,6 +761,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
>  	while ((dirent = readdir(tasks)) != NULL) {
>  		char *end;
>  		pid_t _pid;
> +		bool kernel_thread;
>  
>  		_pid = strtol(dirent->d_name, &end, 10);
>  		if (*end)
> @@ -755,7 +769,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
>  
>  		rc = -1;
>  		if (perf_event__prepare_comm(comm_event, pid, _pid, machine,
> -					     &tgid, &ppid) != 0)
> +					     &tgid, &ppid, &kernel_thread) != 0)
>  			break;
>  
>  		if (perf_event__synthesize_fork(tool, fork_event, _pid, tgid,
> @@ -773,7 +787,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
>  			break;
>  
>  		rc = 0;
> -		if (_pid == pid) {
> +		if (_pid == pid && !kernel_thread) {
>  			/* process the parent's maps too */
>  			rc = perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
>  						process, machine, mmap_data);
> -- 
> 2.30.0.365.g02bc693789-goog
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis
  2021-01-29  5:48 ` [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis Namhyung Kim
@ 2021-01-31 23:00   ` Jiri Olsa
  2021-02-01  4:41     ` Namhyung Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2021-01-31 23:00 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers

On Fri, Jan 29, 2021 at 02:48:59PM +0900, Namhyung Kim wrote:
> To save memory usage, it needs to reduce number of entries in the proc
> filesystem.  It's using /proc/<PID>/task directory to traverse threads
> in the process and then kernel creates /proc/<PID>/task/<TID> entries.
> 
> After that it checks the thread info using the /proc/<TID>/status file
> rather than /proc/<PID>/task/<TID>/status.  As far as I can see, they
> are the same and contain all the info we need.
> 
> Using the latter eliminates the unnecessary /proc/<TID> entry.  This
> can be useful especially a large number of threads are used in the
> system.  In my experiment around 1KB of memory on average was saved
> for each thread (which is not a thread group leader).
> 
> To do this, pass both pid and tid to perf_event_prepare_comm() if it
> knows them.  In case it doesn't know, passing 0 as pid will do the old
> way.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/synthetic-events.c | 30 +++++++++++++++++++-----------
>  1 file changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> index 3a898520f05c..800522591dde 100644
> --- a/tools/perf/util/synthetic-events.c
> +++ b/tools/perf/util/synthetic-events.c
> @@ -69,7 +69,7 @@ int perf_tool__process_synth_event(struct perf_tool *tool,
>   * Assumes that the first 4095 bytes of /proc/pid/stat contains
>   * the comm, tgid and ppid.
>   */
> -static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
> +static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len,
>  				    pid_t *tgid, pid_t *ppid)
>  {
>  	char bf[4096];
> @@ -81,7 +81,10 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
>  	*tgid = -1;
>  	*ppid = -1;
>  
> -	snprintf(bf, sizeof(bf), "/proc/%d/status", pid);
> +	if (pid)
> +		snprintf(bf, sizeof(bf), "/proc/%d/task/%d/status", pid, tid);
> +	else
> +		snprintf(bf, sizeof(bf), "/proc/%d/status", tid);
>  
>  	fd = open(bf, O_RDONLY);
>  	if (fd < 0) {
> @@ -93,7 +96,7 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
>  	close(fd);
>  	if (n <= 0) {
>  		pr_warning("Couldn't get COMM, tigd and ppid for pid %d\n",
> -			   pid);
> +			   tid);
>  		return -1;
>  	}
>  	bf[n] = '\0';
> @@ -116,27 +119,32 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
>  		memcpy(comm, name, size);
>  		comm[size] = '\0';
>  	} else {
> -		pr_debug("Name: string not found for pid %d\n", pid);
> +		pr_debug("Name: string not found for pid %d\n", tid);
>  	}
>  
>  	if (tgids) {
>  		tgids += 5;  /* strlen("Tgid:") */
>  		*tgid = atoi(tgids);
> +
> +		if (pid && pid != *tgid) {
> +			pr_debug("Tgid: not match to given pid: %d vs %d\n",
> +				 pid, *tgid);

hm, could this actually happen in our case?

jirka


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/3] perf tools: Minor improvements in event synthesis
  2021-01-31 22:35 ` [PATCH v2 0/3] perf tools: Minor improvements in event synthesis Jiri Olsa
@ 2021-02-01  4:35   ` Namhyung Kim
  0 siblings, 0 replies; 14+ messages in thread
From: Namhyung Kim @ 2021-02-01  4:35 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers

Hi Jiri,

On Mon, Feb 1, 2021 at 7:35 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Jan 29, 2021 at 02:48:58PM +0900, Namhyung Kim wrote:
> > Hello,
> >
> > This is to optimize the event synthesis during perf record.
> >
> > The first patch is to reduce memory usage when many threads are used.
> > The second is to avoid unncessary syscalls for kernel threads.  And
> > the last one is to reduce the number of threads to iterate when new
> > threads are being created at the same time.
> >
> > Unfortunately there's no dramatic improvement here but I can see ~5%
> > gain in the 'perf bench internals synthesize' on a big machine.
> > (The numbers are not stable though)
> >
> >
> > Before:
> >   # perf bench internals synthesize --mt -M1 -I 100
> >   # Running 'internals/synthesize' benchmark:
> >   Computing performance of multi threaded perf event synthesis by
> >   synthesizing events on CPU 0:
> >     Number of synthesis threads: 1
> >       Average synthesis took: 68831.480 usec (+- 101.450 usec)
> >       Average num. events: 9982.000 (+- 0.000)
> >       Average time per event 6.896 usec
> >
> >
> > After:
> >   # perf bench internals synthesize --mt -M1 -I 100
> >   # Running 'internals/synthesize' benchmark:
> >   Computing performance of multi threaded perf event synthesis by
> >   synthesizing events on CPU 0:
> >     Number of synthesis threads: 1
> >       Average synthesis took: 65036.370 usec (+- 158.121 usec)
> >       Average num. events: 9982.000 (+- 0.000)
> >       Average time per event 6.515 usec
> >
> >
> > Thanks,
> > Namhyung
> >
> >
> > Namhyung Kim (3):
> >   perf tools: Use /proc/<PID>/task/<TID>/status for synthesis
> >   perf tools: Skip MMAP record synthesis for kernel threads
> >   perf tools: Use scandir() to iterate threads
>
> heya,
> is there any change to previous version?

No, it's just a rebase version.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis
  2021-01-31 23:00   ` Jiri Olsa
@ 2021-02-01  4:41     ` Namhyung Kim
  2021-02-01 20:41       ` Jiri Olsa
  0 siblings, 1 reply; 14+ messages in thread
From: Namhyung Kim @ 2021-02-01  4:41 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers

On Mon, Feb 1, 2021 at 8:00 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Jan 29, 2021 at 02:48:59PM +0900, Namhyung Kim wrote:
> > To save memory usage, it needs to reduce number of entries in the proc
> > filesystem.  It's using /proc/<PID>/task directory to traverse threads
> > in the process and then kernel creates /proc/<PID>/task/<TID> entries.
> >
> > After that it checks the thread info using the /proc/<TID>/status file
> > rather than /proc/<PID>/task/<TID>/status.  As far as I can see, they
> > are the same and contain all the info we need.
> >
> > Using the latter eliminates the unnecessary /proc/<TID> entry.  This
> > can be useful especially a large number of threads are used in the
> > system.  In my experiment around 1KB of memory on average was saved
> > for each thread (which is not a thread group leader).
> >
> > To do this, pass both pid and tid to perf_event_prepare_comm() if it
> > knows them.  In case it doesn't know, passing 0 as pid will do the old
> > way.
> >
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/perf/util/synthetic-events.c | 30 +++++++++++++++++++-----------
> >  1 file changed, 19 insertions(+), 11 deletions(-)
> >
> > diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> > index 3a898520f05c..800522591dde 100644
> > --- a/tools/perf/util/synthetic-events.c
> > +++ b/tools/perf/util/synthetic-events.c
> > @@ -69,7 +69,7 @@ int perf_tool__process_synth_event(struct perf_tool *tool,
> >   * Assumes that the first 4095 bytes of /proc/pid/stat contains
> >   * the comm, tgid and ppid.
> >   */
> > -static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
> > +static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len,
> >                                   pid_t *tgid, pid_t *ppid)
> >  {
> >       char bf[4096];
> > @@ -81,7 +81,10 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
> >       *tgid = -1;
> >       *ppid = -1;
> >
> > -     snprintf(bf, sizeof(bf), "/proc/%d/status", pid);
> > +     if (pid)
> > +             snprintf(bf, sizeof(bf), "/proc/%d/task/%d/status", pid, tid);
> > +     else
> > +             snprintf(bf, sizeof(bf), "/proc/%d/status", tid);
> >
> >       fd = open(bf, O_RDONLY);
> >       if (fd < 0) {
> > @@ -93,7 +96,7 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
> >       close(fd);
> >       if (n <= 0) {
> >               pr_warning("Couldn't get COMM, tigd and ppid for pid %d\n",
> > -                        pid);
> > +                        tid);
> >               return -1;
> >       }
> >       bf[n] = '\0';
> > @@ -116,27 +119,32 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
> >               memcpy(comm, name, size);
> >               comm[size] = '\0';
> >       } else {
> > -             pr_debug("Name: string not found for pid %d\n", pid);
> > +             pr_debug("Name: string not found for pid %d\n", tid);
> >       }
> >
> >       if (tgids) {
> >               tgids += 5;  /* strlen("Tgid:") */
> >               *tgid = atoi(tgids);
> > +
> > +             if (pid && pid != *tgid) {
> > +                     pr_debug("Tgid: not match to given pid: %d vs %d\n",
> > +                              pid, *tgid);
>
> hm, could this actually happen in our case?

Probably not.  I'll remove it in the next version if you want.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] perf tools: Skip MMAP record synthesis for kernel threads
  2021-01-31 22:48   ` Jiri Olsa
@ 2021-02-01  4:54     ` Namhyung Kim
  2021-02-01 20:32       ` Jiri Olsa
  0 siblings, 1 reply; 14+ messages in thread
From: Namhyung Kim @ 2021-02-01  4:54 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers

On Mon, Feb 1, 2021 at 7:48 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Jan 29, 2021 at 02:49:00PM +0900, Namhyung Kim wrote:
> > To synthesize information to resolve sample IPs, it needs to scan task
> > and mmap info from the /proc filesystem.  For each process, it
> > opens (and reads) status and maps file respectively.  But as kernel
> > threads don't have memory maps so we can skip the maps file.
> >
> > To find kernel threads, check "VmPeak:" line in /proc/<PID>/status
> > file.  It's about the peak virtual memory usage so only user-level
> > tasks have that.  Also check "Threads:" line (which follows the VmPeak
> > line whether or not it exists) to be sure it's read enough data - just
> > in case of deeply nested pid namespaces or large number of
> > supplementary groups are involved.
>
> I don't understand this Threads: check, could you please
> show example where it's useful for the check?

Sure!

I think there's a chance that the status file actually has the VmPeak line,
but it didn't read to the line for some reason.  In that case, we should
not assume the task is a kernel thread.

So it needs to make sure whether there's no such line in the file
or it just didn't read enough data.  To check the latter case, it
searches the "Threads" line which follows the VmPeak always.

Did I make it clear?

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] perf tools: Skip MMAP record synthesis for kernel threads
  2021-02-01  4:54     ` Namhyung Kim
@ 2021-02-01 20:32       ` Jiri Olsa
  0 siblings, 0 replies; 14+ messages in thread
From: Jiri Olsa @ 2021-02-01 20:32 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers

On Mon, Feb 01, 2021 at 01:54:32PM +0900, Namhyung Kim wrote:
> On Mon, Feb 1, 2021 at 7:48 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Fri, Jan 29, 2021 at 02:49:00PM +0900, Namhyung Kim wrote:
> > > To synthesize information to resolve sample IPs, it needs to scan task
> > > and mmap info from the /proc filesystem.  For each process, it
> > > opens (and reads) status and maps file respectively.  But as kernel
> > > threads don't have memory maps so we can skip the maps file.
> > >
> > > To find kernel threads, check "VmPeak:" line in /proc/<PID>/status
> > > file.  It's about the peak virtual memory usage so only user-level
> > > tasks have that.  Also check "Threads:" line (which follows the VmPeak
> > > line whether or not it exists) to be sure it's read enough data - just
> > > in case of deeply nested pid namespaces or large number of
> > > supplementary groups are involved.
> >
> > I don't understand this Threads: check, could you please
> > show example where it's useful for the check?
> 
> Sure!
> 
> I think there's a chance that the status file actually has the VmPeak line,
> but it didn't read to the line for some reason.  In that case, we should
> not assume the task is a kernel thread.
> 
> So it needs to make sure whether there's no such line in the file
> or it just didn't read enough data.  To check the latter case, it
> searches the "Threads" line which follows the VmPeak always.

ah ok, there's limited buffer for the status file

we could call filename__read_str to read the whole file
and skip these checks.. but perhaps that'd be another
slowdown you trying to prevent

if you put that comment to the code, I'm ok with that

thanks,
jirka


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis
  2021-02-01  4:41     ` Namhyung Kim
@ 2021-02-01 20:41       ` Jiri Olsa
  0 siblings, 0 replies; 14+ messages in thread
From: Jiri Olsa @ 2021-02-01 20:41 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers

On Mon, Feb 01, 2021 at 01:41:13PM +0900, Namhyung Kim wrote:
> On Mon, Feb 1, 2021 at 8:00 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Fri, Jan 29, 2021 at 02:48:59PM +0900, Namhyung Kim wrote:
> > > To save memory usage, it needs to reduce number of entries in the proc
> > > filesystem.  It's using /proc/<PID>/task directory to traverse threads
> > > in the process and then kernel creates /proc/<PID>/task/<TID> entries.
> > >
> > > After that it checks the thread info using the /proc/<TID>/status file
> > > rather than /proc/<PID>/task/<TID>/status.  As far as I can see, they
> > > are the same and contain all the info we need.
> > >
> > > Using the latter eliminates the unnecessary /proc/<TID> entry.  This
> > > can be useful especially a large number of threads are used in the
> > > system.  In my experiment around 1KB of memory on average was saved
> > > for each thread (which is not a thread group leader).
> > >
> > > To do this, pass both pid and tid to perf_event_prepare_comm() if it
> > > knows them.  In case it doesn't know, passing 0 as pid will do the old
> > > way.
> > >
> > > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > > ---
> > >  tools/perf/util/synthetic-events.c | 30 +++++++++++++++++++-----------
> > >  1 file changed, 19 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> > > index 3a898520f05c..800522591dde 100644
> > > --- a/tools/perf/util/synthetic-events.c
> > > +++ b/tools/perf/util/synthetic-events.c
> > > @@ -69,7 +69,7 @@ int perf_tool__process_synth_event(struct perf_tool *tool,
> > >   * Assumes that the first 4095 bytes of /proc/pid/stat contains
> > >   * the comm, tgid and ppid.
> > >   */
> > > -static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
> > > +static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len,
> > >                                   pid_t *tgid, pid_t *ppid)
> > >  {
> > >       char bf[4096];
> > > @@ -81,7 +81,10 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
> > >       *tgid = -1;
> > >       *ppid = -1;
> > >
> > > -     snprintf(bf, sizeof(bf), "/proc/%d/status", pid);
> > > +     if (pid)
> > > +             snprintf(bf, sizeof(bf), "/proc/%d/task/%d/status", pid, tid);
> > > +     else
> > > +             snprintf(bf, sizeof(bf), "/proc/%d/status", tid);
> > >
> > >       fd = open(bf, O_RDONLY);
> > >       if (fd < 0) {
> > > @@ -93,7 +96,7 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
> > >       close(fd);
> > >       if (n <= 0) {
> > >               pr_warning("Couldn't get COMM, tigd and ppid for pid %d\n",
> > > -                        pid);
> > > +                        tid);
> > >               return -1;
> > >       }
> > >       bf[n] = '\0';
> > > @@ -116,27 +119,32 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
> > >               memcpy(comm, name, size);
> > >               comm[size] = '\0';
> > >       } else {
> > > -             pr_debug("Name: string not found for pid %d\n", pid);
> > > +             pr_debug("Name: string not found for pid %d\n", tid);
> > >       }
> > >
> > >       if (tgids) {
> > >               tgids += 5;  /* strlen("Tgid:") */
> > >               *tgid = atoi(tgids);
> > > +
> > > +             if (pid && pid != *tgid) {
> > > +                     pr_debug("Tgid: not match to given pid: %d vs %d\n",
> > > +                              pid, *tgid);
> >
> > hm, could this actually happen in our case?
> 
> Probably not.  I'll remove it in the next version if you want.

IMO if there's no reason let's remove it

jirka


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis
  2021-02-02  9:01 [PATCHSET v3 " Namhyung Kim
@ 2021-02-02  9:01 ` Namhyung Kim
  0 siblings, 0 replies; 14+ messages in thread
From: Namhyung Kim @ 2021-02-02  9:01 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, Mark Rutland, Alexander Shishkin,
	LKML, Stephane Eranian, Andi Kleen, Ian Rogers

To save memory usage, it needs to reduce number of entries in the proc
filesystem.  It's using /proc/<PID>/task directory to traverse threads
in the process and then kernel creates /proc/<PID>/task/<TID> entries.

After that it checks the thread info using the /proc/<TID>/status file
rather than /proc/<PID>/task/<TID>/status.  As far as I can see, they
are the same and contain all the info we need.

Using the latter eliminates the unnecessary /proc/<TID> entry.  This
can be useful especially a large number of threads are used in the
system.  In my experiment around 1KB of memory on average was saved
for each thread (which is not a thread group leader).

To do this, pass both pid and tid to perf_event_prepare_comm() if it
knows them.  In case it doesn't know, passing 0 as pid will do the old
way.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/synthetic-events.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 3a898520f05c..0cc998663b03 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -69,7 +69,7 @@ int perf_tool__process_synth_event(struct perf_tool *tool,
  * Assumes that the first 4095 bytes of /proc/pid/stat contains
  * the comm, tgid and ppid.
  */
-static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
+static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len,
 				    pid_t *tgid, pid_t *ppid)
 {
 	char bf[4096];
@@ -81,7 +81,10 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
 	*tgid = -1;
 	*ppid = -1;
 
-	snprintf(bf, sizeof(bf), "/proc/%d/status", pid);
+	if (pid)
+		snprintf(bf, sizeof(bf), "/proc/%d/task/%d/status", pid, tid);
+	else
+		snprintf(bf, sizeof(bf), "/proc/%d/status", tid);
 
 	fd = open(bf, O_RDONLY);
 	if (fd < 0) {
@@ -93,7 +96,7 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
 	close(fd);
 	if (n <= 0) {
 		pr_warning("Couldn't get COMM, tigd and ppid for pid %d\n",
-			   pid);
+			   tid);
 		return -1;
 	}
 	bf[n] = '\0';
@@ -116,27 +119,27 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
 		memcpy(comm, name, size);
 		comm[size] = '\0';
 	} else {
-		pr_debug("Name: string not found for pid %d\n", pid);
+		pr_debug("Name: string not found for pid %d\n", tid);
 	}
 
 	if (tgids) {
 		tgids += 5;  /* strlen("Tgid:") */
 		*tgid = atoi(tgids);
 	} else {
-		pr_debug("Tgid: string not found for pid %d\n", pid);
+		pr_debug("Tgid: string not found for pid %d\n", tid);
 	}
 
 	if (ppids) {
 		ppids += 5;  /* strlen("PPid:") */
 		*ppid = atoi(ppids);
 	} else {
-		pr_debug("PPid: string not found for pid %d\n", pid);
+		pr_debug("PPid: string not found for pid %d\n", tid);
 	}
 
 	return 0;
 }
 
-static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
+static int perf_event__prepare_comm(union perf_event *event, pid_t pid, pid_t tid,
 				    struct machine *machine,
 				    pid_t *tgid, pid_t *ppid)
 {
@@ -147,7 +150,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
 	memset(&event->comm, 0, sizeof(event->comm));
 
 	if (machine__is_host(machine)) {
-		if (perf_event__get_comm_ids(pid, event->comm.comm,
+		if (perf_event__get_comm_ids(pid, tid, event->comm.comm,
 					     sizeof(event->comm.comm),
 					     tgid, ppid) != 0) {
 			return -1;
@@ -168,7 +171,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
 	event->comm.header.size = (sizeof(event->comm) -
 				(sizeof(event->comm.comm) - size) +
 				machine->id_hdr_size);
-	event->comm.tid = pid;
+	event->comm.tid = tid;
 
 	return 0;
 }
@@ -180,7 +183,7 @@ pid_t perf_event__synthesize_comm(struct perf_tool *tool,
 {
 	pid_t tgid, ppid;
 
-	if (perf_event__prepare_comm(event, pid, machine, &tgid, &ppid) != 0)
+	if (perf_event__prepare_comm(event, 0, pid, machine, &tgid, &ppid) != 0)
 		return -1;
 
 	if (perf_tool__process_synth_event(tool, event, machine, process) != 0)
@@ -746,7 +749,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 			continue;
 
 		rc = -1;
-		if (perf_event__prepare_comm(comm_event, _pid, machine,
+		if (perf_event__prepare_comm(comm_event, pid, _pid, machine,
 					     &tgid, &ppid) != 0)
 			break;
 
-- 
2.30.0.365.g02bc693789-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis
  2020-12-21  7:00 [PATCH 0/3] perf tools: Minor improvements in event synthesis Namhyung Kim
@ 2020-12-21  7:00 ` Namhyung Kim
  0 siblings, 0 replies; 14+ messages in thread
From: Namhyung Kim @ 2020-12-21  7:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, Mark Rutland, Alexander Shishkin,
	LKML, Stephane Eranian, Andi Kleen, Ian Rogers

To save memory usage, it needs to reduce number of entries in the proc
filesystem.  It's using /proc/<PID>/task directory to traverse threads
in the process and then kernel creates /proc/<PID>/task/<TID> entries.

After that it checks the thread info using the /proc/<TID>/status file
rather than /proc/<PID>/task/<TID>/status.  As far as I can see, they
are the same and contain all the info we need.

Using the latter eliminates the unnecessary /proc/<TID> entry.  This
can be useful especially a large number of threads are used in the
system.  In my experiment around 1KB of memory on average was saved
for each thread (which is not a thread group leader).

To do this, pass both pid and tid to perf_event_prepare_comm() if it
knows them.  In case it doesn't know, passing 0 as pid will do the old
way.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/synthetic-events.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 2947e3f3c6d9..515d145a4303 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -69,7 +69,7 @@ int perf_tool__process_synth_event(struct perf_tool *tool,
  * Assumes that the first 4095 bytes of /proc/pid/stat contains
  * the comm, tgid and ppid.
  */
-static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
+static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len,
 				    pid_t *tgid, pid_t *ppid)
 {
 	char bf[4096];
@@ -81,7 +81,10 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
 	*tgid = -1;
 	*ppid = -1;
 
-	snprintf(bf, sizeof(bf), "/proc/%d/status", pid);
+	if (pid)
+		snprintf(bf, sizeof(bf), "/proc/%d/task/%d/status", pid, tid);
+	else
+		snprintf(bf, sizeof(bf), "/proc/%d/status", tid);
 
 	fd = open(bf, O_RDONLY);
 	if (fd < 0) {
@@ -93,7 +96,7 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
 	close(fd);
 	if (n <= 0) {
 		pr_warning("Couldn't get COMM, tigd and ppid for pid %d\n",
-			   pid);
+			   tid);
 		return -1;
 	}
 	bf[n] = '\0';
@@ -116,27 +119,32 @@ static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
 		memcpy(comm, name, size);
 		comm[size] = '\0';
 	} else {
-		pr_debug("Name: string not found for pid %d\n", pid);
+		pr_debug("Name: string not found for pid %d\n", tid);
 	}
 
 	if (tgids) {
 		tgids += 5;  /* strlen("Tgid:") */
 		*tgid = atoi(tgids);
+
+		if (pid && pid != *tgid) {
+			pr_debug("Tgid: not match to given pid: %d vs %d\n",
+				 pid, *tgid);
+		}
 	} else {
-		pr_debug("Tgid: string not found for pid %d\n", pid);
+		pr_debug("Tgid: string not found for pid %d\n", tid);
 	}
 
 	if (ppids) {
 		ppids += 5;  /* strlen("PPid:") */
 		*ppid = atoi(ppids);
 	} else {
-		pr_debug("PPid: string not found for pid %d\n", pid);
+		pr_debug("PPid: string not found for pid %d\n", tid);
 	}
 
 	return 0;
 }
 
-static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
+static int perf_event__prepare_comm(union perf_event *event, pid_t pid, pid_t tid,
 				    struct machine *machine,
 				    pid_t *tgid, pid_t *ppid)
 {
@@ -147,7 +155,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
 	memset(&event->comm, 0, sizeof(event->comm));
 
 	if (machine__is_host(machine)) {
-		if (perf_event__get_comm_ids(pid, event->comm.comm,
+		if (perf_event__get_comm_ids(pid, tid, event->comm.comm,
 					     sizeof(event->comm.comm),
 					     tgid, ppid) != 0) {
 			return -1;
@@ -168,7 +176,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
 	event->comm.header.size = (sizeof(event->comm) -
 				(sizeof(event->comm.comm) - size) +
 				machine->id_hdr_size);
-	event->comm.tid = pid;
+	event->comm.tid = tid;
 
 	return 0;
 }
@@ -180,7 +188,7 @@ pid_t perf_event__synthesize_comm(struct perf_tool *tool,
 {
 	pid_t tgid, ppid;
 
-	if (perf_event__prepare_comm(event, pid, machine, &tgid, &ppid) != 0)
+	if (perf_event__prepare_comm(event, 0, pid, machine, &tgid, &ppid) != 0)
 		return -1;
 
 	if (perf_tool__process_synth_event(tool, event, machine, process) != 0)
@@ -701,7 +709,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 			continue;
 
 		rc = -1;
-		if (perf_event__prepare_comm(comm_event, _pid, machine,
+		if (perf_event__prepare_comm(comm_event, pid, _pid, machine,
 					     &tgid, &ppid) != 0)
 			break;
 
-- 
2.29.2.684.gfbc64c5ab5-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-02-02  9:03 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-29  5:48 [PATCH v2 0/3] perf tools: Minor improvements in event synthesis Namhyung Kim
2021-01-29  5:48 ` [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis Namhyung Kim
2021-01-31 23:00   ` Jiri Olsa
2021-02-01  4:41     ` Namhyung Kim
2021-02-01 20:41       ` Jiri Olsa
2021-01-29  5:49 ` [PATCH 2/3] perf tools: Skip MMAP record synthesis for kernel threads Namhyung Kim
2021-01-31 22:48   ` Jiri Olsa
2021-02-01  4:54     ` Namhyung Kim
2021-02-01 20:32       ` Jiri Olsa
2021-01-29  5:49 ` [PATCH 3/3] perf tools: Use scandir() to iterate threads Namhyung Kim
2021-01-31 22:35 ` [PATCH v2 0/3] perf tools: Minor improvements in event synthesis Jiri Olsa
2021-02-01  4:35   ` Namhyung Kim
  -- strict thread matches above, loose matches on Subject: below --
2021-02-02  9:01 [PATCHSET v3 " Namhyung Kim
2021-02-02  9:01 ` [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis Namhyung Kim
2020-12-21  7:00 [PATCH 0/3] perf tools: Minor improvements in event synthesis Namhyung Kim
2020-12-21  7:00 ` [PATCH 1/3] perf tools: Use /proc/<PID>/task/<TID>/status for synthesis Namhyung Kim

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.