linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/2] perf: Fix perf in non-root PID namespace
@ 2021-12-12 13:47 Leo Yan
  2021-12-12 13:47 ` [PATCH v1 1/2] perf namespaces: Add helper nsinfo__is_in_root_namespace() Leo Yan
  2021-12-12 13:47 ` [PATCH v1 2/2] perf evlist: Don't run perf in non-root PID namespace when launch workload Leo Yan
  0 siblings, 2 replies; 6+ messages in thread
From: Leo Yan @ 2021-12-12 13:47 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Jin Yao, John Garry, Yonatan Goldschmidt,
	linux-perf-users, linux-kernel, netdev, bpf
  Cc: Leo Yan

When perf tool runs in non-root PID namespace, it fails to gather the
correct process and namespace info for the profiled (forked) program
since it wrongly uses the non-root PID number to access '/proc' nodes.

To fix this issue, this patchset adds the checking if the perf tool runs
in the non-root namespace, if it is the case, perf tool reports error to
notify users to run perf tool in root PID namespace.  This can ensure
perf tool gathers correct process info for profiled program.

After applied this patchset:

  # unshare --fork --pid perf record -e cs_etm//u -a -- uname
  Perf runs in non-root PID namespace; please run perf tool in the root PID namespace for gathering process info.
  Couldn't run the workload!

  # perf record -e cs_etm//u -a -- unshare --fork --pid uname
  Couldn't synthesize bpf events.
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.438 MB perf_test.data ]


Leo Yan (2):
  perf namespaces: Add helper nsinfo__is_in_root_namespace()
  perf evlist: Don't run perf in non-root PID namespace when launch
    workload

 tools/perf/util/evlist.c     |  7 ++++
 tools/perf/util/namespaces.c | 76 ++++++++++++++++++++++--------------
 tools/perf/util/namespaces.h |  2 +
 3 files changed, 55 insertions(+), 30 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v1 1/2] perf namespaces: Add helper nsinfo__is_in_root_namespace()
  2021-12-12 13:47 [PATCH v1 0/2] perf: Fix perf in non-root PID namespace Leo Yan
@ 2021-12-12 13:47 ` Leo Yan
  2021-12-13 18:20   ` Arnaldo Carvalho de Melo
  2021-12-12 13:47 ` [PATCH v1 2/2] perf evlist: Don't run perf in non-root PID namespace when launch workload Leo Yan
  1 sibling, 1 reply; 6+ messages in thread
From: Leo Yan @ 2021-12-12 13:47 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Jin Yao, John Garry, Yonatan Goldschmidt,
	linux-perf-users, linux-kernel, netdev, bpf
  Cc: Leo Yan

Refactors code for gathering PID infos, it creates the function
nsinfo__get_nspid() to parse process 'status' node in folder '/proc'.

Base on the refactoring, this patch introduces a new helper
nsinfo__is_in_root_namespace(), it returns true when the caller runs in
the root PID namespace.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/namespaces.c | 76 ++++++++++++++++++++++--------------
 tools/perf/util/namespaces.h |  2 +
 2 files changed, 48 insertions(+), 30 deletions(-)

diff --git a/tools/perf/util/namespaces.c b/tools/perf/util/namespaces.c
index 608b20c72a5c..48aa3217300b 100644
--- a/tools/perf/util/namespaces.c
+++ b/tools/perf/util/namespaces.c
@@ -60,17 +60,49 @@ void namespaces__free(struct namespaces *namespaces)
 	free(namespaces);
 }
 
+static int nsinfo__get_nspid(struct nsinfo *nsi, const char *path)
+{
+	FILE *f = NULL;
+	char *statln = NULL;
+	size_t linesz = 0;
+	char *nspid;
+
+	f = fopen(path, "r");
+	if (f == NULL)
+		return -1;
+
+	while (getline(&statln, &linesz, f) != -1) {
+		/* Use tgid if CONFIG_PID_NS is not defined. */
+		if (strstr(statln, "Tgid:") != NULL) {
+			nsi->tgid = (pid_t)strtol(strrchr(statln, '\t'),
+						     NULL, 10);
+			nsi->nstgid = nsi->tgid;
+		}
+
+		if (strstr(statln, "NStgid:") != NULL) {
+			nspid = strrchr(statln, '\t');
+			nsi->nstgid = (pid_t)strtol(nspid, NULL, 10);
+			/*
+			 * If innermost tgid is not the first, process is in a different
+			 * PID namespace.
+			 */
+			nsi->in_pidns = (statln + sizeof("NStgid:") - 1) != nspid;
+			break;
+		}
+	}
+
+	fclose(f);
+	free(statln);
+	return 0;
+}
+
 int nsinfo__init(struct nsinfo *nsi)
 {
 	char oldns[PATH_MAX];
 	char spath[PATH_MAX];
 	char *newns = NULL;
-	char *statln = NULL;
-	char *nspid;
 	struct stat old_stat;
 	struct stat new_stat;
-	FILE *f = NULL;
-	size_t linesz = 0;
 	int rv = -1;
 
 	if (snprintf(oldns, PATH_MAX, "/proc/self/ns/mnt") >= PATH_MAX)
@@ -100,34 +132,9 @@ int nsinfo__init(struct nsinfo *nsi)
 	if (snprintf(spath, PATH_MAX, "/proc/%d/status", nsi->pid) >= PATH_MAX)
 		goto out;
 
-	f = fopen(spath, "r");
-	if (f == NULL)
-		goto out;
-
-	while (getline(&statln, &linesz, f) != -1) {
-		/* Use tgid if CONFIG_PID_NS is not defined. */
-		if (strstr(statln, "Tgid:") != NULL) {
-			nsi->tgid = (pid_t)strtol(strrchr(statln, '\t'),
-						     NULL, 10);
-			nsi->nstgid = nsi->tgid;
-		}
-
-		if (strstr(statln, "NStgid:") != NULL) {
-			nspid = strrchr(statln, '\t');
-			nsi->nstgid = (pid_t)strtol(nspid, NULL, 10);
-			/* If innermost tgid is not the first, process is in a different
-			 * PID namespace.
-			 */
-			nsi->in_pidns = (statln + sizeof("NStgid:") - 1) != nspid;
-			break;
-		}
-	}
-	rv = 0;
+	rv = nsinfo__get_nspid(nsi, spath);
 
 out:
-	if (f != NULL)
-		(void) fclose(f);
-	free(statln);
 	free(newns);
 	return rv;
 }
@@ -299,3 +306,12 @@ int nsinfo__stat(const char *filename, struct stat *st, struct nsinfo *nsi)
 
 	return ret;
 }
+
+bool nsinfo__is_in_root_namespace(void)
+{
+	struct nsinfo nsi;
+
+	memset(&nsi, 0x0, sizeof(nsi));
+	nsinfo__get_nspid(&nsi, "/proc/self/status");
+	return !nsi.in_pidns;
+}
diff --git a/tools/perf/util/namespaces.h b/tools/perf/util/namespaces.h
index ad9775db7b9c..9ceea9643507 100644
--- a/tools/perf/util/namespaces.h
+++ b/tools/perf/util/namespaces.h
@@ -59,6 +59,8 @@ void nsinfo__mountns_exit(struct nscookie *nc);
 char *nsinfo__realpath(const char *path, struct nsinfo *nsi);
 int nsinfo__stat(const char *filename, struct stat *st, struct nsinfo *nsi);
 
+bool nsinfo__is_in_root_namespace(void);
+
 static inline void __nsinfo__zput(struct nsinfo **nsip)
 {
 	if (nsip) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v1 2/2] perf evlist: Don't run perf in non-root PID namespace when launch workload
  2021-12-12 13:47 [PATCH v1 0/2] perf: Fix perf in non-root PID namespace Leo Yan
  2021-12-12 13:47 ` [PATCH v1 1/2] perf namespaces: Add helper nsinfo__is_in_root_namespace() Leo Yan
@ 2021-12-12 13:47 ` Leo Yan
  2021-12-13 13:54   ` James Clark
  1 sibling, 1 reply; 6+ messages in thread
From: Leo Yan @ 2021-12-12 13:47 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Jin Yao, John Garry, Yonatan Goldschmidt,
	linux-perf-users, linux-kernel, netdev, bpf
  Cc: Leo Yan

In function evlist__prepare_workload(), after perf forks a child process
and launches a workload in the created process, it needs to retrieve
process and namespace related info via '/proc/$PID/' node.

The process folders under 'proc' file system use the PID number from the
root PID namespace, when perf tool runs in non-root PID namespace and
creates new process for profiled program, this leads to the perf tool
wrongly gather process info since it uses PID from non-root namespace to
access nodes under '/proc'.

Let's see an example:

  unshare --fork --pid perf record -e cs_etm//u -a -- test_program

This command runs perf tool and the profiled program 'test_program' in
the non-root PID namespace.  When perf tool launches 'test_program',
e.g. the forked PID number is 2, perf tool retrieves process info for
'test_program' from the folder '/proc/2'.  But '/proc/2' is actually for
a kernel thread so perf tool wrongly gather info for 'test_program'.

To fix this issue, we don't allow perf tool runs in non-root PID
namespace when it launches workload and reports error in this
case.  This can notify users to run the perf tool in root PID namespace
to gather correct info for profiled program.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/evlist.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5f92319ce258..bdf79a97db66 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -11,6 +11,7 @@
 #include <poll.h>
 #include "cpumap.h"
 #include "util/mmap.h"
+#include "util/namespaces.h"
 #include "thread_map.h"
 #include "target.h"
 #include "evlist.h"
@@ -1364,6 +1365,12 @@ int evlist__prepare_workload(struct evlist *evlist, struct target *target, const
 	int child_ready_pipe[2], go_pipe[2];
 	char bf;
 
+	if (!nsinfo__is_in_root_namespace()) {
+		pr_err("Perf runs in non-root PID namespace; please run perf tool ");
+		pr_err("in the root PID namespace for gathering process info.\n");
+		return -EPERM;
+	}
+
 	if (pipe(child_ready_pipe) < 0) {
 		perror("failed to create 'ready' pipe");
 		return -1;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v1 2/2] perf evlist: Don't run perf in non-root PID namespace when launch workload
  2021-12-12 13:47 ` [PATCH v1 2/2] perf evlist: Don't run perf in non-root PID namespace when launch workload Leo Yan
@ 2021-12-13 13:54   ` James Clark
  2021-12-14  3:24     ` Leo Yan
  0 siblings, 1 reply; 6+ messages in thread
From: James Clark @ 2021-12-13 13:54 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Jin Yao, John Garry, Yonatan Goldschmidt,
	linux-perf-users, linux-kernel, netdev, bpf



On 12/12/2021 13:47, Leo Yan wrote:
> In function evlist__prepare_workload(), after perf forks a child process
> and launches a workload in the created process, it needs to retrieve
> process and namespace related info via '/proc/$PID/' node.
> 
> The process folders under 'proc' file system use the PID number from the
> root PID namespace, when perf tool runs in non-root PID namespace and
> creates new process for profiled program, this leads to the perf tool
> wrongly gather process info since it uses PID from non-root namespace to
> access nodes under '/proc'.
> 
> Let's see an example:
> 
>   unshare --fork --pid perf record -e cs_etm//u -a -- test_program
> 
> This command runs perf tool and the profiled program 'test_program' in
> the non-root PID namespace.  When perf tool launches 'test_program',
> e.g. the forked PID number is 2, perf tool retrieves process info for
> 'test_program' from the folder '/proc/2'.  But '/proc/2' is actually for
> a kernel thread so perf tool wrongly gather info for 'test_program'.

Hi Leo,

Which features aren't working exactly when you run in a non root namespace?

I did "perf record -- ls" and it seemed to be working for me. At least kernel
sampling would be working in a namespace, even if there was something wrong
with userspace.

I think causing a failure might be too restrictive and would prevent people
from using perf in a container. Maybe we could show a warning instead, but
I'm not sure exactly what's not working because I thought perf looked up stuff
based on the path of the process not the pid.

James

> 
> To fix this issue, we don't allow perf tool runs in non-root PID
> namespace when it launches workload and reports error in this
> case.  This can notify users to run the perf tool in root PID namespace
> to gather correct info for profiled program.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/evlist.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 5f92319ce258..bdf79a97db66 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -11,6 +11,7 @@
>  #include <poll.h>
>  #include "cpumap.h"
>  #include "util/mmap.h"
> +#include "util/namespaces.h"
>  #include "thread_map.h"
>  #include "target.h"
>  #include "evlist.h"
> @@ -1364,6 +1365,12 @@ int evlist__prepare_workload(struct evlist *evlist, struct target *target, const
>  	int child_ready_pipe[2], go_pipe[2];
>  	char bf;
>  
> +	if (!nsinfo__is_in_root_namespace()) {
> +		pr_err("Perf runs in non-root PID namespace; please run perf tool ");
> +		pr_err("in the root PID namespace for gathering process info.\n");
> +		return -EPERM;
> +	}
> +
>  	if (pipe(child_ready_pipe) < 0) {
>  		perror("failed to create 'ready' pipe");
>  		return -1;
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v1 1/2] perf namespaces: Add helper nsinfo__is_in_root_namespace()
  2021-12-12 13:47 ` [PATCH v1 1/2] perf namespaces: Add helper nsinfo__is_in_root_namespace() Leo Yan
@ 2021-12-13 18:20   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-12-13 18:20 UTC (permalink / raw)
  To: Leo Yan
  Cc: Jiri Olsa, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Jin Yao, John Garry,
	Yonatan Goldschmidt, linux-perf-users, linux-kernel, netdev, bpf

Em Sun, Dec 12, 2021 at 09:47:20PM +0800, Leo Yan escreveu:
> Refactors code for gathering PID infos, it creates the function
> nsinfo__get_nspid() to parse process 'status' node in folder '/proc'.
> 
> Base on the refactoring, this patch introduces a new helper
> nsinfo__is_in_root_namespace(), it returns true when the caller runs in
> the root PID namespace.

Thanks, applied.

- Arnaldo

 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/namespaces.c | 76 ++++++++++++++++++++++--------------
>  tools/perf/util/namespaces.h |  2 +
>  2 files changed, 48 insertions(+), 30 deletions(-)
> 
> diff --git a/tools/perf/util/namespaces.c b/tools/perf/util/namespaces.c
> index 608b20c72a5c..48aa3217300b 100644
> --- a/tools/perf/util/namespaces.c
> +++ b/tools/perf/util/namespaces.c
> @@ -60,17 +60,49 @@ void namespaces__free(struct namespaces *namespaces)
>  	free(namespaces);
>  }
>  
> +static int nsinfo__get_nspid(struct nsinfo *nsi, const char *path)
> +{
> +	FILE *f = NULL;
> +	char *statln = NULL;
> +	size_t linesz = 0;
> +	char *nspid;
> +
> +	f = fopen(path, "r");
> +	if (f == NULL)
> +		return -1;
> +
> +	while (getline(&statln, &linesz, f) != -1) {
> +		/* Use tgid if CONFIG_PID_NS is not defined. */
> +		if (strstr(statln, "Tgid:") != NULL) {
> +			nsi->tgid = (pid_t)strtol(strrchr(statln, '\t'),
> +						     NULL, 10);
> +			nsi->nstgid = nsi->tgid;
> +		}
> +
> +		if (strstr(statln, "NStgid:") != NULL) {
> +			nspid = strrchr(statln, '\t');
> +			nsi->nstgid = (pid_t)strtol(nspid, NULL, 10);
> +			/*
> +			 * If innermost tgid is not the first, process is in a different
> +			 * PID namespace.
> +			 */
> +			nsi->in_pidns = (statln + sizeof("NStgid:") - 1) != nspid;
> +			break;
> +		}
> +	}
> +
> +	fclose(f);
> +	free(statln);
> +	return 0;
> +}
> +
>  int nsinfo__init(struct nsinfo *nsi)
>  {
>  	char oldns[PATH_MAX];
>  	char spath[PATH_MAX];
>  	char *newns = NULL;
> -	char *statln = NULL;
> -	char *nspid;
>  	struct stat old_stat;
>  	struct stat new_stat;
> -	FILE *f = NULL;
> -	size_t linesz = 0;
>  	int rv = -1;
>  
>  	if (snprintf(oldns, PATH_MAX, "/proc/self/ns/mnt") >= PATH_MAX)
> @@ -100,34 +132,9 @@ int nsinfo__init(struct nsinfo *nsi)
>  	if (snprintf(spath, PATH_MAX, "/proc/%d/status", nsi->pid) >= PATH_MAX)
>  		goto out;
>  
> -	f = fopen(spath, "r");
> -	if (f == NULL)
> -		goto out;
> -
> -	while (getline(&statln, &linesz, f) != -1) {
> -		/* Use tgid if CONFIG_PID_NS is not defined. */
> -		if (strstr(statln, "Tgid:") != NULL) {
> -			nsi->tgid = (pid_t)strtol(strrchr(statln, '\t'),
> -						     NULL, 10);
> -			nsi->nstgid = nsi->tgid;
> -		}
> -
> -		if (strstr(statln, "NStgid:") != NULL) {
> -			nspid = strrchr(statln, '\t');
> -			nsi->nstgid = (pid_t)strtol(nspid, NULL, 10);
> -			/* If innermost tgid is not the first, process is in a different
> -			 * PID namespace.
> -			 */
> -			nsi->in_pidns = (statln + sizeof("NStgid:") - 1) != nspid;
> -			break;
> -		}
> -	}
> -	rv = 0;
> +	rv = nsinfo__get_nspid(nsi, spath);
>  
>  out:
> -	if (f != NULL)
> -		(void) fclose(f);
> -	free(statln);
>  	free(newns);
>  	return rv;
>  }
> @@ -299,3 +306,12 @@ int nsinfo__stat(const char *filename, struct stat *st, struct nsinfo *nsi)
>  
>  	return ret;
>  }
> +
> +bool nsinfo__is_in_root_namespace(void)
> +{
> +	struct nsinfo nsi;
> +
> +	memset(&nsi, 0x0, sizeof(nsi));
> +	nsinfo__get_nspid(&nsi, "/proc/self/status");
> +	return !nsi.in_pidns;
> +}
> diff --git a/tools/perf/util/namespaces.h b/tools/perf/util/namespaces.h
> index ad9775db7b9c..9ceea9643507 100644
> --- a/tools/perf/util/namespaces.h
> +++ b/tools/perf/util/namespaces.h
> @@ -59,6 +59,8 @@ void nsinfo__mountns_exit(struct nscookie *nc);
>  char *nsinfo__realpath(const char *path, struct nsinfo *nsi);
>  int nsinfo__stat(const char *filename, struct stat *st, struct nsinfo *nsi);
>  
> +bool nsinfo__is_in_root_namespace(void);
> +
>  static inline void __nsinfo__zput(struct nsinfo **nsip)
>  {
>  	if (nsip) {
> -- 
> 2.25.1

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v1 2/2] perf evlist: Don't run perf in non-root PID namespace when launch workload
  2021-12-13 13:54   ` James Clark
@ 2021-12-14  3:24     ` Leo Yan
  0 siblings, 0 replies; 6+ messages in thread
From: Leo Yan @ 2021-12-14  3:24 UTC (permalink / raw)
  To: James Clark
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Jin Yao, John Garry, Yonatan Goldschmidt,
	linux-perf-users, linux-kernel, netdev, bpf

Hi James,

On Mon, Dec 13, 2021 at 01:54:33PM +0000, James Clark wrote:
> 
> 
> On 12/12/2021 13:47, Leo Yan wrote:
> > In function evlist__prepare_workload(), after perf forks a child process
> > and launches a workload in the created process, it needs to retrieve
> > process and namespace related info via '/proc/$PID/' node.
> > 
> > The process folders under 'proc' file system use the PID number from the
> > root PID namespace, when perf tool runs in non-root PID namespace and
> > creates new process for profiled program, this leads to the perf tool
> > wrongly gather process info since it uses PID from non-root namespace to
> > access nodes under '/proc'.
> > 
> > Let's see an example:
> > 
> >   unshare --fork --pid perf record -e cs_etm//u -a -- test_program
> > 
> > This command runs perf tool and the profiled program 'test_program' in
> > the non-root PID namespace.  When perf tool launches 'test_program',
> > e.g. the forked PID number is 2, perf tool retrieves process info for
> > 'test_program' from the folder '/proc/2'.  But '/proc/2' is actually for
> > a kernel thread so perf tool wrongly gather info for 'test_program'.
> 
> Hi Leo,
> 
> Which features aren't working exactly when you run in a non root namespace?

When perf tool lanches workload, it needs to synthesize samples for
PERF_RECORD_COMM and PERF_RECORD_NAMESPACES, this can make sure the
thread info has been prepared ahead before we decode hardware trace
data (e.g. using Arm CoreSight, SPE, or Intel PT, etc).

Also please see the comment in perf record tool [1]:

"Some H/W events are generated before COMM event
* which is emitted during exec(), so perf script
* cannot see a correct process name for those events.
* Synthesize COMM event to prevent it."

Unfortunately, when using the command "unshare --fork --pid perf
record -e cs_etm//u --namespaces -a -- test_program", it uses
the PID from non-root namespace to synthesize RECORD_COMM and
RECORD_NAMESPACES events, but the PID number doesn't match with the
process folders under /proc folder (which uses the root namespace's PID
to create file node).  As result, perf tool uses pid 2 (from non-root
namespace to capture a kernel thread info rather than the info for
created workload:

0x1ea90 [0x40]: event: 3
.
. ... raw event: size 64 bytes
.  0000:  03 00 00 00 00 00 40 00 02 00 00 00 02 00 00 00  ......@.........
.  0010:  6b 74 68 72 65 61 64 64 00 00 00 00 00 00 00 00  kthreadd........
.  0020:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  0030:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

0 0 0x1ea90 [0x40]: PERF_RECORD_COMM: kthreadd:2/2

0x1ead0 [0xa8]: event: 16
.
. ... raw event: size 168 bytes
.  0000:  10 00 00 00 00 00 a8 00 02 00 00 00 02 00 00 00  ......?.........
.  0010:  07 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00  ................
.  0020:  91 00 00 f0 00 00 00 00 04 00 00 00 00 00 00 00  ...?............
.  0030:  fe ff ff ef 00 00 00 00 04 00 00 00 00 00 00 00  ????............
.  0040:  ff ff ff ef 00 00 00 00 04 00 00 00 00 00 00 00  ????............
.  0050:  fc ff ff ef 00 00 00 00 04 00 00 00 00 00 00 00  ????............
.  0060:  fd ff ff ef 00 00 00 00 04 00 00 00 00 00 00 00  ????............
.  0070:  00 00 00 f0 00 00 00 00 04 00 00 00 00 00 00 00  ...?............
.  0080:  fb ff ff ef 00 00 00 00 00 00 00 00 00 00 00 00  ????............
.  0090:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
.  00a0:  00 00 00 00 00 00 00 00                          ........        

0 0 0x1ead0 [0xa8]: PERF_RECORD_NAMESPACES 2/2 - nr_namespaces: 7
                [0/net: 4/0xf0000091, 1/uts: 4/0xeffffffe, 2/ipc: 4/0xefffffff, 3/pid: 4/0xeffffffc, 
                 4/user: 4/0xeffffffd, 5/mnt: 4/0xf0000000, 6/cgroup: 4/0xeffffffb]

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/builtin-record.c#n1823

> I did "perf record -- ls" and it seemed to be working for me. At least kernel
> sampling would be working in a namespace, even if there was something wrong
> with userspace.

The issue is relevant with the hardware trace events.

> I think causing a failure might be too restrictive and would prevent people
> from using perf in a container. Maybe we could show a warning instead, but
> I'm not sure exactly what's not working because I thought perf looked up stuff
> based on the path of the process not the pid.

Good point.  I am also worry that it is arbitrary to prevent perf to be
used in the namespace, so this patch it doesn't forbid all cases for
perf tool.  It only returns failure when perf tool tries to fork a
new program.

When perf tool runs in non-root PID namespace, since it still can access
the process info from root file system's /proc node, this causes mess that
the perf tool gathers process info from the root PID namespace.

One thing I think I should dig deeper: can we dynamically update (or mount)
/proc node when perf tool runs in non-root PID namespace so can ensure
perf tool to only see the process nodes in the same non-root namespace?
This can be a solution to avoid the perf tool gathers mess process info.
If anyone know for this, also welcome suggestion, thanks a lot!

Otherwise if we cannot find method to update '/proc' nodes, I think we
still need this patch to return failure when detects perf running in
non-root PID namespace.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-12-14  3:24 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-12 13:47 [PATCH v1 0/2] perf: Fix perf in non-root PID namespace Leo Yan
2021-12-12 13:47 ` [PATCH v1 1/2] perf namespaces: Add helper nsinfo__is_in_root_namespace() Leo Yan
2021-12-13 18:20   ` Arnaldo Carvalho de Melo
2021-12-12 13:47 ` [PATCH v1 2/2] perf evlist: Don't run perf in non-root PID namespace when launch workload Leo Yan
2021-12-13 13:54   ` James Clark
2021-12-14  3:24     ` Leo Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).