All of lore.kernel.org
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>, Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Clark Williams <williams@redhat.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Stephane Eranian <eranian@google.com>,
	Ian Rogers <irogers@google.com>, Jiri Olsa <jolsa@redhat.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Andrey Zhizhikin <andrey.z@gmail.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Petr Mladek <pmladek@suse.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 20/26] tools api fs: Make xxx__mountpoint() more scalable
Date: Mon, 13 Apr 2020 13:51:57 -0300	[thread overview]
Message-ID: <20200413165203.1816-21-acme@kernel.org> (raw)
In-Reply-To: <20200413165203.1816-1-acme@kernel.org>

From: Stephane Eranian <eranian@google.com>

The xxx_mountpoint() interface provided by fs.c finds mount points for
common pseudo filesystems. The first time xxx_mountpoint() is invoked,
it scans the mount table (/proc/mounts) looking for a match. If found,
it is cached. The price to scan /proc/mounts is paid once if the mount
is found.

When the mount point is not found, subsequent calls to xxx_mountpoint()
scan /proc/mounts over and over again.  There is no caching.

This causes a scaling issue in perf record with hugeltbfs__mountpoint().
The function is called for each process found in
synthesize__mmap_events().  If the machine has thousands of processes
and if the /proc/mounts has many entries this could cause major overhead
in perf record. We have observed multi-second slowdowns on some
configurations.

As an example on a laptop:

Before:

  $ sudo umount /dev/hugepages
  $ strace -e trace=openat -o /tmp/tt perf record -a ls
  $ fgrep mounts /tmp/tt
  285

After:

  $ sudo umount /dev/hugepages
  $ strace -e trace=openat -o /tmp/tt perf record -a ls
  $ fgrep mounts /tmp/tt
  1

One could argue that the non-caching in case the moint point is not
found is intentional. That way subsequent calls may discover a moint
point if the sysadmin mounts the filesystem. But the same argument could
be made against caching the mount point. It could be unmounted causing
errors.  It all depends on the intent of the interface. This patch
assumes it is expected to scan /proc/mounts once. The patch documents
the caching behavior in the fs.h header file.

An alternative would be to just fix perf record. But it would solve the
problem with hugetlbs__mountpoint() but there could be similar issues
(possibly down the line) with other xxx_mountpoint() calls in perf or
other tools.

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrey Zhizhikin <andrey.z@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200402154357.107873-3-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/api/fs/fs.c | 17 +++++++++++++++++
 tools/lib/api/fs/fs.h | 12 ++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/tools/lib/api/fs/fs.c b/tools/lib/api/fs/fs.c
index 027b18f7ed8c..82f53d81a7a7 100644
--- a/tools/lib/api/fs/fs.c
+++ b/tools/lib/api/fs/fs.c
@@ -90,6 +90,7 @@ struct fs {
 	const char * const	*mounts;
 	char			 path[PATH_MAX];
 	bool			 found;
+	bool			 checked;
 	long			 magic;
 };
 
@@ -111,31 +112,37 @@ static struct fs fs__entries[] = {
 		.name	= "sysfs",
 		.mounts	= sysfs__fs_known_mountpoints,
 		.magic	= SYSFS_MAGIC,
+		.checked = false,
 	},
 	[FS__PROCFS] = {
 		.name	= "proc",
 		.mounts	= procfs__known_mountpoints,
 		.magic	= PROC_SUPER_MAGIC,
+		.checked = false,
 	},
 	[FS__DEBUGFS] = {
 		.name	= "debugfs",
 		.mounts	= debugfs__known_mountpoints,
 		.magic	= DEBUGFS_MAGIC,
+		.checked = false,
 	},
 	[FS__TRACEFS] = {
 		.name	= "tracefs",
 		.mounts	= tracefs__known_mountpoints,
 		.magic	= TRACEFS_MAGIC,
+		.checked = false,
 	},
 	[FS__HUGETLBFS] = {
 		.name	= "hugetlbfs",
 		.mounts = hugetlbfs__known_mountpoints,
 		.magic	= HUGETLBFS_MAGIC,
+		.checked = false,
 	},
 	[FS__BPF_FS] = {
 		.name	= "bpf",
 		.mounts = bpf_fs__known_mountpoints,
 		.magic	= BPF_FS_MAGIC,
+		.checked = false,
 	},
 };
 
@@ -158,6 +165,7 @@ static bool fs__read_mounts(struct fs *fs)
 	}
 
 	fclose(fp);
+	fs->checked = true;
 	return fs->found = found;
 }
 
@@ -220,6 +228,7 @@ static bool fs__env_override(struct fs *fs)
 		return false;
 
 	fs->found = true;
+	fs->checked = true;
 	strncpy(fs->path, override_path, sizeof(fs->path) - 1);
 	fs->path[sizeof(fs->path) - 1] = '\0';
 	return true;
@@ -246,6 +255,14 @@ static const char *fs__mountpoint(int idx)
 	if (fs->found)
 		return (const char *)fs->path;
 
+	/* the mount point was already checked for the mount point
+	 * but and did not exist, so return NULL to avoid scanning again.
+	 * This makes the found and not found paths cost equivalent
+	 * in case of multiple calls.
+	 */
+	if (fs->checked)
+		return NULL;
+
 	return fs__get_mountpoint(fs);
 }
 
diff --git a/tools/lib/api/fs/fs.h b/tools/lib/api/fs/fs.h
index 936edb95e1f3..aa222ca30311 100644
--- a/tools/lib/api/fs/fs.h
+++ b/tools/lib/api/fs/fs.h
@@ -18,6 +18,18 @@
 	const char *name##__mount(void);	\
 	bool name##__configured(void);		\
 
+/*
+ * The xxxx__mountpoint() entry points find the first match mount point for each
+ * filesystems listed below, where xxxx is the filesystem type.
+ *
+ * The interface is as follows:
+ *
+ * - If a mount point is found on first call, it is cached and used for all
+ *   subsequent calls.
+ *
+ * - If a mount point is not found, NULL is returned on first call and all
+ *   subsequent calls.
+ */
 FS(sysfs)
 FS(procfs)
 FS(debugfs)
-- 
2.21.1


  parent reply	other threads:[~2020-04-13 16:54 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-13 16:51 [GIT PULL] perf/core fixes and improvements Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 01/26] perf tools: Synthesize bpf_trampoline/dispatcher ksymbol event Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 02/26] perf machine: Set ksymbol dso as loaded on arrival Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 03/26] perf annotate: Add basic support for bpf_image Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 04/26] tools arch x86: Sync the msr-index.h copy with the kernel sources Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 05/26] capabilities: Introduce CAP_PERFMON to kernel and user space Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51   ` Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 06/26] perf/core: Open access to the core for CAP_PERFMON privileged process Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 07/26] perf/core: open access to probes " Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 08/26] perf tools: Support CAP_PERFMON capability Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 09/26] drm/i915/perf: Open access for CAP_PERFMON privileged process Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51   ` Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 10/26] trace/bpf_trace: " Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 11/26] powerpc/perf: open " Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51   ` Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 12/26] parisc/perf: " Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 13/26] drivers/perf: Open " Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51   ` Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 14/26] drivers/oprofile: " Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 15/26] doc/admin-guide: Update perf-security.rst with CAP_PERFMON information Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 16/26] doc/admin-guide: update kernel.rst " Arnaldo Carvalho de Melo
2020-04-13 16:51   ` [Intel-gfx] " Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 17/26] perf python: Check if clang supports -fno-semantic-interposition Arnaldo Carvalho de Melo
2020-04-13 16:51   ` Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 18/26] perf script: Simplify auxiliary event printing functions Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 19/26] perf bench: Add event synthesis benchmark Arnaldo Carvalho de Melo
2020-04-13 16:51 ` Arnaldo Carvalho de Melo [this message]
2020-04-13 16:51 ` [PATCH 21/26] perf synthetic-events: save 4kb from 2 stack frames Arnaldo Carvalho de Melo
2020-04-13 16:51 ` [PATCH 22/26] perf stat: Fix no metric header if --per-socket and --metric-only set Arnaldo Carvalho de Melo
2020-04-13 16:52 ` [PATCH 23/26] perf expr: Add expr_ prefix for parse_ctx and parse_id Arnaldo Carvalho de Melo
2020-04-13 16:52   ` Arnaldo Carvalho de Melo
2020-04-13 16:52   ` Arnaldo Carvalho de Melo
2020-04-13 16:52 ` [PATCH 24/26] perf expr: Add expr_scanner_ctx object Arnaldo Carvalho de Melo
2020-04-13 16:52   ` Arnaldo Carvalho de Melo
2020-04-13 16:52   ` Arnaldo Carvalho de Melo
2020-04-13 16:52 ` [PATCH 25/26] perf metrictroup: Split the metricgroup__add_metric function Arnaldo Carvalho de Melo
2020-04-13 16:52   ` Arnaldo Carvalho de Melo
2020-04-13 16:52   ` Arnaldo Carvalho de Melo
2020-04-13 16:52 ` [PATCH 26/26] perf script: Add flamegraph.py script Arnaldo Carvalho de Melo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200413165203.1816-21-acme@kernel.org \
    --to=acme@kernel.org \
    --cc=acme@redhat.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=andrey.z@gmail.com \
    --cc=eranian@google.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=jolsa@redhat.com \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=tglx@linutronix.de \
    --cc=wangkefeng.wang@huawei.com \
    --cc=williams@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.