linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 00/16] Address some perf memory/data size issues
@ 2023-05-25  7:11 Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 01/16] perf header: Make nodes dynamic in write_mem_topology Ian Rogers
                   ` (15 more replies)
  0 siblings, 16 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Try to reduce the data size of the perf command. Before these patches
a stripped non-debug binary was:

$ size -A perf
perf  :
section                  size       addr
.interp                    28        848
.note.gnu.property         32        880
.note.gnu.build-id         36        912
.note.ABI-tag              32        948
.gnu.hash               24628        984
.dynsym                 88920      25616
.dynstr                 70193     114536
.gnu.version             7410     184730
.gnu.version_r            800     192144
.rela.dyn              460824     192944
.rela.plt               14784     653768
.init                      23     671744
.plt                     9872     671776
.plt.got                   24     681648
.text                 2279182     681680
.noinstr.text             476    2960864
.fini                       9    2961340
.rodata               7042922    2961408
.eh_frame_hdr           42844   10004332
.eh_frame              226496   10047176
.tbss                      48   10279720
.init_array                16   10279720
.fini_array                 8   10279736
.data.rel.ro            53376   10279744
.dynamic                  736   10333120
.got                      328   10333856
.got.plt                 4952   10334184
.data                  391088   10339136
.bss                   285776   10730240
.comment                   31          0
Total                11005894

And after:
perf  :
section                  size       addr
.interp                    28        848
.note.gnu.property         32        880
.note.gnu.build-id         36        912
.note.ABI-tag              32        948
.gnu.hash               24628        984
.dynsym                 88944      25616
.dynstr                 70217     114560
.gnu.version             7412     184778
.gnu.version_r            816     192192
.rela.dyn              460824     193008
.rela.plt               14808     653832
.init                      23     671744
.plt                     9888     671776
.plt.got                   24     681664
.text                 2280446     681696
.noinstr.text             476    2962144
.fini                       9    2962620
.rodata               7048746    2965504
.eh_frame_hdr           42852   10014252
.eh_frame              226568   10057104
.tbss                      48   10285640
.init_array                16   10285640
.fini_array                 8   10285656
.data.rel.ro           301408   10285664
.dynamic                  736   10587072
.got                      328   10587808
.got.plt                 4960   10588136
.data                  100464   10593152
.bss                    22512   10693632
.comment                   31          0
Total                10707320

The binary has reduced in size by 298,574 bytes. The .bss, that
doesn't count toward file size, is reduced by 263,254 bytes. At
runtime this could reduce the footprint up to 561,828 bytes. This is
still just a fraction of the .rodata section's size of 7,048,746
bytes, that mainly contains the converted json events. The .rodata
section needn't all be mapped at the same time.

The changes are largely removing static variables and replacing them
with local or dynamically allocated memory. A common issue was having
paths in statics for the sake of returning a non-stack pointer to a
buffer, so the APIs were changed to pass buffers in.

Ian Rogers (16):
  perf header: Make nodes dynamic in write_mem_topology
  perf test x86: insn-x86 test data is immutable so mark it const
  perf test x86: intel-pt-test data is immutable so mark it const
  perf trace: Make some large static arrays const
  perf trace beauty: Make MSR arrays const
  tools api fs: Avoid large static PATH_MAX arrays
  tools lib api fs tracing_path: Remove two unused MAX_PATH paths
  perf daemon: Dynamically allocate path to perf
  perf lock: Dynamically allocate lockhash_table
  perf timechart: Make large arrays dynamic
  perf probe: Dynamically allocate params memory
  perf path: Make mkpath thread safe
  perf scripting-engines: Move static to local variable
  tools api fs: Dynamically allocate cgroupfs mount point cache
  perf test pmu: Avoid 2 static path arrays
  libsubcmd: Avoid two path statics

 tools/lib/api/fs/cgroup.c                     |  17 ++-
 tools/lib/api/fs/fs.c                         |  25 +++-
 tools/lib/api/fs/tracing_path.c               |  17 +--
 tools/lib/subcmd/exec-cmd.c                   |  35 +++--
 tools/perf/arch/x86/tests/insn-x86.c          |  10 +-
 tools/perf/arch/x86/tests/intel-pt-test.c     |  14 +-
 tools/perf/builtin-config.c                   |   4 +-
 tools/perf/builtin-daemon.c                   |  15 +-
 tools/perf/builtin-help.c                     |   4 +-
 tools/perf/builtin-lock.c                     |  20 ++-
 tools/perf/builtin-probe.c                    | 133 ++++++++++--------
 tools/perf/builtin-timechart.c                |  48 +++++--
 tools/perf/builtin-trace.c                    |  33 +++--
 tools/perf/tests/pmu.c                        |  17 +--
 tools/perf/trace/beauty/beauty.h              |   2 +-
 .../perf/trace/beauty/tracepoints/x86_msr.sh  |   6 +-
 tools/perf/util/cache.h                       |   2 +-
 tools/perf/util/config.c                      |   3 +-
 tools/perf/util/header.c                      |  33 +++--
 tools/perf/util/path.c                        |  35 +----
 .../util/scripting-engines/trace-event-perl.c |   4 +-
 .../scripting-engines/trace-event-python.c    |   5 +-
 22 files changed, 278 insertions(+), 204 deletions(-)

-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v1 01/16] perf header: Make nodes dynamic in write_mem_topology
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25 19:14   ` Namhyung Kim
  2023-05-25  7:11 ` [PATCH v1 02/16] perf test x86: insn-x86 test data is immutable so mark it const Ian Rogers
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Avoid a large static array, dynamically allocate the nodes avoiding a
hard coded limited as well.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/header.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 2dde3ca20de5..80593ed8c79b 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -24,6 +24,7 @@
 #include <bpf/libbpf.h>
 #endif
 #include <perf/cpumap.h>
+#include <tools/libc_compat.h> // reallocarray
 
 #include "dso.h"
 #include "evlist.h"
@@ -1396,13 +1397,14 @@ static int memory_node__sort(const void *a, const void *b)
 	return na->node - nb->node;
 }
 
-static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
+static int build_mem_topology(struct memory_node **nodesp, u64 *cntp)
 {
 	char path[PATH_MAX];
 	struct dirent *ent;
 	DIR *dir;
-	u64 cnt = 0;
 	int ret = 0;
+	size_t cnt = 0, size = 0;
+	struct memory_node *nodes = NULL;
 
 	scnprintf(path, PATH_MAX, "%s/devices/system/node/",
 		  sysfs__mountpoint());
@@ -1426,16 +1428,24 @@ static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
 		if (r != 1)
 			continue;
 
-		if (WARN_ONCE(cnt >= size,
-			"failed to write MEM_TOPOLOGY, way too many nodes\n")) {
-			closedir(dir);
-			return -1;
-		}
+		if (cnt >= size) {
+			struct memory_node *new_nodes =
+				reallocarray(nodes, cnt + 4, sizeof(*nodes));
 
+			if (!new_nodes) {
+				pr_err("Failed to write MEM_TOPOLOGY, size %zd nodes\n", size);
+				free(nodes);
+				closedir(dir);
+				return -ENOMEM;
+			}
+			nodes = new_nodes;
+			size += 4;
+		}
 		ret = memory_node__read(&nodes[cnt++], idx);
 	}
 
 	*cntp = cnt;
+	*nodesp = nodes;
 	closedir(dir);
 
 	if (!ret)
@@ -1444,8 +1454,6 @@ static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
 	return ret;
 }
 
-#define MAX_MEMORY_NODES 2000
-
 /*
  * The MEM_TOPOLOGY holds physical memory map for every
  * node in system. The format of data is as follows:
@@ -1464,8 +1472,8 @@ static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
 static int write_mem_topology(struct feat_fd *ff __maybe_unused,
 			      struct evlist *evlist __maybe_unused)
 {
-	static struct memory_node nodes[MAX_MEMORY_NODES];
-	u64 bsize, version = 1, i, nr;
+	struct memory_node *nodes = NULL;
+	u64 bsize, version = 1, i, nr = 0;
 	int ret;
 
 	ret = sysfs__read_xll("devices/system/memory/block_size_bytes",
@@ -1473,7 +1481,7 @@ static int write_mem_topology(struct feat_fd *ff __maybe_unused,
 	if (ret)
 		return ret;
 
-	ret = build_mem_topology(&nodes[0], MAX_MEMORY_NODES, &nr);
+	ret = build_mem_topology(&nodes, &nr);
 	if (ret)
 		return ret;
 
@@ -1508,6 +1516,7 @@ static int write_mem_topology(struct feat_fd *ff __maybe_unused,
 	}
 
 out:
+	free(nodes);
 	return ret;
 }
 
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 02/16] perf test x86: insn-x86 test data is immutable so mark it const
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 01/16] perf header: Make nodes dynamic in write_mem_topology Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 03/16] perf test x86: intel-pt-test " Ian Rogers
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

This allows the movement of some sizeable data arrays (168,624bytes)
to .data.relro. Without PIE or the strings it could be moved to
.rodata.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/arch/x86/tests/insn-x86.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/arch/x86/tests/insn-x86.c b/tools/perf/arch/x86/tests/insn-x86.c
index 735257d205b5..7b5eb8baf0f2 100644
--- a/tools/perf/arch/x86/tests/insn-x86.c
+++ b/tools/perf/arch/x86/tests/insn-x86.c
@@ -18,14 +18,14 @@ struct test_data {
 	const char *asm_rep;
 };
 
-struct test_data test_data_32[] = {
+const struct test_data test_data_32[] = {
 #include "insn-x86-dat-32.c"
 	{{0x0f, 0x01, 0xee}, 3, 0, NULL, NULL, "0f 01 ee             \trdpkru"},
 	{{0x0f, 0x01, 0xef}, 3, 0, NULL, NULL, "0f 01 ef             \twrpkru"},
 	{{0}, 0, 0, NULL, NULL, NULL},
 };
 
-struct test_data test_data_64[] = {
+const struct test_data test_data_64[] = {
 #include "insn-x86-dat-64.c"
 	{{0x0f, 0x01, 0xee}, 3, 0, NULL, NULL, "0f 01 ee             \trdpkru"},
 	{{0x0f, 0x01, 0xef}, 3, 0, NULL, NULL, "0f 01 ef             \twrpkru"},
@@ -97,7 +97,7 @@ static int get_branch(const char *branch_str)
 	return -1;
 }
 
-static int test_data_item(struct test_data *dat, int x86_64)
+static int test_data_item(const struct test_data *dat, int x86_64)
 {
 	struct intel_pt_insn intel_pt_insn;
 	int op, branch, ret;
@@ -147,9 +147,9 @@ static int test_data_item(struct test_data *dat, int x86_64)
 	return 0;
 }
 
-static int test_data_set(struct test_data *dat_set, int x86_64)
+static int test_data_set(const struct test_data *dat_set, int x86_64)
 {
-	struct test_data *dat;
+	const struct test_data *dat;
 	int ret = 0;
 
 	for (dat = dat_set; dat->expected_length; dat++) {
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 03/16] perf test x86: intel-pt-test data is immutable so mark it const
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 01/16] perf header: Make nodes dynamic in write_mem_topology Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 02/16] perf test x86: insn-x86 test data is immutable so mark it const Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 04/16] perf trace: Make some large static arrays const Ian Rogers
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

This allows the movement of 5,808 bytes from .data to .rodata.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/arch/x86/tests/intel-pt-test.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/perf/arch/x86/tests/intel-pt-test.c b/tools/perf/arch/x86/tests/intel-pt-test.c
index 70b7f79396b1..09d61fa736e3 100644
--- a/tools/perf/arch/x86/tests/intel-pt-test.c
+++ b/tools/perf/arch/x86/tests/intel-pt-test.c
@@ -22,7 +22,7 @@
  * @new_ctx: expected new packet context
  * @ctx_unchanged: the packet context must not change
  */
-static struct test_data {
+static const struct test_data {
 	int len;
 	u8 bytes[INTEL_PT_PKT_MAX_SZ];
 	enum intel_pt_pkt_ctx ctx;
@@ -186,7 +186,7 @@ static struct test_data {
 	{0, {0}, 0, {0, 0, 0}, 0, 0 },
 };
 
-static int dump_packet(struct intel_pt_pkt *packet, u8 *bytes, int len)
+static int dump_packet(const struct intel_pt_pkt *packet, const u8 *bytes, int len)
 {
 	char desc[INTEL_PT_PKT_DESC_MAX];
 	int ret, i;
@@ -206,14 +206,14 @@ static int dump_packet(struct intel_pt_pkt *packet, u8 *bytes, int len)
 	return TEST_OK;
 }
 
-static void decoding_failed(struct test_data *d)
+static void decoding_failed(const struct test_data *d)
 {
 	pr_debug("Decoding failed!\n");
 	pr_debug("Decoding:  ");
 	dump_packet(&d->packet, d->bytes, d->len);
 }
 
-static int fail(struct test_data *d, struct intel_pt_pkt *packet, int len,
+static int fail(const struct test_data *d, struct intel_pt_pkt *packet, int len,
 		enum intel_pt_pkt_ctx new_ctx)
 {
 	decoding_failed(d);
@@ -242,7 +242,7 @@ static int fail(struct test_data *d, struct intel_pt_pkt *packet, int len,
 	return TEST_FAIL;
 }
 
-static int test_ctx_unchanged(struct test_data *d, struct intel_pt_pkt *packet,
+static int test_ctx_unchanged(const struct test_data *d, struct intel_pt_pkt *packet,
 			      enum intel_pt_pkt_ctx ctx)
 {
 	enum intel_pt_pkt_ctx old_ctx = ctx;
@@ -258,7 +258,7 @@ static int test_ctx_unchanged(struct test_data *d, struct intel_pt_pkt *packet,
 	return TEST_OK;
 }
 
-static int test_one(struct test_data *d)
+static int test_one(const struct test_data *d)
 {
 	struct intel_pt_pkt packet;
 	enum intel_pt_pkt_ctx ctx = d->ctx;
@@ -307,7 +307,7 @@ static int test_one(struct test_data *d)
  */
 int test__intel_pt_pkt_decoder(struct test_suite *test __maybe_unused, int subtest __maybe_unused)
 {
-	struct test_data *d = data;
+	const struct test_data *d = data;
 	int ret;
 
 	for (d = data; d->len; d++) {
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 04/16] perf trace: Make some large static arrays const
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (2 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 03/16] perf test x86: intel-pt-test " Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 05/16] perf trace beauty: Make MSR " Ian Rogers
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Allows the movement of 33,128 bytes from .data to .data.rel.ro.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-trace.c | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index b49d3abb1203..62c7c99a0fe4 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -914,7 +914,7 @@ static size_t syscall_arg__scnprintf_getrandom_flags(char *bf, size_t size,
 #include "trace/beauty/socket_type.c"
 #include "trace/beauty/waitid_options.c"
 
-static struct syscall_fmt syscall_fmts[] = {
+static const struct syscall_fmt syscall_fmts[] = {
 	{ .name	    = "access",
 	  .arg = { [1] = { .scnprintf = SCA_ACCMODE,  /* mode */ }, }, },
 	{ .name	    = "arch_prctl",
@@ -1176,18 +1176,21 @@ static int syscall_fmt__cmp(const void *name, const void *fmtp)
 	return strcmp(name, fmt->name);
 }
 
-static struct syscall_fmt *__syscall_fmt__find(struct syscall_fmt *fmts, const int nmemb, const char *name)
+static const struct syscall_fmt *__syscall_fmt__find(const struct syscall_fmt *fmts,
+						     const int nmemb,
+						     const char *name)
 {
 	return bsearch(name, fmts, nmemb, sizeof(struct syscall_fmt), syscall_fmt__cmp);
 }
 
-static struct syscall_fmt *syscall_fmt__find(const char *name)
+static const struct syscall_fmt *syscall_fmt__find(const char *name)
 {
 	const int nmemb = ARRAY_SIZE(syscall_fmts);
 	return __syscall_fmt__find(syscall_fmts, nmemb, name);
 }
 
-static struct syscall_fmt *__syscall_fmt__find_by_alias(struct syscall_fmt *fmts, const int nmemb, const char *alias)
+static const struct syscall_fmt *__syscall_fmt__find_by_alias(const struct syscall_fmt *fmts,
+							      const int nmemb, const char *alias)
 {
 	int i;
 
@@ -1199,7 +1202,7 @@ static struct syscall_fmt *__syscall_fmt__find_by_alias(struct syscall_fmt *fmts
 	return NULL;
 }
 
-static struct syscall_fmt *syscall_fmt__find_by_alias(const char *alias)
+static const struct syscall_fmt *syscall_fmt__find_by_alias(const char *alias)
 {
 	const int nmemb = ARRAY_SIZE(syscall_fmts);
 	return __syscall_fmt__find_by_alias(syscall_fmts, nmemb, alias);
@@ -1224,7 +1227,7 @@ struct syscall {
 	bool		    nonexistent;
 	struct tep_format_field *args;
 	const char	    *name;
-	struct syscall_fmt  *fmt;
+	const struct syscall_fmt  *fmt;
 	struct syscall_arg_fmt *arg_fmt;
 };
 
@@ -1673,7 +1676,7 @@ static int syscall__alloc_arg_fmts(struct syscall *sc, int nr_args)
 	return 0;
 }
 
-static struct syscall_arg_fmt syscall_arg_fmts__by_name[] = {
+static const struct syscall_arg_fmt syscall_arg_fmts__by_name[] = {
 	{ .name = "msr",	.scnprintf = SCA_X86_MSR,	  .strtoul = STUL_X86_MSR,	   },
 	{ .name = "vector",	.scnprintf = SCA_X86_IRQ_VECTORS, .strtoul = STUL_X86_IRQ_VECTORS, },
 };
@@ -1684,13 +1687,14 @@ static int syscall_arg_fmt__cmp(const void *name, const void *fmtp)
        return strcmp(name, fmt->name);
 }
 
-static struct syscall_arg_fmt *
-__syscall_arg_fmt__find_by_name(struct syscall_arg_fmt *fmts, const int nmemb, const char *name)
+static const struct syscall_arg_fmt *
+__syscall_arg_fmt__find_by_name(const struct syscall_arg_fmt *fmts, const int nmemb,
+				const char *name)
 {
        return bsearch(name, fmts, nmemb, sizeof(struct syscall_arg_fmt), syscall_arg_fmt__cmp);
 }
 
-static struct syscall_arg_fmt *syscall_arg_fmt__find_by_name(const char *name)
+static const struct syscall_arg_fmt *syscall_arg_fmt__find_by_name(const char *name)
 {
        const int nmemb = ARRAY_SIZE(syscall_arg_fmts__by_name);
        return __syscall_arg_fmt__find_by_name(syscall_arg_fmts__by_name, nmemb, name);
@@ -1735,8 +1739,9 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
 			 * 7 unsigned long
 			 */
 			arg->scnprintf = SCA_FD;
-               } else {
-			struct syscall_arg_fmt *fmt = syscall_arg_fmt__find_by_name(field->name);
+		} else {
+			const struct syscall_arg_fmt *fmt =
+				syscall_arg_fmt__find_by_name(field->name);
 
 			if (fmt) {
 				arg->scnprintf = fmt->scnprintf;
@@ -4458,7 +4463,7 @@ static void evsel__set_syscall_arg_fmt(struct evsel *evsel, const char *name)
 	struct syscall_arg_fmt *fmt = evsel__syscall_arg_fmt(evsel);
 
 	if (fmt) {
-		struct syscall_fmt *scfmt = syscall_fmt__find(name);
+		const struct syscall_fmt *scfmt = syscall_fmt__find(name);
 
 		if (scfmt) {
 			int skip = 0;
@@ -4525,7 +4530,7 @@ static int trace__parse_events_option(const struct option *opt, const char *str,
 	int len = strlen(str) + 1, err = -1, list, idx;
 	char *strace_groups_dir = system_path(STRACE_GROUPS_DIR);
 	char group_name[PATH_MAX];
-	struct syscall_fmt *fmt;
+	const struct syscall_fmt *fmt;
 
 	if (strace_groups_dir == NULL)
 		return -1;
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 05/16] perf trace beauty: Make MSR arrays const
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (3 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 04/16] perf trace: Make some large static arrays const Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 06/16] tools api fs: Avoid large static PATH_MAX arrays Ian Rogers
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Allows the movement of 46,072 bytes from .data to .data.rel.ro.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/trace/beauty/beauty.h               | 2 +-
 tools/perf/trace/beauty/tracepoints/x86_msr.sh | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/trace/beauty/beauty.h b/tools/perf/trace/beauty/beauty.h
index 4c59edddd6a8..3d12bf0f6d07 100644
--- a/tools/perf/trace/beauty/beauty.h
+++ b/tools/perf/trace/beauty/beauty.h
@@ -11,7 +11,7 @@ struct strarray {
 	u64	    offset;
 	int	    nr_entries;
 	const char *prefix;
-	const char **entries;
+	const char * const *entries;
 };
 
 #define DEFINE_STRARRAY(array, _prefix) struct strarray strarray__##array = { \
diff --git a/tools/perf/trace/beauty/tracepoints/x86_msr.sh b/tools/perf/trace/beauty/tracepoints/x86_msr.sh
index 0078689963e0..fa3c4418e856 100755
--- a/tools/perf/trace/beauty/tracepoints/x86_msr.sh
+++ b/tools/perf/trace/beauty/tracepoints/x86_msr.sh
@@ -13,7 +13,7 @@ x86_msr_index=${arch_x86_header_dir}/msr-index.h
 # Just the ones starting with 0x00000 so as to have a simple
 # array.
 
-printf "static const char *x86_MSRs[] = {\n"
+printf "static const char * const x86_MSRs[] = {\n"
 regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+MSR_([[:alnum:]][[:alnum:]_]+)[[:space:]]+(0x00000[[:xdigit:]]+)[[:space:]]*.*'
 grep -E $regex ${x86_msr_index} | grep -E -v 'MSR_(ATOM|P[46]|IA32_(TSC_DEADLINE|UCODE_REV)|IDT_FCR4)' | \
 	sed -r "s/$regex/\2 \1/g" | sort -n | \
@@ -24,7 +24,7 @@ printf "};\n\n"
 regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+MSR_([[:alnum:]][[:alnum:]_]+)[[:space:]]+(0xc0000[[:xdigit:]]+)[[:space:]]*.*'
 printf "#define x86_64_specific_MSRs_offset "
 grep -E $regex ${x86_msr_index} | sed -r "s/$regex/\2/g" | sort -n | head -1
-printf "static const char *x86_64_specific_MSRs[] = {\n"
+printf "static const char * const x86_64_specific_MSRs[] = {\n"
 grep -E $regex ${x86_msr_index} | \
 	sed -r "s/$regex/\2 \1/g" | grep -E -vw 'K6_WHCR' | sort -n | \
 	xargs printf "\t[%s - x86_64_specific_MSRs_offset] = \"%s\",\n"
@@ -33,7 +33,7 @@ printf "};\n\n"
 regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+MSR_([[:alnum:]][[:alnum:]_]+)[[:space:]]+(0xc0010[[:xdigit:]]+)[[:space:]]*.*'
 printf "#define x86_AMD_V_KVM_MSRs_offset "
 grep -E $regex ${x86_msr_index} | sed -r "s/$regex/\2/g" | sort -n | head -1
-printf "static const char *x86_AMD_V_KVM_MSRs[] = {\n"
+printf "static const char * const x86_AMD_V_KVM_MSRs[] = {\n"
 grep -E $regex ${x86_msr_index} | \
 	sed -r "s/$regex/\2 \1/g" | sort -n | \
 	xargs printf "\t[%s - x86_AMD_V_KVM_MSRs_offset] = \"%s\",\n"
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 06/16] tools api fs: Avoid large static PATH_MAX arrays
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (4 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 05/16] perf trace beauty: Make MSR " Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 07/16] tools lib api fs tracing_path: Remove two unused MAX_PATH paths Ian Rogers
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Change struct fs to have a pointer to a dynamically allocated array
rather than an array. This reduces the size of fs__entries from 24,768
bytes to 240 bytes. Read paths into a stack allocated array and
strdup. Fix off-by-1 fscanf %<num>s in fs__read_mounts caught by
address sanitizer.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/api/fs/fs.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/tools/lib/api/fs/fs.c b/tools/lib/api/fs/fs.c
index 82f53d81a7a7..22d34a0be8b4 100644
--- a/tools/lib/api/fs/fs.c
+++ b/tools/lib/api/fs/fs.c
@@ -88,7 +88,7 @@ static const char * const bpf_fs__known_mountpoints[] = {
 struct fs {
 	const char		*name;
 	const char * const	*mounts;
-	char			 path[PATH_MAX];
+	char			*path;
 	bool			 found;
 	bool			 checked;
 	long			 magic;
@@ -151,17 +151,23 @@ static bool fs__read_mounts(struct fs *fs)
 	bool found = false;
 	char type[100];
 	FILE *fp;
+	char path[PATH_MAX + 1];
 
 	fp = fopen("/proc/mounts", "r");
 	if (fp == NULL)
-		return NULL;
+		return false;
 
 	while (!found &&
 	       fscanf(fp, "%*s %" STR(PATH_MAX) "s %99s %*s %*d %*d\n",
-		      fs->path, type) == 2) {
+		      path, type) == 2) {
 
-		if (strcmp(type, fs->name) == 0)
+		if (strcmp(type, fs->name) == 0) {
+			free(fs->path);
+			fs->path = strdup(path);
+			if (!fs->path)
+				return false;
 			found = true;
+		}
 	}
 
 	fclose(fp);
@@ -188,8 +194,11 @@ static bool fs__check_mounts(struct fs *fs)
 	ptr = fs->mounts;
 	while (*ptr) {
 		if (fs__valid_mount(*ptr, fs->magic) == 0) {
+			free(fs->path);
+			fs->path = strdup(*ptr);
+			if (!fs->path)
+				return false;
 			fs->found = true;
-			strcpy(fs->path, *ptr);
 			return true;
 		}
 		ptr++;
@@ -227,10 +236,12 @@ static bool fs__env_override(struct fs *fs)
 	if (!override_path)
 		return false;
 
+	free(fs->path);
+	fs->path = strdup(override_path);
+	if (!fs->path)
+		return false;
 	fs->found = true;
 	fs->checked = true;
-	strncpy(fs->path, override_path, sizeof(fs->path) - 1);
-	fs->path[sizeof(fs->path) - 1] = '\0';
 	return true;
 }
 
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 07/16] tools lib api fs tracing_path: Remove two unused MAX_PATH paths
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (5 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 06/16] tools api fs: Avoid large static PATH_MAX arrays Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 08/16] perf daemon: Dynamically allocate path to perf Ian Rogers
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

tracing_mnt was set but never written. tracing_events_path was set and
read on errors paths, but its value is exactly tracing_path with a
"/events" appended, so we can derive the value in the error
paths. There appears to have been a missing "/" when
tracing_events_path was initialized.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/api/fs/tracing_path.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/tools/lib/api/fs/tracing_path.c b/tools/lib/api/fs/tracing_path.c
index 7ba3e81274e8..30745f35d0d2 100644
--- a/tools/lib/api/fs/tracing_path.c
+++ b/tools/lib/api/fs/tracing_path.c
@@ -13,17 +13,12 @@
 
 #include "tracing_path.h"
 
-static char tracing_mnt[PATH_MAX]  = "/sys/kernel/debug";
 static char tracing_path[PATH_MAX]        = "/sys/kernel/tracing";
-static char tracing_events_path[PATH_MAX] = "/sys/kernel/tracing/events";
 
 static void __tracing_path_set(const char *tracing, const char *mountpoint)
 {
-	snprintf(tracing_mnt, sizeof(tracing_mnt), "%s", mountpoint);
 	snprintf(tracing_path, sizeof(tracing_path), "%s/%s",
 		 mountpoint, tracing);
-	snprintf(tracing_events_path, sizeof(tracing_events_path), "%s/%s%s",
-		 mountpoint, tracing, "events");
 }
 
 static const char *tracing_path_tracefs_mount(void)
@@ -149,15 +144,15 @@ int tracing_path__strerror_open_tp(int err, char *buf, size_t size,
 			/* sdt markers */
 			if (!strncmp(filename, "sdt_", 4)) {
 				snprintf(buf, size,
-					"Error:\tFile %s/%s not found.\n"
+					"Error:\tFile %s/events/%s not found.\n"
 					"Hint:\tSDT event cannot be directly recorded on.\n"
 					"\tPlease first use 'perf probe %s:%s' before recording it.\n",
-					tracing_events_path, filename, sys, name);
+					tracing_path, filename, sys, name);
 			} else {
 				snprintf(buf, size,
-					 "Error:\tFile %s/%s not found.\n"
+					 "Error:\tFile %s/events/%s not found.\n"
 					 "Hint:\tPerhaps this kernel misses some CONFIG_ setting to enable this feature?.\n",
-					 tracing_events_path, filename);
+					 tracing_path, filename);
 			}
 			break;
 		}
@@ -169,9 +164,9 @@ int tracing_path__strerror_open_tp(int err, char *buf, size_t size,
 		break;
 	case EACCES: {
 		snprintf(buf, size,
-			 "Error:\tNo permissions to read %s/%s\n"
+			 "Error:\tNo permissions to read %s/events/%s\n"
 			 "Hint:\tTry 'sudo mount -o remount,mode=755 %s'\n",
-			 tracing_events_path, filename, tracing_path_mount());
+			 tracing_path, filename, tracing_path_mount());
 	}
 		break;
 	default:
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 08/16] perf daemon: Dynamically allocate path to perf
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (6 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 07/16] tools lib api fs tracing_path: Remove two unused MAX_PATH paths Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25 19:17   ` Namhyung Kim
  2023-05-25  7:11 ` [PATCH v1 09/16] perf lock: Dynamically allocate lockhash_table Ian Rogers
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Avoid a PATH_MAX array in __daemon (the .data section) by dynamically
allocating the memory.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-daemon.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-daemon.c b/tools/perf/builtin-daemon.c
index 34cbe3e959aa..adb5751c3ed0 100644
--- a/tools/perf/builtin-daemon.c
+++ b/tools/perf/builtin-daemon.c
@@ -90,7 +90,7 @@ struct daemon {
 	char			*base;
 	struct list_head	 sessions;
 	FILE			*out;
-	char			 perf[PATH_MAX];
+	char			*perf;
 	int			 signal_fd;
 	time_t			 start;
 };
@@ -1490,6 +1490,15 @@ static int __cmd_ping(struct daemon *daemon, struct option parent_options[],
 	return send_cmd(daemon, &cmd);
 }
 
+static char *set_perf_exe(void)
+{
+	char path[PATH_MAX];
+
+	perf_exe(path, sizeof(path));
+	__daemon.perf = strdup(path);
+	return __daemon.perf;
+}
+
 int cmd_daemon(int argc, const char **argv)
 {
 	struct option daemon_options[] = {
@@ -1503,7 +1512,9 @@ int cmd_daemon(int argc, const char **argv)
 		OPT_END()
 	};
 
-	perf_exe(__daemon.perf, sizeof(__daemon.perf));
+	if (!set_perf_exe())
+		return -ENOMEM;
+
 	__daemon.out = stdout;
 
 	argc = parse_options(argc, argv, daemon_options, daemon_usage,
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 09/16] perf lock: Dynamically allocate lockhash_table
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (7 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 08/16] perf daemon: Dynamically allocate path to perf Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 10/16] perf timechart: Make large arrays dynamic Ian Rogers
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

lockhash_table is 32,768bytes in .bss, make it a memory allocation so
that the space is freed for non-lock perf commands.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-lock.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 70b14ba5fdd5..fc8356bd6e3a 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -48,7 +48,7 @@ static struct target target;
 #define LOCKHASH_BITS		12
 #define LOCKHASH_SIZE		(1UL << LOCKHASH_BITS)
 
-static struct hlist_head lockhash_table[LOCKHASH_SIZE];
+static struct hlist_head *lockhash_table;
 
 #define __lockhashfn(key)	hash_long((unsigned long)key, LOCKHASH_BITS)
 #define lockhashentry(key)	(lockhash_table + __lockhashfn((key)))
@@ -1871,7 +1871,6 @@ static int __cmd_contention(int argc, const char **argv)
 	};
 	struct lock_contention con = {
 		.target = &target,
-		.result = &lockhash_table[0],
 		.map_nr_entries = bpf_map_entries,
 		.max_stack = max_stack_depth,
 		.stack_skip = stack_skip,
@@ -1880,10 +1879,17 @@ static int __cmd_contention(int argc, const char **argv)
 		.owner = show_lock_owner,
 	};
 
+	lockhash_table = calloc(LOCKHASH_SIZE, sizeof(*lockhash_table));
+	if (!lockhash_table)
+		return -ENOMEM;
+
+	con.result = &lockhash_table[0];
+
 	session = perf_session__new(use_bpf ? NULL : &data, &eops);
 	if (IS_ERR(session)) {
 		pr_err("Initializing perf session failed\n");
-		return PTR_ERR(session);
+		err = PTR_ERR(session);
+		goto out_delete;
 	}
 
 	con.machine = &session->machines.host;
@@ -1983,6 +1989,7 @@ static int __cmd_contention(int argc, const char **argv)
 	evlist__delete(con.evlist);
 	lock_contention_finish();
 	perf_session__delete(session);
+	zfree(&lockhash_table);
 	return err;
 }
 
@@ -2348,6 +2355,10 @@ int cmd_lock(int argc, const char **argv)
 	unsigned int i;
 	int rc = 0;
 
+	lockhash_table = calloc(LOCKHASH_SIZE, sizeof(*lockhash_table));
+	if (!lockhash_table)
+		return -ENOMEM;
+
 	for (i = 0; i < LOCKHASH_SIZE; i++)
 		INIT_HLIST_HEAD(lockhash_table + i);
 
@@ -2369,7 +2380,7 @@ int cmd_lock(int argc, const char **argv)
 		rc = __cmd_report(false);
 	} else if (!strcmp(argv[0], "script")) {
 		/* Aliased to 'perf script' */
-		return cmd_script(argc, argv);
+		rc = cmd_script(argc, argv);
 	} else if (!strcmp(argv[0], "info")) {
 		if (argc) {
 			argc = parse_options(argc, argv,
@@ -2403,5 +2414,6 @@ int cmd_lock(int argc, const char **argv)
 		usage_with_options(lock_usage, lock_options);
 	}
 
+	zfree(&lockhash_table);
 	return rc;
 }
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 10/16] perf timechart: Make large arrays dynamic
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (8 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 09/16] perf lock: Dynamically allocate lockhash_table Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 11/16] perf probe: Dynamically allocate params memory Ian Rogers
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Allocate start time and state arrays when command starts rather than
using 114,688 bytes in .bss.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-timechart.c | 48 +++++++++++++++++++++++++++-------
 1 file changed, 39 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c
index bce1cf896f9c..829d99fecfd0 100644
--- a/tools/perf/builtin-timechart.c
+++ b/tools/perf/builtin-timechart.c
@@ -315,10 +315,10 @@ static void pid_put_sample(struct timechart *tchart, int pid, int type,
 
 #define MAX_CPUS 4096
 
-static u64 cpus_cstate_start_times[MAX_CPUS];
-static int cpus_cstate_state[MAX_CPUS];
-static u64 cpus_pstate_start_times[MAX_CPUS];
-static u64 cpus_pstate_state[MAX_CPUS];
+static u64 *cpus_cstate_start_times;
+static int *cpus_cstate_state;
+static u64 *cpus_pstate_start_times;
+static u64 *cpus_pstate_state;
 
 static int process_comm_event(struct perf_tool *tool,
 			      union perf_event *event,
@@ -1981,12 +1981,34 @@ int cmd_timechart(int argc, const char **argv)
 		"perf timechart record [<options>]",
 		NULL
 	};
+	int ret;
+
+	cpus_cstate_start_times = calloc(MAX_CPUS, sizeof(*cpus_cstate_start_times));
+	if (!cpus_cstate_start_times)
+		return -ENOMEM;
+	cpus_cstate_state = calloc(MAX_CPUS, sizeof(*cpus_cstate_state));
+	if (!cpus_cstate_state) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	cpus_pstate_start_times = calloc(MAX_CPUS, sizeof(*cpus_pstate_start_times));
+	if (!cpus_pstate_start_times) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	cpus_pstate_state = calloc(MAX_CPUS, sizeof(*cpus_pstate_state));
+	if (!cpus_pstate_state) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
 	argc = parse_options_subcommand(argc, argv, timechart_options, timechart_subcommands,
 			timechart_usage, PARSE_OPT_STOP_AT_NON_OPTION);
 
 	if (tchart.power_only && tchart.tasks_only) {
 		pr_err("-P and -T options cannot be used at the same time.\n");
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	if (argc && strlen(argv[0]) > 2 && strstarts("record", argv[0])) {
@@ -1996,17 +2018,25 @@ int cmd_timechart(int argc, const char **argv)
 
 		if (tchart.power_only && tchart.tasks_only) {
 			pr_err("-P and -T options cannot be used at the same time.\n");
-			return -1;
+			ret = -1;
+			goto out;
 		}
 
 		if (tchart.io_only)
-			return timechart__io_record(argc, argv);
+			ret = timechart__io_record(argc, argv);
 		else
-			return timechart__record(&tchart, argc, argv);
+			ret = timechart__record(&tchart, argc, argv);
+		goto out;
 	} else if (argc)
 		usage_with_options(timechart_usage, timechart_options);
 
 	setup_pager();
 
-	return __cmd_timechart(&tchart, output_name);
+	ret = __cmd_timechart(&tchart, output_name);
+out:
+	zfree(&cpus_cstate_start_times);
+	zfree(&cpus_cstate_state);
+	zfree(&cpus_pstate_start_times);
+	zfree(&cpus_pstate_state);
+	return ret;
 }
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 11/16] perf probe: Dynamically allocate params memory
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (9 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 10/16] perf timechart: Make large arrays dynamic Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 12/16] perf path: Make mkpath thread safe Ian Rogers
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Avoid 14,432 bytes in .bss by dynamically allocating params.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-probe.c | 133 ++++++++++++++++++++-----------------
 1 file changed, 71 insertions(+), 62 deletions(-)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index 4df05b992093..019fef8da6a8 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -47,29 +47,29 @@ static struct {
 	char *target;
 	struct strfilter *filter;
 	struct nsinfo *nsi;
-} params;
+} *params;
 
 /* Parse an event definition. Note that any error must die. */
 static int parse_probe_event(const char *str)
 {
-	struct perf_probe_event *pev = &params.events[params.nevents];
+	struct perf_probe_event *pev = &params->events[params->nevents];
 	int ret;
 
-	pr_debug("probe-definition(%d): %s\n", params.nevents, str);
-	if (++params.nevents == MAX_PROBES) {
+	pr_debug("probe-definition(%d): %s\n", params->nevents, str);
+	if (++params->nevents == MAX_PROBES) {
 		pr_err("Too many probes (> %d) were specified.", MAX_PROBES);
 		return -1;
 	}
 
-	pev->uprobes = params.uprobes;
-	if (params.target) {
-		pev->target = strdup(params.target);
+	pev->uprobes = params->uprobes;
+	if (params->target) {
+		pev->target = strdup(params->target);
 		if (!pev->target)
 			return -ENOMEM;
-		params.target_used = true;
+		params->target_used = true;
 	}
 
-	pev->nsi = nsinfo__get(params.nsi);
+	pev->nsi = nsinfo__get(params->nsi);
 
 	/* Parse a perf-probe command into event */
 	ret = parse_perf_probe_command(str, pev);
@@ -84,12 +84,12 @@ static int params_add_filter(const char *str)
 	int ret = 0;
 
 	pr_debug2("Add filter: %s\n", str);
-	if (!params.filter) {
-		params.filter = strfilter__new(str, &err);
-		if (!params.filter)
+	if (!params->filter) {
+		params->filter = strfilter__new(str, &err);
+		if (!params->filter)
 			ret = err ? -EINVAL : -ENOMEM;
 	} else
-		ret = strfilter__or(params.filter, str, &err);
+		ret = strfilter__or(params->filter, str, &err);
 
 	if (ret == -EINVAL) {
 		pr_err("Filter parse error at %td.\n", err - str + 1);
@@ -112,17 +112,17 @@ static int set_target(const char *ptr)
 	 * TODO: Support relative path, and $PATH, $LD_LIBRARY_PATH,
 	 * short module name.
 	 */
-	if (!params.target && ptr && *ptr == '/') {
-		params.target = strdup(ptr);
-		if (!params.target)
+	if (!params->target && ptr && *ptr == '/') {
+		params->target = strdup(ptr);
+		if (!params->target)
 			return -ENOMEM;
-		params.target_used = false;
+		params->target_used = false;
 
 		found = 1;
 		buf = ptr + (strlen(ptr) - 3);
 
 		if (strcmp(buf, ".ko"))
-			params.uprobes = true;
+			params->uprobes = true;
 
 	}
 
@@ -172,15 +172,15 @@ static int opt_set_target(const struct option *opt, const char *str,
 
 	if  (str) {
 		if (!strcmp(opt->long_name, "exec"))
-			params.uprobes = true;
+			params->uprobes = true;
 		else if (!strcmp(opt->long_name, "module"))
-			params.uprobes = false;
+			params->uprobes = false;
 		else
 			return ret;
 
 		/* Expand given path to absolute path, except for modulename */
-		if (params.uprobes || strchr(str, '/')) {
-			tmp = nsinfo__realpath(str, params.nsi);
+		if (params->uprobes || strchr(str, '/')) {
+			tmp = nsinfo__realpath(str, params->nsi);
 			if (!tmp) {
 				pr_warning("Failed to get the absolute path of %s: %m\n", str);
 				return ret;
@@ -190,9 +190,9 @@ static int opt_set_target(const struct option *opt, const char *str,
 			if (!tmp)
 				return -ENOMEM;
 		}
-		free(params.target);
-		params.target = tmp;
-		params.target_used = false;
+		free(params->target);
+		params->target = tmp;
+		params->target_used = false;
 		ret = 0;
 	}
 
@@ -217,7 +217,7 @@ static int opt_set_target_ns(const struct option *opt __maybe_unused,
 		}
 		nsip = nsinfo__new(ns_pid);
 		if (nsip && nsinfo__need_setns(nsip))
-			params.nsi = nsinfo__get(nsip);
+			params->nsi = nsinfo__get(nsip);
 		nsinfo__put(nsip);
 
 		ret = 0;
@@ -238,14 +238,14 @@ static int opt_show_lines(const struct option *opt,
 	if (!str)
 		return 0;
 
-	if (params.command == 'L') {
+	if (params->command == 'L') {
 		pr_warning("Warning: more than one --line options are"
 			   " detected. Only the first one is valid.\n");
 		return 0;
 	}
 
-	params.command = opt->short_name;
-	ret = parse_line_range_desc(str, &params.line_range);
+	params->command = opt->short_name;
+	ret = parse_line_range_desc(str, &params->line_range);
 
 	return ret;
 }
@@ -253,7 +253,7 @@ static int opt_show_lines(const struct option *opt,
 static int opt_show_vars(const struct option *opt,
 			 const char *str, int unset __maybe_unused)
 {
-	struct perf_probe_event *pev = &params.events[params.nevents];
+	struct perf_probe_event *pev = &params->events[params->nevents];
 	int ret;
 
 	if (!str)
@@ -264,7 +264,7 @@ static int opt_show_vars(const struct option *opt,
 		pr_err("  Error: '--vars' doesn't accept arguments.\n");
 		return -EINVAL;
 	}
-	params.command = opt->short_name;
+	params->command = opt->short_name;
 
 	return ret;
 }
@@ -276,7 +276,7 @@ static int opt_add_probe_event(const struct option *opt,
 			      const char *str, int unset __maybe_unused)
 {
 	if (str) {
-		params.command = opt->short_name;
+		params->command = opt->short_name;
 		return parse_probe_event(str);
 	}
 
@@ -287,7 +287,7 @@ static int opt_set_filter_with_command(const struct option *opt,
 				       const char *str, int unset)
 {
 	if (!unset)
-		params.command = opt->short_name;
+		params->command = opt->short_name;
 
 	if (str)
 		return params_add_filter(str);
@@ -306,20 +306,29 @@ static int opt_set_filter(const struct option *opt __maybe_unused,
 
 static int init_params(void)
 {
-	return line_range__init(&params.line_range);
+	int ret;
+
+	params = calloc(1, sizeof(*params));
+	if (!params)
+		return -ENOMEM;
+
+	ret = line_range__init(&params->line_range);
+	if (ret)
+		zfree(&params);
+	return ret;
 }
 
 static void cleanup_params(void)
 {
 	int i;
 
-	for (i = 0; i < params.nevents; i++)
-		clear_perf_probe_event(params.events + i);
-	line_range__clear(&params.line_range);
-	free(params.target);
-	strfilter__delete(params.filter);
-	nsinfo__put(params.nsi);
-	memset(&params, 0, sizeof(params));
+	for (i = 0; i < params->nevents; i++)
+		clear_perf_probe_event(params->events + i);
+	line_range__clear(&params->line_range);
+	free(params->target);
+	strfilter__delete(params->filter);
+	nsinfo__put(params->nsi);
+	zfree(&params);
 }
 
 static void pr_err_with_code(const char *msg, int err)
@@ -346,7 +355,7 @@ static int perf_add_probe_events(struct perf_probe_event *pevs, int npevs)
 	if (ret < 0)
 		goto out_cleanup;
 
-	if (params.command == 'D') {	/* it shows definition */
+	if (params->command == 'D') {	/* it shows definition */
 		if (probe_conf.bootconfig)
 			ret = show_bootconfig_events(pevs, npevs);
 		else
@@ -635,7 +644,7 @@ __cmd_probe(int argc, const char **argv)
 			usage_with_options_msg(probe_usage, options,
 				"'-' is not supported.\n");
 		}
-		if (params.command && params.command != 'a') {
+		if (params->command && params->command != 'a') {
 			usage_with_options_msg(probe_usage, options,
 				"another command except --add is set.\n");
 		}
@@ -644,7 +653,7 @@ __cmd_probe(int argc, const char **argv)
 			pr_err_with_code("  Error: Command Parse Error.", ret);
 			return ret;
 		}
-		params.command = 'a';
+		params->command = 'a';
 	}
 
 	ret = symbol__validate_sym_arguments();
@@ -664,54 +673,54 @@ __cmd_probe(int argc, const char **argv)
 	 * nor change running kernel. So if user gives offline vmlinux,
 	 * ignore its buildid.
 	 */
-	if (!strchr("lda", params.command) && symbol_conf.vmlinux_name)
+	if (!strchr("lda", params->command) && symbol_conf.vmlinux_name)
 		symbol_conf.ignore_vmlinux_buildid = true;
 
-	switch (params.command) {
+	switch (params->command) {
 	case 'l':
-		if (params.uprobes) {
+		if (params->uprobes) {
 			pr_err("  Error: Don't use --list with --exec.\n");
 			parse_options_usage(probe_usage, options, "l", true);
 			parse_options_usage(NULL, options, "x", true);
 			return -EINVAL;
 		}
-		ret = show_perf_probe_events(params.filter);
+		ret = show_perf_probe_events(params->filter);
 		if (ret < 0)
 			pr_err_with_code("  Error: Failed to show event list.", ret);
 		return ret;
 	case 'F':
-		ret = show_available_funcs(params.target, params.nsi,
-					   params.filter, params.uprobes);
+		ret = show_available_funcs(params->target, params->nsi,
+					   params->filter, params->uprobes);
 		if (ret < 0)
 			pr_err_with_code("  Error: Failed to show functions.", ret);
 		return ret;
 #ifdef HAVE_DWARF_SUPPORT
 	case 'L':
-		ret = show_line_range(&params.line_range, params.target,
-				      params.nsi, params.uprobes);
+		ret = show_line_range(&params->line_range, params->target,
+				      params->nsi, params->uprobes);
 		if (ret < 0)
 			pr_err_with_code("  Error: Failed to show lines.", ret);
 		return ret;
 	case 'V':
-		if (!params.filter)
-			params.filter = strfilter__new(DEFAULT_VAR_FILTER,
+		if (!params->filter)
+			params->filter = strfilter__new(DEFAULT_VAR_FILTER,
 						       NULL);
 
-		ret = show_available_vars(params.events, params.nevents,
-					  params.filter);
+		ret = show_available_vars(params->events, params->nevents,
+					  params->filter);
 		if (ret < 0)
 			pr_err_with_code("  Error: Failed to show vars.", ret);
 		return ret;
 #endif
 	case 'd':
-		ret = perf_del_probe_events(params.filter);
+		ret = perf_del_probe_events(params->filter);
 		if (ret < 0) {
 			pr_err_with_code("  Error: Failed to delete events.", ret);
 			return ret;
 		}
 		break;
 	case 'D':
-		if (probe_conf.bootconfig && params.uprobes) {
+		if (probe_conf.bootconfig && params->uprobes) {
 			pr_err("  Error: --bootconfig doesn't support uprobes.\n");
 			return -EINVAL;
 		}
@@ -719,25 +728,25 @@ __cmd_probe(int argc, const char **argv)
 	case 'a':
 
 		/* Ensure the last given target is used */
-		if (params.target && !params.target_used) {
+		if (params->target && !params->target_used) {
 			pr_err("  Error: -x/-m must follow the probe definitions.\n");
 			parse_options_usage(probe_usage, options, "m", true);
 			parse_options_usage(NULL, options, "x", true);
 			return -EINVAL;
 		}
 
-		ret = perf_add_probe_events(params.events, params.nevents);
+		ret = perf_add_probe_events(params->events, params->nevents);
 		if (ret < 0) {
 
 			/*
 			 * When perf_add_probe_events() fails it calls
 			 * cleanup_perf_probe_events(pevs, npevs), i.e.
-			 * cleanup_perf_probe_events(params.events, params.nevents), which
+			 * cleanup_perf_probe_events(params->events, params->nevents), which
 			 * will call clear_perf_probe_event(), so set nevents to zero
 			 * to avoid cleanup_params() to call clear_perf_probe_event() again
 			 * on the same pevs.
 			 */
-			params.nevents = 0;
+			params->nevents = 0;
 			pr_err_with_code("  Error: Failed to add events.", ret);
 			return ret;
 		}
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 12/16] perf path: Make mkpath thread safe
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (10 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 11/16] perf probe: Dynamically allocate params memory Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 13/16] perf scripting-engines: Move static to local variable Ian Rogers
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Avoid 4 static arrays for paths, pass in a char[] buffer to use. Makes
mkpath thread safe for the small number of users. Also removes 16,384
bytes from .bss.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-config.c |  4 +++-
 tools/perf/builtin-help.c   |  4 +++-
 tools/perf/util/cache.h     |  2 +-
 tools/perf/util/config.c    |  3 ++-
 tools/perf/util/path.c      | 35 +++++------------------------------
 5 files changed, 14 insertions(+), 34 deletions(-)

diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index 2603015f98be..2e8363778935 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -12,6 +12,7 @@
 #include "util/debug.h"
 #include "util/config.h"
 #include <linux/string.h>
+#include <limits.h>
 #include <stdio.h>
 #include <stdlib.h>
 
@@ -157,7 +158,8 @@ int cmd_config(int argc, const char **argv)
 {
 	int i, ret = -1;
 	struct perf_config_set *set;
-	char *user_config = mkpath("%s/.perfconfig", getenv("HOME"));
+	char path[PATH_MAX];
+	char *user_config = mkpath(path, sizeof(path), "%s/.perfconfig", getenv("HOME"));
 	const char *config_filename;
 	bool changed = false;
 
diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index 3e7f52054fac..b2a368ae295a 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -19,6 +19,7 @@
 #include <linux/string.h>
 #include <linux/zalloc.h>
 #include <errno.h>
+#include <limits.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -389,9 +390,10 @@ static int get_html_page_path(char **page_path, const char *page)
 {
 	struct stat st;
 	const char *html_path = system_path(PERF_HTML_PATH);
+	char path[PATH_MAX];
 
 	/* Check that we have a perf documentation directory. */
-	if (stat(mkpath("%s/perf.html", html_path), &st)
+	if (stat(mkpath(path, sizeof(path), "%s/perf.html", html_path), &st)
 	    || !S_ISREG(st.st_mode)) {
 		pr_err("'%s': not a documentation directory.", html_path);
 		return -1;
diff --git a/tools/perf/util/cache.h b/tools/perf/util/cache.h
index 9f2e36ef5072..0b61840d4226 100644
--- a/tools/perf/util/cache.h
+++ b/tools/perf/util/cache.h
@@ -26,6 +26,6 @@ static inline int is_absolute_path(const char *path)
 	return path[0] == '/';
 }
 
-char *mkpath(const char *fmt, ...) __printf(1, 2);
+char *mkpath(char *path_buf, size_t sz, const char *fmt, ...) __printf(3, 4);
 
 #endif /* __PERF_CACHE_H */
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 658170b8dcef..f340dc73db6d 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -543,6 +543,7 @@ static char *home_perfconfig(void)
 	const char *home = NULL;
 	char *config;
 	struct stat st;
+	char path[PATH_MAX];
 
 	home = getenv("HOME");
 
@@ -554,7 +555,7 @@ static char *home_perfconfig(void)
 	if (!home || !*home || !perf_config_global())
 		return NULL;
 
-	config = strdup(mkpath("%s/.perfconfig", home));
+	config = strdup(mkpath(path, sizeof(path), "%s/.perfconfig", home));
 	if (config == NULL) {
 		pr_warning("Not enough memory to process %s/.perfconfig, ignoring it.\n", home);
 		return NULL;
diff --git a/tools/perf/util/path.c b/tools/perf/util/path.c
index ce80b79be103..00adf872bf00 100644
--- a/tools/perf/util/path.c
+++ b/tools/perf/util/path.c
@@ -1,16 +1,4 @@
 // SPDX-License-Identifier: GPL-2.0
-/*
- * I'm tired of doing "vsnprintf()" etc just to open a
- * file, so here's a "return static buffer with printf"
- * interface for paths.
- *
- * It's obviously not thread-safe. Sue me. But it's quite
- * useful for doing things like
- *
- *   f = open(mkpath("%s/%s.perf", base, name), O_RDONLY);
- *
- * which is what it's designed for.
- */
 #include "path.h"
 #include "cache.h"
 #include <linux/kernel.h>
@@ -22,18 +10,6 @@
 #include <dirent.h>
 #include <unistd.h>
 
-static char bad_path[] = "/bad-path/";
-/*
- * One hack:
- */
-static char *get_pathname(void)
-{
-	static char pathname_array[4][PATH_MAX];
-	static int idx;
-
-	return pathname_array[3 & ++idx];
-}
-
 static char *cleanup_path(char *path)
 {
 	/* Clean it up */
@@ -45,18 +21,17 @@ static char *cleanup_path(char *path)
 	return path;
 }
 
-char *mkpath(const char *fmt, ...)
+char *mkpath(char *path_buf, size_t sz, const char *fmt, ...)
 {
 	va_list args;
 	unsigned len;
-	char *pathname = get_pathname();
 
 	va_start(args, fmt);
-	len = vsnprintf(pathname, PATH_MAX, fmt, args);
+	len = vsnprintf(path_buf, sz, fmt, args);
 	va_end(args);
-	if (len >= PATH_MAX)
-		return bad_path;
-	return cleanup_path(pathname);
+	if (len >= sz)
+		strncpy(path_buf, "/bad-path/", sz);
+	return cleanup_path(path_buf);
 }
 
 int path__join(char *bf, size_t size, const char *path1, const char *path2)
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 13/16] perf scripting-engines: Move static to local variable
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (11 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 12/16] perf path: Make mkpath thread safe Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 14/16] tools api fs: Dynamically allocate cgroupfs mount point cache Ian Rogers
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Avoid 16,384 bytes in .bss by stack allocating two bitmaps.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/scripting-engines/trace-event-perl.c   | 4 ++--
 tools/perf/util/scripting-engines/trace-event-python.c | 5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index 039d0365ad41..65b761d83a1f 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -67,8 +67,6 @@ INTERP my_perl;
 #define TRACE_EVENT_TYPE_MAX				\
 	((1 << (sizeof(unsigned short) * 8)) - 1)
 
-static DECLARE_BITMAP(events_defined, TRACE_EVENT_TYPE_MAX);
-
 extern struct scripting_context *scripting_context;
 
 static char *cur_field_name;
@@ -353,7 +351,9 @@ static void perl_process_tracepoint(struct perf_sample *sample,
 	void *data = sample->raw_data;
 	unsigned long long nsecs = sample->time;
 	const char *comm = thread__comm_str(thread);
+	DECLARE_BITMAP(events_defined, TRACE_EVENT_TYPE_MAX);
 
+	bitmap_zero(events_defined, TRACE_EVENT_TYPE_MAX);
 	dSP;
 
 	if (evsel->core.attr.type != PERF_TYPE_TRACEPOINT)
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 41d4f9e6a8b7..40964078f92f 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -93,8 +93,6 @@ PyMODINIT_FUNC PyInit_perf_trace_context(void);
 #define TRACE_EVENT_TYPE_MAX				\
 	((1 << (sizeof(unsigned short) * 8)) - 1)
 
-static DECLARE_BITMAP(events_defined, TRACE_EVENT_TYPE_MAX);
-
 #define N_COMMON_FIELDS	7
 
 static char *cur_field_name;
@@ -934,6 +932,9 @@ static void python_process_tracepoint(struct perf_sample *sample,
 	unsigned long long nsecs = sample->time;
 	const char *comm = thread__comm_str(al->thread);
 	const char *default_handler_name = "trace_unhandled";
+	DECLARE_BITMAP(events_defined, TRACE_EVENT_TYPE_MAX);
+
+	bitmap_zero(events_defined, TRACE_EVENT_TYPE_MAX);
 
 	if (!event) {
 		snprintf(handler_name, sizeof(handler_name),
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 14/16] tools api fs: Dynamically allocate cgroupfs mount point cache
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (12 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 13/16] perf scripting-engines: Move static to local variable Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 15/16] perf test pmu: Avoid 2 static path arrays Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 16/16] libsubcmd: Avoid two path statics Ian Rogers
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Move the cgroupfs_cache_entry 4128 byte array out of .bss.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/api/fs/cgroup.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/tools/lib/api/fs/cgroup.c b/tools/lib/api/fs/cgroup.c
index 1573dae4259d..250629a09423 100644
--- a/tools/lib/api/fs/cgroup.c
+++ b/tools/lib/api/fs/cgroup.c
@@ -14,7 +14,7 @@ struct cgroupfs_cache_entry {
 };
 
 /* just cache last used one */
-static struct cgroupfs_cache_entry cached;
+static struct cgroupfs_cache_entry *cached;
 
 int cgroupfs_find_mountpoint(char *buf, size_t maxlen, const char *subsys)
 {
@@ -24,9 +24,9 @@ int cgroupfs_find_mountpoint(char *buf, size_t maxlen, const char *subsys)
 	char *p, *path;
 	char mountpoint[PATH_MAX];
 
-	if (!strcmp(cached.subsys, subsys)) {
-		if (strlen(cached.mountpoint) < maxlen) {
-			strcpy(buf, cached.mountpoint);
+	if (cached && !strcmp(cached->subsys, subsys)) {
+		if (strlen(cached->mountpoint) < maxlen) {
+			strcpy(buf, cached->mountpoint);
 			return 0;
 		}
 		return -1;
@@ -91,8 +91,13 @@ int cgroupfs_find_mountpoint(char *buf, size_t maxlen, const char *subsys)
 	free(line);
 	fclose(fp);
 
-	strncpy(cached.subsys, subsys, sizeof(cached.subsys) - 1);
-	strcpy(cached.mountpoint, mountpoint);
+	if (!cached)
+		cached = calloc(1, sizeof(*cached));
+
+	if (cached) {
+		strncpy(cached->subsys, subsys, sizeof(cached->subsys) - 1);
+		strcpy(cached->mountpoint, mountpoint);
+	}
 
 	if (mountpoint[0] && strlen(mountpoint) < maxlen) {
 		strcpy(buf, mountpoint);
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 15/16] perf test pmu: Avoid 2 static path arrays
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (13 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 14/16] tools api fs: Dynamically allocate cgroupfs mount point cache Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  2023-05-25  7:11 ` [PATCH v1 16/16] libsubcmd: Avoid two path statics Ian Rogers
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Avoid two static paths that contributed 8,192 bytes to .bss are only
used duing the perf parse pmu test. This change helps FORTIFY
triggering 2 warnings like:

```
tests/pmu.c: In function ‘test__pmu’:
tests/pmu.c:121:43: error: ‘%s’ directive output may be truncated writing up to 4095 bytes into a region of size 4090 [-Werror=format-truncation=]
  121 |         snprintf(buf, sizeof(buf), "rm -f %s/*\n", dir);
```

So make buf a little larger.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/pmu.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/tools/perf/tests/pmu.c b/tools/perf/tests/pmu.c
index 3cf25f883df7..a4452639a3d4 100644
--- a/tools/perf/tests/pmu.c
+++ b/tools/perf/tests/pmu.c
@@ -86,17 +86,16 @@ static struct parse_events_term test_terms[] = {
  * Prepare format directory data, exported by kernel
  * at /sys/bus/event_source/devices/<dev>/format.
  */
-static char *test_format_dir_get(void)
+static char *test_format_dir_get(char *dir, size_t sz)
 {
-	static char dir[PATH_MAX];
 	unsigned int i;
 
-	snprintf(dir, PATH_MAX, "/tmp/perf-pmu-test-format-XXXXXX");
+	snprintf(dir, sz, "/tmp/perf-pmu-test-format-XXXXXX");
 	if (!mkdtemp(dir))
 		return NULL;
 
 	for (i = 0; i < ARRAY_SIZE(test_formats); i++) {
-		static char name[PATH_MAX];
+		char name[PATH_MAX];
 		struct test_format *format = &test_formats[i];
 		FILE *file;
 
@@ -118,12 +117,13 @@ static char *test_format_dir_get(void)
 /* Cleanup format directory. */
 static int test_format_dir_put(char *dir)
 {
-	char buf[PATH_MAX];
-	snprintf(buf, PATH_MAX, "rm -f %s/*\n", dir);
+	char buf[PATH_MAX + 20];
+
+	snprintf(buf, sizeof(buf), "rm -f %s/*\n", dir);
 	if (system(buf))
 		return -1;
 
-	snprintf(buf, PATH_MAX, "rmdir %s\n", dir);
+	snprintf(buf, sizeof(buf), "rmdir %s\n", dir);
 	return system(buf);
 }
 
@@ -140,7 +140,8 @@ static struct list_head *test_terms_list(void)
 
 static int test__pmu(struct test_suite *test __maybe_unused, int subtest __maybe_unused)
 {
-	char *format = test_format_dir_get();
+	char dir[PATH_MAX];
+	char *format = test_format_dir_get(dir, sizeof(dir));
 	LIST_HEAD(formats);
 	struct list_head *terms = test_terms_list();
 	int ret;
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v1 16/16] libsubcmd: Avoid two path statics
  2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
                   ` (14 preceding siblings ...)
  2023-05-25  7:11 ` [PATCH v1 15/16] perf test pmu: Avoid 2 static path arrays Ian Rogers
@ 2023-05-25  7:11 ` Ian Rogers
  15 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-25  7:11 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Masami Hiramatsu (Google),
	Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

Use a single stack allocated buffer and avoid 8,192 bytes in .bss.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/subcmd/exec-cmd.c | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/tools/lib/subcmd/exec-cmd.c b/tools/lib/subcmd/exec-cmd.c
index 5dbea456973e..7739b5217cf6 100644
--- a/tools/lib/subcmd/exec-cmd.c
+++ b/tools/lib/subcmd/exec-cmd.c
@@ -36,38 +36,40 @@ static int is_absolute_path(const char *path)
 	return path[0] == '/';
 }
 
-static const char *get_pwd_cwd(void)
+static const char *get_pwd_cwd(char *buf, size_t sz)
 {
-	static char cwd[PATH_MAX + 1];
 	char *pwd;
 	struct stat cwd_stat, pwd_stat;
-	if (getcwd(cwd, PATH_MAX) == NULL)
+	if (getcwd(buf, sz) == NULL)
 		return NULL;
 	pwd = getenv("PWD");
-	if (pwd && strcmp(pwd, cwd)) {
-		stat(cwd, &cwd_stat);
+	if (pwd && strcmp(pwd, buf)) {
+		stat(buf, &cwd_stat);
 		if (!stat(pwd, &pwd_stat) &&
 		    pwd_stat.st_dev == cwd_stat.st_dev &&
 		    pwd_stat.st_ino == cwd_stat.st_ino) {
-			strlcpy(cwd, pwd, PATH_MAX);
+			strlcpy(buf, pwd, sz);
 		}
 	}
-	return cwd;
+	return buf;
 }
 
-static const char *make_nonrelative_path(const char *path)
+static const char *make_nonrelative_path(char *buf, size_t sz, const char *path)
 {
-	static char buf[PATH_MAX + 1];
-
 	if (is_absolute_path(path)) {
-		if (strlcpy(buf, path, PATH_MAX) >= PATH_MAX)
+		if (strlcpy(buf, path, sz) >= sz)
 			die("Too long path: %.*s", 60, path);
 	} else {
-		const char *cwd = get_pwd_cwd();
+		const char *cwd = get_pwd_cwd(buf, sz);
+
 		if (!cwd)
 			die("Cannot determine the current working directory");
-		if (snprintf(buf, PATH_MAX, "%s/%s", cwd, path) >= PATH_MAX)
+
+		if (strlen(cwd) + strlen(path) + 2 >= sz)
 			die("Too long path: %.*s", 60, path);
+
+		strcat(buf, "/");
+		strcat(buf, path);
 	}
 	return buf;
 }
@@ -133,8 +135,11 @@ static void add_path(char **out, const char *path)
 	if (path && *path) {
 		if (is_absolute_path(path))
 			astrcat(out, path);
-		else
-			astrcat(out, make_nonrelative_path(path));
+		else {
+			char buf[PATH_MAX];
+
+			astrcat(out, make_nonrelative_path(buf, sizeof(buf), path));
+		}
 
 		astrcat(out, ":");
 	}
-- 
2.40.1.698.g37aff9b760-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v1 01/16] perf header: Make nodes dynamic in write_mem_topology
  2023-05-25  7:11 ` [PATCH v1 01/16] perf header: Make nodes dynamic in write_mem_topology Ian Rogers
@ 2023-05-25 19:14   ` Namhyung Kim
  2023-05-26  2:38     ` Ian Rogers
  0 siblings, 1 reply; 22+ messages in thread
From: Namhyung Kim @ 2023-05-25 19:14 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Masami Hiramatsu (Google), Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

On Thu, May 25, 2023 at 12:12 AM Ian Rogers <irogers@google.com> wrote:
>
> Avoid a large static array, dynamically allocate the nodes avoiding a
> hard coded limited as well.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/util/header.c | 33 +++++++++++++++++++++------------
>  1 file changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index 2dde3ca20de5..80593ed8c79b 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -24,6 +24,7 @@
>  #include <bpf/libbpf.h>
>  #endif
>  #include <perf/cpumap.h>
> +#include <tools/libc_compat.h> // reallocarray
>
>  #include "dso.h"
>  #include "evlist.h"
> @@ -1396,13 +1397,14 @@ static int memory_node__sort(const void *a, const void *b)
>         return na->node - nb->node;
>  }
>
> -static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
> +static int build_mem_topology(struct memory_node **nodesp, u64 *cntp)
>  {
>         char path[PATH_MAX];
>         struct dirent *ent;
>         DIR *dir;
> -       u64 cnt = 0;
>         int ret = 0;
> +       size_t cnt = 0, size = 0;
> +       struct memory_node *nodes = NULL;
>
>         scnprintf(path, PATH_MAX, "%s/devices/system/node/",
>                   sysfs__mountpoint());
> @@ -1426,16 +1428,24 @@ static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
>                 if (r != 1)
>                         continue;
>
> -               if (WARN_ONCE(cnt >= size,
> -                       "failed to write MEM_TOPOLOGY, way too many nodes\n")) {
> -                       closedir(dir);
> -                       return -1;
> -               }
> +               if (cnt >= size) {
> +                       struct memory_node *new_nodes =
> +                               reallocarray(nodes, cnt + 4, sizeof(*nodes));
>
> +                       if (!new_nodes) {
> +                               pr_err("Failed to write MEM_TOPOLOGY, size %zd nodes\n", size);
> +                               free(nodes);
> +                               closedir(dir);
> +                               return -ENOMEM;
> +                       }
> +                       nodes = new_nodes;
> +                       size += 4;
> +               }
>                 ret = memory_node__read(&nodes[cnt++], idx);

I think you need to handle error cases here.

Thanks,
Namhyung


>         }
>
>         *cntp = cnt;
> +       *nodesp = nodes;
>         closedir(dir);
>
>         if (!ret)
> @@ -1444,8 +1454,6 @@ static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
>         return ret;
>  }
>
> -#define MAX_MEMORY_NODES 2000
> -
>  /*
>   * The MEM_TOPOLOGY holds physical memory map for every
>   * node in system. The format of data is as follows:
> @@ -1464,8 +1472,8 @@ static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
>  static int write_mem_topology(struct feat_fd *ff __maybe_unused,
>                               struct evlist *evlist __maybe_unused)
>  {
> -       static struct memory_node nodes[MAX_MEMORY_NODES];
> -       u64 bsize, version = 1, i, nr;
> +       struct memory_node *nodes = NULL;
> +       u64 bsize, version = 1, i, nr = 0;
>         int ret;
>
>         ret = sysfs__read_xll("devices/system/memory/block_size_bytes",
> @@ -1473,7 +1481,7 @@ static int write_mem_topology(struct feat_fd *ff __maybe_unused,
>         if (ret)
>                 return ret;
>
> -       ret = build_mem_topology(&nodes[0], MAX_MEMORY_NODES, &nr);
> +       ret = build_mem_topology(&nodes, &nr);
>         if (ret)
>                 return ret;
>
> @@ -1508,6 +1516,7 @@ static int write_mem_topology(struct feat_fd *ff __maybe_unused,
>         }
>
>  out:
> +       free(nodes);
>         return ret;
>  }
>
> --
> 2.40.1.698.g37aff9b760-goog
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v1 08/16] perf daemon: Dynamically allocate path to perf
  2023-05-25  7:11 ` [PATCH v1 08/16] perf daemon: Dynamically allocate path to perf Ian Rogers
@ 2023-05-25 19:17   ` Namhyung Kim
  2023-05-26  2:40     ` Ian Rogers
  0 siblings, 1 reply; 22+ messages in thread
From: Namhyung Kim @ 2023-05-25 19:17 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Masami Hiramatsu (Google), Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

On Thu, May 25, 2023 at 12:12 AM Ian Rogers <irogers@google.com> wrote:
>
> Avoid a PATH_MAX array in __daemon (the .data section) by dynamically
> allocating the memory.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/builtin-daemon.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/builtin-daemon.c b/tools/perf/builtin-daemon.c
> index 34cbe3e959aa..adb5751c3ed0 100644
> --- a/tools/perf/builtin-daemon.c
> +++ b/tools/perf/builtin-daemon.c
> @@ -90,7 +90,7 @@ struct daemon {
>         char                    *base;
>         struct list_head         sessions;
>         FILE                    *out;
> -       char                     perf[PATH_MAX];
> +       char                    *perf;
>         int                      signal_fd;
>         time_t                   start;
>  };
> @@ -1490,6 +1490,15 @@ static int __cmd_ping(struct daemon *daemon, struct option parent_options[],
>         return send_cmd(daemon, &cmd);
>  }
>
> +static char *set_perf_exe(void)
> +{
> +       char path[PATH_MAX];
> +
> +       perf_exe(path, sizeof(path));
> +       __daemon.perf = strdup(path);
> +       return __daemon.perf;

Then you need to free it somewhere.

Thanks,
Namhyung


> +}
> +
>  int cmd_daemon(int argc, const char **argv)
>  {
>         struct option daemon_options[] = {
> @@ -1503,7 +1512,9 @@ int cmd_daemon(int argc, const char **argv)
>                 OPT_END()
>         };
>
> -       perf_exe(__daemon.perf, sizeof(__daemon.perf));
> +       if (!set_perf_exe())
> +               return -ENOMEM;
> +
>         __daemon.out = stdout;
>
>         argc = parse_options(argc, argv, daemon_options, daemon_usage,
> --
> 2.40.1.698.g37aff9b760-goog
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v1 01/16] perf header: Make nodes dynamic in write_mem_topology
  2023-05-25 19:14   ` Namhyung Kim
@ 2023-05-26  2:38     ` Ian Rogers
  2023-05-26  6:00       ` Namhyung Kim
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Rogers @ 2023-05-26  2:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Masami Hiramatsu (Google), Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

On Thu, May 25, 2023 at 12:15 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Thu, May 25, 2023 at 12:12 AM Ian Rogers <irogers@google.com> wrote:
> >
> > Avoid a large static array, dynamically allocate the nodes avoiding a
> > hard coded limited as well.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/util/header.c | 33 +++++++++++++++++++++------------
> >  1 file changed, 21 insertions(+), 12 deletions(-)
> >
> > diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> > index 2dde3ca20de5..80593ed8c79b 100644
> > --- a/tools/perf/util/header.c
> > +++ b/tools/perf/util/header.c
> > @@ -24,6 +24,7 @@
> >  #include <bpf/libbpf.h>
> >  #endif
> >  #include <perf/cpumap.h>
> > +#include <tools/libc_compat.h> // reallocarray
> >
> >  #include "dso.h"
> >  #include "evlist.h"
> > @@ -1396,13 +1397,14 @@ static int memory_node__sort(const void *a, const void *b)
> >         return na->node - nb->node;
> >  }
> >
> > -static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
> > +static int build_mem_topology(struct memory_node **nodesp, u64 *cntp)
> >  {
> >         char path[PATH_MAX];
> >         struct dirent *ent;
> >         DIR *dir;
> > -       u64 cnt = 0;
> >         int ret = 0;
> > +       size_t cnt = 0, size = 0;
> > +       struct memory_node *nodes = NULL;
> >
> >         scnprintf(path, PATH_MAX, "%s/devices/system/node/",
> >                   sysfs__mountpoint());
> > @@ -1426,16 +1428,24 @@ static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
> >                 if (r != 1)
> >                         continue;
> >
> > -               if (WARN_ONCE(cnt >= size,
> > -                       "failed to write MEM_TOPOLOGY, way too many nodes\n")) {
> > -                       closedir(dir);
> > -                       return -1;
> > -               }
> > +               if (cnt >= size) {
> > +                       struct memory_node *new_nodes =
> > +                               reallocarray(nodes, cnt + 4, sizeof(*nodes));
> >
> > +                       if (!new_nodes) {
> > +                               pr_err("Failed to write MEM_TOPOLOGY, size %zd nodes\n", size);
> > +                               free(nodes);
> > +                               closedir(dir);
> > +                               return -ENOMEM;
> > +                       }
> > +                       nodes = new_nodes;
> > +                       size += 4;
> > +               }
> >                 ret = memory_node__read(&nodes[cnt++], idx);
>
> I think you need to handle error cases here.
>
> Thanks,
> Namhyung

Not sure I follow. The reallocarray tests for failure, frees nodes and
returns -ENOMEM on error.

Thanks,
Ian

>
> >         }
> >
> >         *cntp = cnt;
> > +       *nodesp = nodes;
> >         closedir(dir);
> >
> >         if (!ret)
> > @@ -1444,8 +1454,6 @@ static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
> >         return ret;
> >  }
> >
> > -#define MAX_MEMORY_NODES 2000
> > -
> >  /*
> >   * The MEM_TOPOLOGY holds physical memory map for every
> >   * node in system. The format of data is as follows:
> > @@ -1464,8 +1472,8 @@ static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
> >  static int write_mem_topology(struct feat_fd *ff __maybe_unused,
> >                               struct evlist *evlist __maybe_unused)
> >  {
> > -       static struct memory_node nodes[MAX_MEMORY_NODES];
> > -       u64 bsize, version = 1, i, nr;
> > +       struct memory_node *nodes = NULL;
> > +       u64 bsize, version = 1, i, nr = 0;
> >         int ret;
> >
> >         ret = sysfs__read_xll("devices/system/memory/block_size_bytes",
> > @@ -1473,7 +1481,7 @@ static int write_mem_topology(struct feat_fd *ff __maybe_unused,
> >         if (ret)
> >                 return ret;
> >
> > -       ret = build_mem_topology(&nodes[0], MAX_MEMORY_NODES, &nr);
> > +       ret = build_mem_topology(&nodes, &nr);
> >         if (ret)
> >                 return ret;
> >
> > @@ -1508,6 +1516,7 @@ static int write_mem_topology(struct feat_fd *ff __maybe_unused,
> >         }
> >
> >  out:
> > +       free(nodes);
> >         return ret;
> >  }
> >
> > --
> > 2.40.1.698.g37aff9b760-goog
> >

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v1 08/16] perf daemon: Dynamically allocate path to perf
  2023-05-25 19:17   ` Namhyung Kim
@ 2023-05-26  2:40     ` Ian Rogers
  0 siblings, 0 replies; 22+ messages in thread
From: Ian Rogers @ 2023-05-26  2:40 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Masami Hiramatsu (Google), Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

On Thu, May 25, 2023 at 12:17 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Thu, May 25, 2023 at 12:12 AM Ian Rogers <irogers@google.com> wrote:
> >
> > Avoid a PATH_MAX array in __daemon (the .data section) by dynamically
> > allocating the memory.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/builtin-daemon.c | 15 +++++++++++++--
> >  1 file changed, 13 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/builtin-daemon.c b/tools/perf/builtin-daemon.c
> > index 34cbe3e959aa..adb5751c3ed0 100644
> > --- a/tools/perf/builtin-daemon.c
> > +++ b/tools/perf/builtin-daemon.c
> > @@ -90,7 +90,7 @@ struct daemon {
> >         char                    *base;
> >         struct list_head         sessions;
> >         FILE                    *out;
> > -       char                     perf[PATH_MAX];
> > +       char                    *perf;
> >         int                      signal_fd;
> >         time_t                   start;
> >  };
> > @@ -1490,6 +1490,15 @@ static int __cmd_ping(struct daemon *daemon, struct option parent_options[],
> >         return send_cmd(daemon, &cmd);
> >  }
> >
> > +static char *set_perf_exe(void)
> > +{
> > +       char path[PATH_MAX];
> > +
> > +       perf_exe(path, sizeof(path));
> > +       __daemon.perf = strdup(path);
> > +       return __daemon.perf;
>
> Then you need to free it somewhere.

For leak sanitizer, if an allocation is reachable from a global then
it isn't considered a leak. I'll address this in v2.

Thanks,
Ian

> Thanks,
> Namhyung
>
>
> > +}
> > +
> >  int cmd_daemon(int argc, const char **argv)
> >  {
> >         struct option daemon_options[] = {
> > @@ -1503,7 +1512,9 @@ int cmd_daemon(int argc, const char **argv)
> >                 OPT_END()
> >         };
> >
> > -       perf_exe(__daemon.perf, sizeof(__daemon.perf));
> > +       if (!set_perf_exe())
> > +               return -ENOMEM;
> > +
> >         __daemon.out = stdout;
> >
> >         argc = parse_options(argc, argv, daemon_options, daemon_usage,
> > --
> > 2.40.1.698.g37aff9b760-goog
> >

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v1 01/16] perf header: Make nodes dynamic in write_mem_topology
  2023-05-26  2:38     ` Ian Rogers
@ 2023-05-26  6:00       ` Namhyung Kim
  0 siblings, 0 replies; 22+ messages in thread
From: Namhyung Kim @ 2023-05-26  6:00 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Masami Hiramatsu (Google), Steven Rostedt (Google),
	Ross Zwisler, Leo Yan, Tiezhu Yang, Yang Jihong, Andi Kleen,
	Kan Liang, Ravi Bangoria, Sean Christopherson, K Prateek Nayak,
	Paolo Bonzini, linux-kernel, linux-perf-users

On Thu, May 25, 2023 at 7:39 PM Ian Rogers <irogers@google.com> wrote:
>
> On Thu, May 25, 2023 at 12:15 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Thu, May 25, 2023 at 12:12 AM Ian Rogers <irogers@google.com> wrote:
> > >
> > > Avoid a large static array, dynamically allocate the nodes avoiding a
> > > hard coded limited as well.
> > >
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > ---
> > >  tools/perf/util/header.c | 33 +++++++++++++++++++++------------
> > >  1 file changed, 21 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> > > index 2dde3ca20de5..80593ed8c79b 100644
> > > --- a/tools/perf/util/header.c
> > > +++ b/tools/perf/util/header.c
> > > @@ -24,6 +24,7 @@
> > >  #include <bpf/libbpf.h>
> > >  #endif
> > >  #include <perf/cpumap.h>
> > > +#include <tools/libc_compat.h> // reallocarray
> > >
> > >  #include "dso.h"
> > >  #include "evlist.h"
> > > @@ -1396,13 +1397,14 @@ static int memory_node__sort(const void *a, const void *b)
> > >         return na->node - nb->node;
> > >  }
> > >
> > > -static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
> > > +static int build_mem_topology(struct memory_node **nodesp, u64 *cntp)
> > >  {
> > >         char path[PATH_MAX];
> > >         struct dirent *ent;
> > >         DIR *dir;
> > > -       u64 cnt = 0;
> > >         int ret = 0;
> > > +       size_t cnt = 0, size = 0;
> > > +       struct memory_node *nodes = NULL;
> > >
> > >         scnprintf(path, PATH_MAX, "%s/devices/system/node/",
> > >                   sysfs__mountpoint());
> > > @@ -1426,16 +1428,24 @@ static int build_mem_topology(struct memory_node *nodes, u64 size, u64 *cntp)
> > >                 if (r != 1)
> > >                         continue;
> > >
> > > -               if (WARN_ONCE(cnt >= size,
> > > -                       "failed to write MEM_TOPOLOGY, way too many nodes\n")) {
> > > -                       closedir(dir);
> > > -                       return -1;
> > > -               }
> > > +               if (cnt >= size) {
> > > +                       struct memory_node *new_nodes =
> > > +                               reallocarray(nodes, cnt + 4, sizeof(*nodes));
> > >
> > > +                       if (!new_nodes) {
> > > +                               pr_err("Failed to write MEM_TOPOLOGY, size %zd nodes\n", size);
> > > +                               free(nodes);
> > > +                               closedir(dir);
> > > +                               return -ENOMEM;
> > > +                       }
> > > +                       nodes = new_nodes;
> > > +                       size += 4;
> > > +               }
> > >                 ret = memory_node__read(&nodes[cnt++], idx);
> >
> > I think you need to handle error cases here.
> >
> > Thanks,
> > Namhyung
>
> Not sure I follow. The reallocarray tests for failure, frees nodes and
> returns -ENOMEM on error.

Right, but I think it would leak the nodes when the
memory_node__read() fails.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-05-26  6:01 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-25  7:11 [PATCH v1 00/16] Address some perf memory/data size issues Ian Rogers
2023-05-25  7:11 ` [PATCH v1 01/16] perf header: Make nodes dynamic in write_mem_topology Ian Rogers
2023-05-25 19:14   ` Namhyung Kim
2023-05-26  2:38     ` Ian Rogers
2023-05-26  6:00       ` Namhyung Kim
2023-05-25  7:11 ` [PATCH v1 02/16] perf test x86: insn-x86 test data is immutable so mark it const Ian Rogers
2023-05-25  7:11 ` [PATCH v1 03/16] perf test x86: intel-pt-test " Ian Rogers
2023-05-25  7:11 ` [PATCH v1 04/16] perf trace: Make some large static arrays const Ian Rogers
2023-05-25  7:11 ` [PATCH v1 05/16] perf trace beauty: Make MSR " Ian Rogers
2023-05-25  7:11 ` [PATCH v1 06/16] tools api fs: Avoid large static PATH_MAX arrays Ian Rogers
2023-05-25  7:11 ` [PATCH v1 07/16] tools lib api fs tracing_path: Remove two unused MAX_PATH paths Ian Rogers
2023-05-25  7:11 ` [PATCH v1 08/16] perf daemon: Dynamically allocate path to perf Ian Rogers
2023-05-25 19:17   ` Namhyung Kim
2023-05-26  2:40     ` Ian Rogers
2023-05-25  7:11 ` [PATCH v1 09/16] perf lock: Dynamically allocate lockhash_table Ian Rogers
2023-05-25  7:11 ` [PATCH v1 10/16] perf timechart: Make large arrays dynamic Ian Rogers
2023-05-25  7:11 ` [PATCH v1 11/16] perf probe: Dynamically allocate params memory Ian Rogers
2023-05-25  7:11 ` [PATCH v1 12/16] perf path: Make mkpath thread safe Ian Rogers
2023-05-25  7:11 ` [PATCH v1 13/16] perf scripting-engines: Move static to local variable Ian Rogers
2023-05-25  7:11 ` [PATCH v1 14/16] tools api fs: Dynamically allocate cgroupfs mount point cache Ian Rogers
2023-05-25  7:11 ` [PATCH v1 15/16] perf test pmu: Avoid 2 static path arrays Ian Rogers
2023-05-25  7:11 ` [PATCH v1 16/16] libsubcmd: Avoid two path statics Ian Rogers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).