linux-kernel.vger.kernel.org archive mirror
* [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4)
@ 2022-06-15 16:32 Namhyung Kim
  2022-06-15 16:32 ` [PATCH 1/7] perf lock: Print wait times with unit Namhyung Kim
                   ` (8 more replies)
  0 siblings, 9 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-06-15 16:32 UTC
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

Hello,

Kernel v5.19 will have a new set of tracepoints to track lock
contention for various lock types.  Unlike the tracepoints in LOCKDEP
and LOCK_STAT, they are hit only for contended locks, and lock names
are not available.  So perf needs to collect stack traces and display
the caller function instead.

Changes in v4)
 * add Acked-by from Ian
 * more comments on trace_lock_handler
 * don't create stats in the contention_end handler
 
Changes in v3)
 * fix build error
 * support data from different kernels/machines
 * skip bad stat unless there's actual bad ones
 
Changes in v2)
 * add Acked-by from Ian
 * print time with a unit for compact output
 * add some comments  (Ian)
 * remove already applied patch
 
This patchset merely adds support for the new tracepoints to the
existing perf lock commands, so there's no change for the user.  Later
I'll add a new sub-command dedicated to the tracepoints to make use of
the additional information.

Example output:

  $ sudo perf lock record -a sleep 3

  $ perf lock report -F acquired,contended,avg_wait,wait_total

                  Name   acquired  contended     avg wait    total wait

   update_blocked_a...         40         40      3.61 us     144.45 us
   kernfs_fop_open+...          5          5      3.64 us      18.18 us
    _nohz_idle_balance          3          3      2.65 us       7.95 us
   tick_do_update_j...          1          1      6.04 us       6.04 us
    ep_scan_ready_list          1          1      3.93 us       3.93 us
  ...

You can find the code in the 'perf/lock-contention-v4' branch at

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (7):
  perf lock: Print wait times with unit
  perf lock: Allow to use different kernel symbols
  perf lock: Skip print_bad_events() if nothing bad
  perf lock: Add lock contention tracepoints record support
  perf lock: Handle lock contention tracepoints
  perf record: Allow to specify max stack depth of fp callchain
  perf lock: Look up callchain for the contended locks

 tools/perf/Documentation/perf-lock.txt   |   7 +
 tools/perf/Documentation/perf-record.txt |   5 +
 tools/perf/builtin-lock.c                | 426 ++++++++++++++++++++++-
 tools/perf/util/callchain.c              |  18 +-
 4 files changed, 434 insertions(+), 22 deletions(-)


base-commit: 9886142c7a2226439c1e3f7d9b69f9c7094c3ef6
-- 
2.36.1.476.g0c4daa206d-goog



* [PATCH 1/7] perf lock: Print wait times with unit
  2022-06-15 16:32 [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
@ 2022-06-15 16:32 ` Namhyung Kim
  2022-06-15 16:32 ` [PATCH 2/7] perf lock: Allow to use different kernel symbols Namhyung Kim
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-06-15 16:32 UTC
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

Currently it only prints the time in nsec, which is a bit hard to read
and takes more space on the screen.  We can change it to use different
units and keep the numbers small to save space.

Before:
  $ perf lock report

                Name   acquired  contended   avg wait (ns) total wait (ns)   max wait (ns)   min wait (ns)

        jiffies_lock        433         32            2778           88908           13570             692
   &lruvec->lru_lock        747          5           11254           56272           18317            1412
      slock-AF_INET6          7          1           23543           23543           23543           23543
    &newf->file_lock        706         15            1025           15388            2279             618
      slock-AF_INET6          8          1           10379           10379           10379           10379
         &rq->__lock       2143          5            2037           10185            3462             939

After:
                Name   acquired  contended     avg wait   total wait     max wait     min wait

        jiffies_lock        433         32      2.78 us     88.91 us     13.57 us       692 ns
   &lruvec->lru_lock        747          5     11.25 us     56.27 us     18.32 us      1.41 us
      slock-AF_INET6          7          1     23.54 us     23.54 us     23.54 us     23.54 us
    &newf->file_lock        706         15      1.02 us     15.39 us      2.28 us       618 ns
      slock-AF_INET6          8          1     10.38 us     10.38 us     10.38 us     10.38 us
         &rq->__lock       2143          5      2.04 us     10.19 us      3.46 us       939 ns
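
The unit selection above can be sketched as a standalone helper.  This
is only an illustration, not the code in the patch: the name
format_wait_time() and the buffer-based interface are assumptions, and
it uses double instead of float for the divisor.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical standalone helper (not part of perf): format a wait
 * time given in nanoseconds using the largest unit that fits.
 */
static int format_wait_time(unsigned long long nsec, char *buf, size_t size)
{
	static const struct {
		double base;		/* size of the unit in nanoseconds */
		const char *unit;
	} table[] = {
		{ 3600e9, "h"  },
		{ 60e9,   "m"  },
		{ 1e9,    "s"  },
		{ 1e6,    "ms" },
		{ 1e3,    "us" },
	};
	size_t i;

	/* walk from the largest unit down and stop at the first one that fits */
	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if ((double)nsec >= table[i].base)
			return snprintf(buf, size, "%.2f %s",
					(double)nsec / table[i].base,
					table[i].unit);
	}

	/* below one microsecond: print the raw nanosecond count */
	return snprintf(buf, size, "%llu ns", nsec);
}
```

With this sketch, 2778 ns is formatted as "2.78 us" and 692 ns stays
"692 ns", matching the jiffies_lock row above.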

Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-lock.c | 48 ++++++++++++++++++++++++++++++++-------
 1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 23a33ac15e68..57e396323d05 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -251,6 +251,31 @@ struct lock_key {
 	struct list_head	list;
 };
 
+static void lock_stat_key_print_time(unsigned long long nsec, int len)
+{
+	static const struct {
+		float base;
+		const char *unit;
+	} table[] = {
+		{ 1e9 * 3600, "h " },
+		{ 1e9 * 60, "m " },
+		{ 1e9, "s " },
+		{ 1e6, "ms" },
+		{ 1e3, "us" },
+		{ 0, NULL },
+	};
+
+	for (int i = 0; table[i].unit; i++) {
+		if (nsec < table[i].base)
+			continue;
+
+		pr_info("%*.2f %s", len - 3, nsec / table[i].base, table[i].unit);
+		return;
+	}
+
+	pr_info("%*llu %s", len - 3, nsec, "ns");
+}
+
 #define PRINT_KEY(member)						\
 static void lock_stat_key_print_ ## member(struct lock_key *key,	\
 					   struct lock_stat *ls)	\
@@ -258,11 +283,18 @@ static void lock_stat_key_print_ ## member(struct lock_key *key,	\
 	pr_info("%*llu", key->len, (unsigned long long)ls->member);	\
 }
 
+#define PRINT_TIME(member)						\
+static void lock_stat_key_print_ ## member(struct lock_key *key,	\
+					   struct lock_stat *ls)	\
+{									\
+	lock_stat_key_print_time((unsigned long long)ls->member, key->len);	\
+}
+
 PRINT_KEY(nr_acquired)
 PRINT_KEY(nr_contended)
-PRINT_KEY(avg_wait_time)
-PRINT_KEY(wait_time_total)
-PRINT_KEY(wait_time_max)
+PRINT_TIME(avg_wait_time)
+PRINT_TIME(wait_time_total)
+PRINT_TIME(wait_time_max)
 
 static void lock_stat_key_print_wait_time_min(struct lock_key *key,
 					      struct lock_stat *ls)
@@ -272,7 +304,7 @@ static void lock_stat_key_print_wait_time_min(struct lock_key *key,
 	if (wait_time == ULLONG_MAX)
 		wait_time = 0;
 
-	pr_info("%*"PRIu64, key->len, wait_time);
+	lock_stat_key_print_time(wait_time, key->len);
 }
 
 
@@ -291,10 +323,10 @@ static const char		*output_fields;
 struct lock_key keys[] = {
 	DEF_KEY_LOCK(acquired, "acquired", nr_acquired, 10),
 	DEF_KEY_LOCK(contended, "contended", nr_contended, 10),
-	DEF_KEY_LOCK(avg_wait, "avg wait (ns)", avg_wait_time, 15),
-	DEF_KEY_LOCK(wait_total, "total wait (ns)", wait_time_total, 15),
-	DEF_KEY_LOCK(wait_max, "max wait (ns)", wait_time_max, 15),
-	DEF_KEY_LOCK(wait_min, "min wait (ns)", wait_time_min, 15),
+	DEF_KEY_LOCK(avg_wait, "avg wait", avg_wait_time, 12),
+	DEF_KEY_LOCK(wait_total, "total wait", wait_time_total, 12),
+	DEF_KEY_LOCK(wait_max, "max wait", wait_time_max, 12),
+	DEF_KEY_LOCK(wait_min, "min wait", wait_time_min, 12),
 
 	/* extra comparisons much complicated should be here */
 	{ }
-- 
2.36.1.476.g0c4daa206d-goog



* [PATCH 2/7] perf lock: Allow to use different kernel symbols
  2022-06-15 16:32 [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
  2022-06-15 16:32 ` [PATCH 1/7] perf lock: Print wait times with unit Namhyung Kim
@ 2022-06-15 16:32 ` Namhyung Kim
  2022-06-15 16:32 ` [PATCH 3/7] perf lock: Skip print_bad_events() if nothing bad Namhyung Kim
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-06-15 16:32 UTC
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

Add --vmlinux and --kallsyms options to support data files from
different kernels.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-lock.txt | 7 +++++++
 tools/perf/builtin-lock.c              | 4 ++++
 2 files changed, 11 insertions(+)

diff --git a/tools/perf/Documentation/perf-lock.txt b/tools/perf/Documentation/perf-lock.txt
index 656b537b2fba..4b8568f0c53b 100644
--- a/tools/perf/Documentation/perf-lock.txt
+++ b/tools/perf/Documentation/perf-lock.txt
@@ -46,6 +46,13 @@ COMMON OPTIONS
 --force::
 	Don't complain, do it.
 
+--vmlinux=<file>::
+        vmlinux pathname
+
+--kallsyms=<file>::
+        kallsyms pathname
+
+
 REPORT OPTIONS
 --------------
 
diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 57e396323d05..118a036a81fb 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -1162,6 +1162,10 @@ int cmd_lock(int argc, const char **argv)
 	OPT_INCR('v', "verbose", &verbose, "be more verbose (show symbol address, etc)"),
 	OPT_BOOLEAN('D', "dump-raw-trace", &dump_trace, "dump raw trace in ASCII"),
 	OPT_BOOLEAN('f', "force", &force, "don't complain, do it"),
+	OPT_STRING(0, "vmlinux", &symbol_conf.vmlinux_name,
+		   "file", "vmlinux pathname"),
+	OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name,
+		   "file", "kallsyms pathname"),
 	OPT_END()
 	};
 
-- 
2.36.1.476.g0c4daa206d-goog



* [PATCH 3/7] perf lock: Skip print_bad_events() if nothing bad
  2022-06-15 16:32 [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
  2022-06-15 16:32 ` [PATCH 1/7] perf lock: Print wait times with unit Namhyung Kim
  2022-06-15 16:32 ` [PATCH 2/7] perf lock: Allow to use different kernel symbols Namhyung Kim
@ 2022-06-15 16:32 ` Namhyung Kim
  2022-06-15 16:32 ` [PATCH 4/7] perf lock: Add lock contention tracepoints record support Namhyung Kim
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-06-15 16:32 UTC
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

The debug output is meaningful only when there are bad lock sequences.
Skip it unless there is at least one or the -v option is given.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-lock.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 118a036a81fb..2337b09dd2cd 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -858,9 +858,16 @@ static void print_bad_events(int bad, int total)
 {
 	/* Output for debug, this have to be removed */
 	int i;
+	int broken = 0;
 	const char *name[4] =
 		{ "acquire", "acquired", "contended", "release" };
 
+	for (i = 0; i < BROKEN_MAX; i++)
+		broken += bad_hist[i];
+
+	if (broken == 0 && !verbose)
+		return;
+
 	pr_info("\n=== output for debug===\n\n");
 	pr_info("bad: %d, total: %d\n", bad, total);
 	pr_info("bad rate: %.2f %%\n", (double)bad / (double)total * 100);
-- 
2.36.1.476.g0c4daa206d-goog



* [PATCH 4/7] perf lock: Add lock contention tracepoints record support
  2022-06-15 16:32 [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
                   ` (2 preceding siblings ...)
  2022-06-15 16:32 ` [PATCH 3/7] perf lock: Skip print_bad_events() if nothing bad Namhyung Kim
@ 2022-06-15 16:32 ` Namhyung Kim
  2022-06-15 16:32 ` [PATCH 5/7] perf lock: Handle lock contention tracepoints Namhyung Kim
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-06-15 16:32 UTC
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

When the LOCKDEP and LOCK_STAT events are not available, perf lock
record falls back to recording the new lock contention tracepoints.
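
The fallback order can be sketched in isolation as below.  This is a
simplified model under stated assumptions: select_tracepoints(),
all_valid() and the tp_check_fn callback are hypothetical names, and
only_contention() is a stand-in for a real availability check such as
is_valid_tracepoint().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* hypothetical callback type: returns true if the tracepoint exists */
typedef bool (*tp_check_fn)(const char *name);

static const char *const lock_tps[] = {
	"lock:lock_acquire", "lock:lock_acquired",
	"lock:lock_contended", "lock:lock_release",
};

static const char *const contention_tps[] = {
	"lock:contention_begin", "lock:contention_end",
};

static bool all_valid(const char *const *names, size_t nr, tp_check_fn check)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!check(names[i]))
			return false;
	}
	return true;
}

/* Prefer the LOCKDEP/LOCK_STAT set, then fall back to contention events. */
static const char *const *select_tracepoints(tp_check_fn check, size_t *nr)
{
	if (all_valid(lock_tps, 4, check)) {
		*nr = 4;
		return lock_tps;
	}
	if (all_valid(contention_tps, 2, check)) {
		*nr = 2;
		return contention_tps;
	}
	*nr = 0;
	return NULL;	/* neither set is usable */
}

/* stand-in checker: pretends only the contention tracepoints exist */
static bool only_contention(const char *name)
{
	static const char prefix[] = "lock:contention_";

	return strncmp(name, prefix, sizeof(prefix) - 1) == 0;
}
```

Calling select_tracepoints(only_contention, &nr) models a kernel built
without LOCKDEP: it returns the two contention events.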

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-lock.c | 76 +++++++++++++++++++++++++++++++++++----
 1 file changed, 69 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 2337b09dd2cd..9e3b90cac505 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -516,17 +516,29 @@ static struct lock_stat *lock_stat_findnew(u64 addr, const char *name)
 }
 
 struct trace_lock_handler {
+	/* it's used on CONFIG_LOCKDEP */
 	int (*acquire_event)(struct evsel *evsel,
 			     struct perf_sample *sample);
 
+	/* it's used on CONFIG_LOCKDEP && CONFIG_LOCK_STAT */
 	int (*acquired_event)(struct evsel *evsel,
 			      struct perf_sample *sample);
 
+	/* it's used on CONFIG_LOCKDEP && CONFIG_LOCK_STAT */
 	int (*contended_event)(struct evsel *evsel,
 			       struct perf_sample *sample);
 
+	/* it's used on CONFIG_LOCKDEP */
 	int (*release_event)(struct evsel *evsel,
 			     struct perf_sample *sample);
+
+	/* it's used when CONFIG_LOCKDEP is off */
+	int (*contention_begin_event)(struct evsel *evsel,
+				      struct perf_sample *sample);
+
+	/* it's used when CONFIG_LOCKDEP is off */
+	int (*contention_end_event)(struct evsel *evsel,
+				    struct perf_sample *sample);
 };
 
 static struct lock_seq_stat *get_seq(struct thread_stat *ts, u64 addr)
@@ -854,6 +866,20 @@ static int evsel__process_lock_release(struct evsel *evsel, struct perf_sample *
 	return 0;
 }
 
+static int evsel__process_contention_begin(struct evsel *evsel, struct perf_sample *sample)
+{
+	if (trace_handler->contention_begin_event)
+		return trace_handler->contention_begin_event(evsel, sample);
+	return 0;
+}
+
+static int evsel__process_contention_end(struct evsel *evsel, struct perf_sample *sample)
+{
+	if (trace_handler->contention_end_event)
+		return trace_handler->contention_end_event(evsel, sample);
+	return 0;
+}
+
 static void print_bad_events(int bad, int total)
 {
 	/* Output for debug, this have to be removed */
@@ -1062,6 +1088,11 @@ static const struct evsel_str_handler lock_tracepoints[] = {
 	{ "lock:lock_release",	 evsel__process_lock_release,   }, /* CONFIG_LOCKDEP */
 };
 
+static const struct evsel_str_handler contention_tracepoints[] = {
+	{ "lock:contention_begin", evsel__process_contention_begin, },
+	{ "lock:contention_end",   evsel__process_contention_end,   },
+};
+
 static bool force;
 
 static int __cmd_report(bool display_info)
@@ -1125,20 +1156,41 @@ static int __cmd_record(int argc, const char **argv)
 		"record", "-R", "-m", "1024", "-c", "1", "--synth", "task",
 	};
 	unsigned int rec_argc, i, j, ret;
+	unsigned int nr_tracepoints;
 	const char **rec_argv;
+	bool has_lock_stat = true;
 
 	for (i = 0; i < ARRAY_SIZE(lock_tracepoints); i++) {
 		if (!is_valid_tracepoint(lock_tracepoints[i].name)) {
-				pr_err("tracepoint %s is not enabled. "
-				       "Are CONFIG_LOCKDEP and CONFIG_LOCK_STAT enabled?\n",
-				       lock_tracepoints[i].name);
-				return 1;
+			pr_debug("tracepoint %s is not enabled. "
+				 "Are CONFIG_LOCKDEP and CONFIG_LOCK_STAT enabled?\n",
+				 lock_tracepoints[i].name);
+			has_lock_stat = false;
+			break;
+		}
+	}
+
+	if (has_lock_stat)
+		goto setup_args;
+
+	for (i = 0; i < ARRAY_SIZE(contention_tracepoints); i++) {
+		if (!is_valid_tracepoint(contention_tracepoints[i].name)) {
+			pr_err("tracepoint %s is not enabled.\n",
+			       contention_tracepoints[i].name);
+			return 1;
 		}
 	}
 
+setup_args:
 	rec_argc = ARRAY_SIZE(record_args) + argc - 1;
+
+	if (has_lock_stat)
+		nr_tracepoints = ARRAY_SIZE(lock_tracepoints);
+	else
+		nr_tracepoints = ARRAY_SIZE(contention_tracepoints);
+
 	/* factor of 2 is for -e in front of each tracepoint */
-	rec_argc += 2 * ARRAY_SIZE(lock_tracepoints);
+	rec_argc += 2 * nr_tracepoints;
 
 	rec_argv = calloc(rec_argc + 1, sizeof(char *));
 	if (!rec_argv)
@@ -1147,9 +1199,19 @@ static int __cmd_record(int argc, const char **argv)
 	for (i = 0; i < ARRAY_SIZE(record_args); i++)
 		rec_argv[i] = strdup(record_args[i]);
 
-	for (j = 0; j < ARRAY_SIZE(lock_tracepoints); j++) {
+	for (j = 0; j < nr_tracepoints; j++) {
+		const char *ev_name;
+
+		if (has_lock_stat)
+			ev_name = strdup(lock_tracepoints[j].name);
+		else
+			ev_name = strdup(contention_tracepoints[j].name);
+
+		if (!ev_name)
+			return -ENOMEM;
+
 		rec_argv[i++] = "-e";
-		rec_argv[i++] = strdup(lock_tracepoints[j].name);
+		rec_argv[i++] = ev_name;
 	}
 
 	for (j = 1; j < (unsigned int)argc; j++, i++)
-- 
2.36.1.476.g0c4daa206d-goog



* [PATCH 5/7] perf lock: Handle lock contention tracepoints
  2022-06-15 16:32 [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
                   ` (3 preceding siblings ...)
  2022-06-15 16:32 ` [PATCH 4/7] perf lock: Add lock contention tracepoints record support Namhyung Kim
@ 2022-06-15 16:32 ` Namhyung Kim
  2022-06-15 16:32 ` [PATCH 6/7] perf record: Allow to specify max stack depth of fp callchain Namhyung Kim
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-06-15 16:32 UTC
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

When the lock contention events are used, there's no tracking of
acquire and release.  So the state machine is simplified to use
UNINITIALIZED -> CONTENDED -> ACQUIRED only.

Note that CONTENDED state is re-entrant since mutex locks can hit two
or more consecutive contention_begin events for optimistic spinning
and sleep.
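
As a rough model of that simplified state machine, consider the sketch
below.  The type and function names are hypothetical, and the
broken-sequence accounting done by the real handlers is left out on
purpose.

```c
#include <stdint.h>

/* hypothetical names; a reduced model of the per-thread lock sequence */
enum seq_state { SEQ_UNINITIALIZED, SEQ_CONTENDED, SEQ_ACQUIRED };

struct lock_seq {
	enum seq_state state;
	uint64_t begin_time;	/* timestamp of the first contention_begin */
	uint64_t wait_total;	/* accumulated wait time in ns */
	unsigned int nr_contended;
	unsigned int nr_acquired;
};

static void on_contention_begin(struct lock_seq *seq, uint64_t now)
{
	/* CONTENDED is re-entrant: keep the original begin timestamp */
	if (seq->state == SEQ_CONTENDED)
		return;

	seq->state = SEQ_CONTENDED;
	seq->begin_time = now;
	seq->nr_contended++;
}

static void on_contention_end(struct lock_seq *seq, uint64_t now)
{
	/* an end without a pending begin carries no wait time */
	if (seq->state != SEQ_CONTENDED)
		return;

	seq->wait_total += now - seq->begin_time;
	seq->state = SEQ_ACQUIRED;
	seq->nr_acquired++;
}
```

With begin events at t=100 and t=150 followed by an end at t=400, the
second begin is ignored and the recorded wait time is 300.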

Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-lock.c | 137 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 137 insertions(+)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 9e3b90cac505..546dad1963c8 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -483,6 +483,18 @@ static struct lock_stat *pop_from_result(void)
 	return container_of(node, struct lock_stat, rb);
 }
 
+static struct lock_stat *lock_stat_find(u64 addr)
+{
+	struct hlist_head *entry = lockhashentry(addr);
+	struct lock_stat *ret;
+
+	hlist_for_each_entry(ret, entry, hash_entry) {
+		if (ret->addr == addr)
+			return ret;
+	}
+	return NULL;
+}
+
 static struct lock_stat *lock_stat_findnew(u64 addr, const char *name)
 {
 	struct hlist_head *entry = lockhashentry(addr);
@@ -827,6 +839,124 @@ static int report_lock_release_event(struct evsel *evsel,
 	return 0;
 }
 
+static int report_lock_contention_begin_event(struct evsel *evsel,
+					      struct perf_sample *sample)
+{
+	struct lock_stat *ls;
+	struct thread_stat *ts;
+	struct lock_seq_stat *seq;
+	u64 addr = evsel__intval(evsel, sample, "lock_addr");
+
+	if (show_thread_stats)
+		addr = sample->tid;
+
+	ls = lock_stat_findnew(addr, "No name");
+	if (!ls)
+		return -ENOMEM;
+
+	ts = thread_stat_findnew(sample->tid);
+	if (!ts)
+		return -ENOMEM;
+
+	seq = get_seq(ts, addr);
+	if (!seq)
+		return -ENOMEM;
+
+	switch (seq->state) {
+	case SEQ_STATE_UNINITIALIZED:
+	case SEQ_STATE_ACQUIRED:
+		break;
+	case SEQ_STATE_CONTENDED:
+		/*
+		 * It can have nested contention begin with mutex spinning,
+		 * then we would use the original contention begin event and
+		 * ignore the second one.
+		 */
+		goto end;
+	case SEQ_STATE_ACQUIRING:
+	case SEQ_STATE_READ_ACQUIRED:
+	case SEQ_STATE_RELEASED:
+		/* broken lock sequence */
+		if (!ls->broken) {
+			ls->broken = 1;
+			bad_hist[BROKEN_CONTENDED]++;
+		}
+		list_del_init(&seq->list);
+		free(seq);
+		goto end;
+	default:
+		BUG_ON("Unknown state of lock sequence found!\n");
+		break;
+	}
+
+	if (seq->state != SEQ_STATE_CONTENDED) {
+		seq->state = SEQ_STATE_CONTENDED;
+		seq->prev_event_time = sample->time;
+		ls->nr_contended++;
+	}
+end:
+	return 0;
+}
+
+static int report_lock_contention_end_event(struct evsel *evsel,
+					    struct perf_sample *sample)
+{
+	struct lock_stat *ls;
+	struct thread_stat *ts;
+	struct lock_seq_stat *seq;
+	u64 contended_term;
+	u64 addr = evsel__intval(evsel, sample, "lock_addr");
+
+	if (show_thread_stats)
+		addr = sample->tid;
+
+	ls = lock_stat_find(addr);
+	if (!ls)
+		return 0;
+
+	ts = thread_stat_find(sample->tid);
+	if (!ts)
+		return 0;
+
+	seq = get_seq(ts, addr);
+	if (!seq)
+		return -ENOMEM;
+
+	switch (seq->state) {
+	case SEQ_STATE_UNINITIALIZED:
+		goto end;
+	case SEQ_STATE_CONTENDED:
+		contended_term = sample->time - seq->prev_event_time;
+		ls->wait_time_total += contended_term;
+		if (contended_term < ls->wait_time_min)
+			ls->wait_time_min = contended_term;
+		if (ls->wait_time_max < contended_term)
+			ls->wait_time_max = contended_term;
+		break;
+	case SEQ_STATE_ACQUIRING:
+	case SEQ_STATE_ACQUIRED:
+	case SEQ_STATE_READ_ACQUIRED:
+	case SEQ_STATE_RELEASED:
+		/* broken lock sequence */
+		if (!ls->broken) {
+			ls->broken = 1;
+			bad_hist[BROKEN_ACQUIRED]++;
+		}
+		list_del_init(&seq->list);
+		free(seq);
+		goto end;
+	default:
+		BUG_ON("Unknown state of lock sequence found!\n");
+		break;
+	}
+
+	seq->state = SEQ_STATE_ACQUIRED;
+	ls->nr_acquired++;
+	ls->avg_wait_time = ls->wait_time_total/ls->nr_acquired;
+end:
+	return 0;
+}
+
 /* lock oriented handlers */
 /* TODO: handlers for CPU oriented, thread oriented */
 static struct trace_lock_handler report_lock_ops  = {
@@ -834,6 +964,8 @@ static struct trace_lock_handler report_lock_ops  = {
 	.acquired_event		= report_lock_acquired_event,
 	.contended_event	= report_lock_contended_event,
 	.release_event		= report_lock_release_event,
+	.contention_begin_event	= report_lock_contention_begin_event,
+	.contention_end_event	= report_lock_contention_end_event,
 };
 
 static struct trace_lock_handler *trace_handler;
@@ -1126,6 +1258,11 @@ static int __cmd_report(bool display_info)
 		goto out_delete;
 	}
 
+	if (perf_session__set_tracepoints_handlers(session, contention_tracepoints)) {
+		pr_err("Initializing perf session tracepoint handlers failed\n");
+		goto out_delete;
+	}
+
 	if (setup_output_field(output_fields))
 		goto out_delete;
 
-- 
2.36.1.476.g0c4daa206d-goog



* [PATCH 6/7] perf record: Allow to specify max stack depth of fp callchain
  2022-06-15 16:32 [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
                   ` (4 preceding siblings ...)
  2022-06-15 16:32 ` [PATCH 5/7] perf lock: Handle lock contention tracepoints Namhyung Kim
@ 2022-06-15 16:32 ` Namhyung Kim
  2022-06-15 16:32 ` [PATCH 7/7] perf lock: Look up callchain for the contended locks Namhyung Kim
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-06-15 16:32 UTC
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

Currently perf record has no interface to specify the max stack depth
for frame pointer callchains.  Extend the command line parameter to
accept a number after 'fp' to specify the depth, like
'--call-graph fp,32'.
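
The argument handling can be illustrated with the self-contained sketch
below.  It is not the parser in util/callchain.c: parse_fp_callgraph()
is a hypothetical name, the system limit is passed in as a parameter,
and unlike the patch it rejects trailing garbage after the number.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: accept "fp" or "fp,<depth>".
 * Returns 0 on success, -1 if the argument is not an fp spec.
 * *depth is updated only when the requested value is below 'limit',
 * mirroring the "size < max stack" check in the patch.
 */
static int parse_fp_callgraph(const char *arg, unsigned long limit,
			      unsigned long *depth)
{
	unsigned long val;
	char *end;

	if (strncmp(arg, "fp", 2) != 0)
		return -1;

	if (arg[2] == '\0')
		return 0;		/* plain "fp": keep the default depth */

	if (arg[2] != ',')
		return -1;

	val = strtoul(arg + 3, &end, 0);
	if (end == arg + 3 || *end != '\0')
		return -1;		/* no digits or trailing garbage */

	if (val < limit)
		*depth = val;
	return 0;
}
```

For example, with a limit of 127, "fp,32" sets the depth to 32 while
"fp,500" is accepted but leaves the depth unchanged.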

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-record.txt |  5 +++++
 tools/perf/util/callchain.c              | 18 ++++++++++++------
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index cf8ad50f3de1..772777c2a52e 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -275,6 +275,11 @@ OPTIONS
 	User can change the size by passing the size after comma like
 	"--call-graph dwarf,4096".
 
+	When "fp" recording is used, perf tries to save stack entries
+	up to the number specified in sysctl.kernel.perf_event_max_stack
+	by default.  User can change the number by passing it after comma
+	like "--call-graph fp,32".
+
 -q::
 --quiet::
 	Don't print any message, useful for scripting.
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 5c27a4b2e7a7..7e663673f79f 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -31,6 +31,7 @@
 #include "callchain.h"
 #include "branch.h"
 #include "symbol.h"
+#include "util.h"
 #include "../perf.h"
 
 #define CALLCHAIN_PARAM_DEFAULT			\
@@ -266,12 +267,17 @@ int parse_callchain_record(const char *arg, struct callchain_param *param)
 	do {
 		/* Framepointer style */
 		if (!strncmp(name, "fp", sizeof("fp"))) {
-			if (!strtok_r(NULL, ",", &saveptr)) {
-				param->record_mode = CALLCHAIN_FP;
-				ret = 0;
-			} else
-				pr_err("callchain: No more arguments "
-				       "needed for --call-graph fp\n");
+			ret = 0;
+			param->record_mode = CALLCHAIN_FP;
+
+			tok = strtok_r(NULL, ",", &saveptr);
+			if (tok) {
+				unsigned long size;
+
+				size = strtoul(tok, &name, 0);
+				if (size < (unsigned) sysctl__max_stack())
+					param->max_stack = size;
+			}
 			break;
 
 		/* Dwarf style */
-- 
2.36.1.476.g0c4daa206d-goog



* [PATCH 7/7] perf lock: Look up callchain for the contended locks
  2022-06-15 16:32 [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
                   ` (5 preceding siblings ...)
  2022-06-15 16:32 ` [PATCH 6/7] perf record: Allow to specify max stack depth of fp callchain Namhyung Kim
@ 2022-06-15 16:32 ` Namhyung Kim
  2022-06-24 16:40 ` [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
  2022-07-12 12:57 ` Arnaldo Carvalho de Melo
  8 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-06-15 16:32 UTC
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

The lock contention tracepoints don't provide lock names.  All we can
do is to get stack traces and show the caller instead.  To minimize
the overhead, it's limited to up to 8 stack entries and displays the
first non-lock function symbol name as the caller.

  $ perf lock report -F acquired,contended,avg_wait,wait_total

                  Name   acquired  contended     avg wait    total wait

   update_blocked_a...         40         40      3.61 us     144.45 us
   kernfs_fop_open+...          5          5      3.64 us      18.18 us
    _nohz_idle_balance          3          3      2.65 us       7.95 us
   tick_do_update_j...          1          1      6.04 us       6.04 us
    ep_scan_ready_list          1          1      3.93 us       3.93 us
  ...
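
The caller lookup boils down to an address range check, roughly as in
the sketch below.  The struct and function names are hypothetical, and
the two ranges stand in for the kernel's sched and lock text sections;
symbol resolution is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical half-open address range [start, end) */
struct text_range {
	uint64_t start;
	uint64_t end;
};

static bool in_range(const struct text_range *r, uint64_t addr)
{
	return r->start <= addr && addr < r->end;
}

/*
 * Return the index of the first stack entry after 'skip' that lies
 * outside both locking text ranges, or -1 if every entry is a lock
 * function.
 */
static int find_caller(const uint64_t *ips, size_t nr, size_t skip,
		       const struct text_range *sched_text,
		       const struct text_range *lock_text)
{
	size_t i;

	for (i = skip; i < nr; i++) {
		if (in_range(sched_text, ips[i]) || in_range(lock_text, ips[i]))
			continue;	/* still inside the locking code */
		return (int)i;
	}
	return -1;
}
```

The entry found this way would then be resolved to a symbol name and
used in the Name column, with "Unknown" as the fallback.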

Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-lock.c | 160 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 156 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 546dad1963c8..c5ca34741561 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -9,6 +9,7 @@
 #include "util/symbol.h"
 #include "util/thread.h"
 #include "util/header.h"
+#include "util/callchain.h"
 
 #include <subcmd/pager.h>
 #include <subcmd/parse-options.h>
@@ -19,6 +20,7 @@
 #include "util/tool.h"
 #include "util/data.h"
 #include "util/string2.h"
+#include "util/map.h"
 
 #include <sys/types.h>
 #include <sys/prctl.h>
@@ -32,6 +34,7 @@
 #include <linux/kernel.h>
 #include <linux/zalloc.h>
 #include <linux/err.h>
+#include <linux/stringify.h>
 
 static struct perf_session *session;
 
@@ -120,6 +123,24 @@ static struct rb_root		thread_stats;
 static bool combine_locks;
 static bool show_thread_stats;
 
+/*
+ * CONTENTION_STACK_DEPTH
+ * Number of stack trace entries to find callers
+ */
+#define CONTENTION_STACK_DEPTH  8
+
+/*
+ * CONTENTION_STACK_SKIP
+ * Number of stack trace entries to skip when finding callers.
+ * The first few entries belong to the locking implementation itself.
+ */
+#define CONTENTION_STACK_SKIP  3
+
+static u64 sched_text_start;
+static u64 sched_text_end;
+static u64 lock_text_start;
+static u64 lock_text_end;
+
 static struct thread_stat *thread_stat_find(u32 tid)
 {
 	struct rb_node *node;
@@ -839,6 +860,116 @@ static int report_lock_release_event(struct evsel *evsel,
 	return 0;
 }
 
+static bool is_lock_function(u64 addr)
+{
+	if (!sched_text_start) {
+		struct machine *machine = &session->machines.host;
+		struct map *kmap;
+		struct symbol *sym;
+
+		sym = machine__find_kernel_symbol_by_name(machine,
+							  "__sched_text_start",
+							  &kmap);
+		if (!sym) {
+			/* to avoid retry */
+			sched_text_start = 1;
+			return false;
+		}
+
+		sched_text_start = kmap->unmap_ip(kmap, sym->start);
+
+		/* should not fail from here */
+		sym = machine__find_kernel_symbol_by_name(machine,
+							  "__sched_text_end",
+							  &kmap);
+		sched_text_end = kmap->unmap_ip(kmap, sym->start);
+
+		sym = machine__find_kernel_symbol_by_name(machine,
+							  "__lock_text_start",
+							  &kmap);
+		lock_text_start = kmap->unmap_ip(kmap, sym->start);
+
+		sym = machine__find_kernel_symbol_by_name(machine,
+							  "__lock_text_end",
+							  &kmap);
+		lock_text_end = kmap->unmap_ip(kmap, sym->start);
+	}
+
+	/* failed to get kernel symbols */
+	if (sched_text_start == 1)
+		return false;
+
+	/* mutex and rwsem functions are in sched text section */
+	if (sched_text_start <= addr && addr < sched_text_end)
+		return true;
+
+	/* spinlock functions are in lock text section */
+	if (lock_text_start <= addr && addr < lock_text_end)
+		return true;
+
+	return false;
+}
+
+static int lock_contention_caller(struct evsel *evsel, struct perf_sample *sample,
+				  char *buf, int size)
+{
+	struct thread *thread;
+	struct callchain_cursor *cursor = &callchain_cursor;
+	struct symbol *sym;
+	int skip = 0;
+	int ret;
+
+	/* lock names will be replaced to task name later */
+	if (show_thread_stats)
+		return -1;
+
+	thread = machine__findnew_thread(&session->machines.host,
+					 -1, sample->pid);
+	if (thread == NULL)
+		return -1;
+
+	/* use caller function name from the callchain */
+	ret = thread__resolve_callchain(thread, cursor, evsel, sample,
+					NULL, NULL, CONTENTION_STACK_DEPTH);
+	if (ret != 0) {
+		thread__put(thread);
+		return -1;
+	}
+
+	callchain_cursor_commit(cursor);
+	thread__put(thread);
+
+	while (true) {
+		struct callchain_cursor_node *node;
+
+		node = callchain_cursor_current(cursor);
+		if (node == NULL)
+			break;
+
+		/* skip first few entries - for lock functions */
+		if (++skip <= CONTENTION_STACK_SKIP)
+			goto next;
+
+		sym = node->ms.sym;
+		if (sym && !is_lock_function(node->ip)) {
+			struct map *map = node->ms.map;
+			u64 offset;
+
+			offset = map->map_ip(map, node->ip) - sym->start;
+
+			if (offset)
+				scnprintf(buf, size, "%s+%#lx", sym->name, offset);
+			else
+				strlcpy(buf, sym->name, size);
+			return 0;
+		}
+
+next:
+		callchain_cursor_advance(cursor);
+	}
+	return -1;
+}
+
 static int report_lock_contention_begin_event(struct evsel *evsel,
 					      struct perf_sample *sample)
 {
@@ -850,9 +981,18 @@ static int report_lock_contention_begin_event(struct evsel *evsel,
 	if (show_thread_stats)
 		addr = sample->tid;
 
-	ls = lock_stat_findnew(addr, "No name");
-	if (!ls)
-		return -ENOMEM;
+	ls = lock_stat_find(addr);
+	if (!ls) {
+		char buf[128];
+		const char *caller = buf;
+
+		if (lock_contention_caller(evsel, sample, buf, sizeof(buf)) < 0)
+			caller = "Unknown";
+
+		ls = lock_stat_findnew(addr, caller);
+		if (!ls)
+			return -ENOMEM;
+	}
 
 	ts = thread_stat_findnew(sample->tid);
 	if (!ts)
@@ -1233,6 +1373,7 @@ static int __cmd_report(bool display_info)
 	struct perf_tool eops = {
 		.sample		 = process_sample_event,
 		.comm		 = perf_event__process_comm,
+		.mmap		 = perf_event__process_mmap,
 		.namespaces	 = perf_event__process_namespaces,
 		.ordered_events	 = true,
 	};
@@ -1248,6 +1389,8 @@ static int __cmd_report(bool display_info)
 		return PTR_ERR(session);
 	}
 
+	/* for lock function check */
+	symbol_conf.sort_by_name = true;
 	symbol__init(&session->header.env);
 
 	if (!perf_session__has_traces(session, "lock record"))
@@ -1292,8 +1435,12 @@ static int __cmd_record(int argc, const char **argv)
 	const char *record_args[] = {
 		"record", "-R", "-m", "1024", "-c", "1", "--synth", "task",
 	};
+	const char *callgraph_args[] = {
+		"--call-graph", "fp," __stringify(CONTENTION_STACK_DEPTH),
+	};
 	unsigned int rec_argc, i, j, ret;
 	unsigned int nr_tracepoints;
+	unsigned int nr_callgraph_args = 0;
 	const char **rec_argv;
 	bool has_lock_stat = true;
 
@@ -1318,8 +1465,10 @@ static int __cmd_record(int argc, const char **argv)
 		}
 	}
 
+	nr_callgraph_args = ARRAY_SIZE(callgraph_args);
+
 setup_args:
-	rec_argc = ARRAY_SIZE(record_args) + argc - 1;
+	rec_argc = ARRAY_SIZE(record_args) + nr_callgraph_args + argc - 1;
 
 	if (has_lock_stat)
 		nr_tracepoints = ARRAY_SIZE(lock_tracepoints);
@@ -1351,6 +1500,9 @@ static int __cmd_record(int argc, const char **argv)
 		rec_argv[i++] = ev_name;
 	}
 
+	for (j = 0; j < nr_callgraph_args; j++, i++)
+		rec_argv[i] = callgraph_args[j];
+
 	for (j = 1; j < (unsigned int)argc; j++, i++)
 		rec_argv[i] = argv[j];
 
-- 
2.36.1.476.g0c4daa206d-goog
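
The caller selection done by lock_contention_caller() in the patch above
can be illustrated outside of perf. What follows is a minimal standalone
sketch, not the perf implementation: struct text_range, struct frame,
pick_caller() and the STACK_SKIP value are hypothetical stand-ins for the
callchain cursor, the __sched_text/__lock_text symbol ranges and
CONTENTION_STACK_SKIP. It only shows the idea: skip the first few
entries, skip frames that fall inside the lock text ranges, and name the
first remaining symbolized frame as "symbol+offset".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a [start, end) kernel text section. */
struct text_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/* Hypothetical stand-in for one resolved callchain entry. */
struct frame {
	uint64_t ip;		/* address found in the callchain */
	const char *sym_name;	/* NULL when the address was not symbolized */
	uint64_t sym_start;	/* start address of that symbol */
};

/* Assumed value for illustration only; the patch uses CONTENTION_STACK_SKIP. */
#define STACK_SKIP 2

static bool in_range(const struct text_range *r, uint64_t addr)
{
	return r->start <= addr && addr < r->end;
}

/*
 * Write the name of the first frame that is past the skipped entries,
 * has a symbol, and lies outside both lock-related text ranges.
 * Returns 0 on success, -1 if no such frame exists.
 */
static int pick_caller(const struct frame *frames, size_t nr,
		       const struct text_range *sched_text,
		       const struct text_range *lock_text,
		       char *buf, size_t size)
{
	for (size_t i = STACK_SKIP; i < nr; i++) {
		const struct frame *f = &frames[i];

		if (f->sym_name == NULL)
			continue;
		if (in_range(sched_text, f->ip) || in_range(lock_text, f->ip))
			continue;

		uint64_t offset = f->ip - f->sym_start;

		if (offset)
			snprintf(buf, size, "%s+%#llx", f->sym_name,
				 (unsigned long long)offset);
		else
			snprintf(buf, size, "%s", f->sym_name);
		return 0;
	}
	return -1;
}
```

When no frame qualifies the function returns -1, which corresponds to the
case where the patch falls back to the "Unknown" name.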



* Re: [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4)
  2022-06-15 16:32 [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
                   ` (6 preceding siblings ...)
  2022-06-15 16:32 ` [PATCH 7/7] perf lock: Look up callchain for the contended locks Namhyung Kim
@ 2022-06-24 16:40 ` Namhyung Kim
  2022-06-24 18:59   ` Arnaldo Carvalho de Melo
  2022-07-12 12:57 ` Arnaldo Carvalho de Melo
  8 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2022-06-24 16:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

Ping!  Any comments?


On Wed, Jun 15, 2022 at 9:32 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hello,
>
> Kernel v5.19 will have a new set of tracepoints to track lock
> contentions for various lock types.  Unlike tracepoints in LOCKDEP and
> LOCK_STAT, it's hit only for contended locks and lock names are not
> available.  So it needs to collect stack traces and display the caller
> function instead.
>
> Changes in v4)
>  * add Acked-by from Ian
>  * more comments on trace_lock_handler
>  * don't create stats in the contention_end handler
>
> Changes in v3)
>  * fix build error
>  * support data from different kernels/machines
>  * skip bad stat unless there's actual bad ones
>
> Changes in v2)
>  * add Acked-by from Ian
>  * print time with a unit for compact output
>  * add some comments  (Ian)
>  * remove already applied patch
>
> This patchset merely adds support for the new tracepoints to the
> existing perf lock commands.  So there's no change to the user.  Later
> I'll add new a sub-command dedicated to the tracepoints to make use of
> the additional information.
>
> Example output:
>
>   $ sudo perf lock record -a sleep 3
>
>   $ perf lock report -F acquired,contended,avg_wait,wait_total
>
>                   Name   acquired  contended     avg wait    total wait
>
>    update_blocked_a...         40         40      3.61 us     144.45 us
>    kernfs_fop_open+...          5          5      3.64 us      18.18 us
>     _nohz_idle_balance          3          3      2.65 us       7.95 us
>    tick_do_update_j...          1          1      6.04 us       6.04 us
>     ep_scan_ready_list          1          1      3.93 us       3.93 us
>   ...
>
> You can find the code in the 'perf/lock-contention-v4' branch at
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung
>
>
> Namhyung Kim (7):
>   perf lock: Print wait times with unit
>   perf lock: Allow to use different kernel symbols
>   perf lock: Skip print_bad_events() if nothing bad
>   perf lock: Add lock contention tracepoints record support
>   perf lock: Handle lock contention tracepoints
>   perf record: Allow to specify max stack depth of fp callchain
>   perf lock: Look up callchain for the contended locks
>
>  tools/perf/Documentation/perf-lock.txt   |   7 +
>  tools/perf/Documentation/perf-record.txt |   5 +
>  tools/perf/builtin-lock.c                | 426 ++++++++++++++++++++++-
>  tools/perf/util/callchain.c              |  18 +-
>  4 files changed, 434 insertions(+), 22 deletions(-)
>
>
> base-commit: 9886142c7a2226439c1e3f7d9b69f9c7094c3ef6
> --
> 2.36.1.476.g0c4daa206d-goog
>


* Re: [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4)
  2022-06-24 16:40 ` [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
@ 2022-06-24 18:59   ` Arnaldo Carvalho de Melo
  2022-06-24 21:38     ` Namhyung Kim
  0 siblings, 1 reply; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-06-24 18:59 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
	linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
	Davidlohr Bueso

Em Fri, Jun 24, 2022 at 09:40:05AM -0700, Namhyung Kim escreveu:
>> Ping!  Any comments?

 
I'll take a look and test it soon.

- Arnaldo

> 
> On Wed, Jun 15, 2022 at 9:32 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hello,
> >
> > Kernel v5.19 will have a new set of tracepoints to track lock
> > contentions for various lock types.  Unlike tracepoints in LOCKDEP and
> > LOCK_STAT, it's hit only for contended locks and lock names are not
> > available.  So it needs to collect stack traces and display the caller
> > function instead.
> >
> > Changes in v4)
> >  * add Acked-by from Ian
> >  * more comments on trace_lock_handler
> >  * don't create stats in the contention_end handler
> >
> > Changes in v3)
> >  * fix build error
> >  * support data from different kernels/machines
> >  * skip bad stat unless there's actual bad ones
> >
> > Changes in v2)
> >  * add Acked-by from Ian
> >  * print time with a unit for compact output
> >  * add some comments  (Ian)
> >  * remove already applied patch
> >
> > This patchset merely adds support for the new tracepoints to the
> > existing perf lock commands.  So there's no change to the user.  Later
> > I'll add new a sub-command dedicated to the tracepoints to make use of
> > the additional information.
> >
> > Example output:
> >
> >   $ sudo perf lock record -a sleep 3
> >
> >   $ perf lock report -F acquired,contended,avg_wait,wait_total
> >
> >                   Name   acquired  contended     avg wait    total wait
> >
> >    update_blocked_a...         40         40      3.61 us     144.45 us
> >    kernfs_fop_open+...          5          5      3.64 us      18.18 us
> >     _nohz_idle_balance          3          3      2.65 us       7.95 us
> >    tick_do_update_j...          1          1      6.04 us       6.04 us
> >     ep_scan_ready_list          1          1      3.93 us       3.93 us
> >   ...
> >
> > You can find the code in the 'perf/lock-contention-v4' branch at
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> >
> > Thanks,
> > Namhyung
> >
> >
> > Namhyung Kim (7):
> >   perf lock: Print wait times with unit
> >   perf lock: Allow to use different kernel symbols
> >   perf lock: Skip print_bad_events() if nothing bad
> >   perf lock: Add lock contention tracepoints record support
> >   perf lock: Handle lock contention tracepoints
> >   perf record: Allow to specify max stack depth of fp callchain
> >   perf lock: Look up callchain for the contended locks
> >
> >  tools/perf/Documentation/perf-lock.txt   |   7 +
> >  tools/perf/Documentation/perf-record.txt |   5 +
> >  tools/perf/builtin-lock.c                | 426 ++++++++++++++++++++++-
> >  tools/perf/util/callchain.c              |  18 +-
> >  4 files changed, 434 insertions(+), 22 deletions(-)
> >
> >
> > base-commit: 9886142c7a2226439c1e3f7d9b69f9c7094c3ef6
> > --
> > 2.36.1.476.g0c4daa206d-goog
> >

-- 

- Arnaldo


* Re: [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4)
  2022-06-24 18:59   ` Arnaldo Carvalho de Melo
@ 2022-06-24 21:38     ` Namhyung Kim
  2022-07-07 17:03       ` Namhyung Kim
  0 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2022-06-24 21:38 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
	linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
	Davidlohr Bueso

Hi Arnaldo,

On Fri, Jun 24, 2022 at 11:59 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Fri, Jun 24, 2022 at 09:40:05AM -0700, Namhyung Kim escreveu:
> >> Ping!  Any comments?
>
>
> I'll take a look and test it soon.

Thanks for doing that!
Namhyung


* Re: [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4)
  2022-06-24 21:38     ` Namhyung Kim
@ 2022-07-07 17:03       ` Namhyung Kim
  0 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-07-07 17:03 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
	linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
	Davidlohr Bueso

Gentle ping!

On Fri, Jun 24, 2022 at 2:38 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Arnaldo,
>
> On Fri, Jun 24, 2022 at 11:59 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > Em Fri, Jun 24, 2022 at 09:40:05AM -0700, Namhyung Kim escreveu:
> > >> Ping!  Any comments?
> >
> >
> > I'll take a look and test it soon.
>
> Thanks for doing that!
> Namhyung


* Re: [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4)
  2022-06-15 16:32 [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
                   ` (7 preceding siblings ...)
  2022-06-24 16:40 ` [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
@ 2022-07-12 12:57 ` Arnaldo Carvalho de Melo
  8 siblings, 0 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-07-12 12:57 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
	linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
	Davidlohr Bueso

Em Wed, Jun 15, 2022 at 09:32:15AM -0700, Namhyung Kim escreveu:
> Hello,
> 
> Kernel v5.19 will have a new set of tracepoints to track lock
> contentions for various lock types.  Unlike tracepoints in LOCKDEP and
> LOCK_STAT, it's hit only for contended locks and lock names are not
> available.  So it needs to collect stack traces and display the caller
> function instead.

Applied to tmp.perf/core, performing some further tests and then will
push to perf/core.

Thanks for your work on this!

- Arnaldo
 
> Changes in v4)
>  * add Acked-by from Ian
>  * more comments on trace_lock_handler
>  * don't create stats in the contention_end handler
>  
> Changes in v3)
>  * fix build error
>  * support data from different kernels/machines
>  * skip bad stat unless there's actual bad ones
>  
> Changes in v2)
>  * add Acked-by from Ian
>  * print time with a unit for compact output
>  * add some comments  (Ian)
>  * remove already applied patch
>  
> This patchset merely adds support for the new tracepoints to the
> existing perf lock commands.  So there's no change to the user.  Later
> I'll add new a sub-command dedicated to the tracepoints to make use of
> the additional information.
> 
> Example output:
> 
>   $ sudo perf lock record -a sleep 3
> 
>   $ perf lock report -F acquired,contended,avg_wait,wait_total
> 
>                   Name   acquired  contended     avg wait    total wait
> 
>    update_blocked_a...         40         40      3.61 us     144.45 us
>    kernfs_fop_open+...          5          5      3.64 us      18.18 us
>     _nohz_idle_balance          3          3      2.65 us       7.95 us
>    tick_do_update_j...          1          1      6.04 us       6.04 us
>     ep_scan_ready_list          1          1      3.93 us       3.93 us
>   ...
> 
> You can find the code in the 'perf/lock-contention-v4' branch at
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> 
> Thanks,
> Namhyung
> 
> 
> Namhyung Kim (7):
>   perf lock: Print wait times with unit
>   perf lock: Allow to use different kernel symbols
>   perf lock: Skip print_bad_events() if nothing bad
>   perf lock: Add lock contention tracepoints record support
>   perf lock: Handle lock contention tracepoints
>   perf record: Allow to specify max stack depth of fp callchain
>   perf lock: Look up callchain for the contended locks
> 
>  tools/perf/Documentation/perf-lock.txt   |   7 +
>  tools/perf/Documentation/perf-record.txt |   5 +
>  tools/perf/builtin-lock.c                | 426 ++++++++++++++++++++++-
>  tools/perf/util/callchain.c              |  18 +-
>  4 files changed, 434 insertions(+), 22 deletions(-)
> 
> 
> base-commit: 9886142c7a2226439c1e3f7d9b69f9c7094c3ef6
> -- 
> 2.36.1.476.g0c4daa206d-goog

-- 

- Arnaldo


Thread overview: 13+ messages
2022-06-15 16:32 [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
2022-06-15 16:32 ` [PATCH 1/7] perf lock: Print wait times with unit Namhyung Kim
2022-06-15 16:32 ` [PATCH 2/7] perf lock: Allow to use different kernel symbols Namhyung Kim
2022-06-15 16:32 ` [PATCH 3/7] perf lock: Skip print_bad_events() if nothing bad Namhyung Kim
2022-06-15 16:32 ` [PATCH 4/7] perf lock: Add lock contention tracepoints record support Namhyung Kim
2022-06-15 16:32 ` [PATCH 5/7] perf lock: Handle lock contention tracepoints Namhyung Kim
2022-06-15 16:32 ` [PATCH 6/7] perf record: Allow to specify max stack depth of fp callchain Namhyung Kim
2022-06-15 16:32 ` [PATCH 7/7] perf lock: Look up callchain for the contended locks Namhyung Kim
2022-06-24 16:40 ` [PATCHSET 0/7] perf lock: New lock contention tracepoints support (v4) Namhyung Kim
2022-06-24 18:59   ` Arnaldo Carvalho de Melo
2022-06-24 21:38     ` Namhyung Kim
2022-07-07 17:03       ` Namhyung Kim
2022-07-12 12:57 ` Arnaldo Carvalho de Melo
