linux-kernel.vger.kernel.org archive mirror
* [PATCHSET 0/5] perf lock: New lock contention tracepoints support (v1)
@ 2022-06-01  6:58 Namhyung Kim
  2022-06-01  6:58 ` [PATCH 1/5] perf lock: Change to synthesize task events Namhyung Kim
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-06-01  6:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

Hello,

Kernel v5.19 will have a new set of tracepoints to track lock
contention for various lock types.  Unlike the tracepoints in LOCKDEP
and LOCK_STAT, they are hit only for contended locks, and lock names
are not available.  So perf lock needs to collect stack traces and
display the caller function instead.

This patchset merely adds support for the new tracepoints to the
existing perf lock commands, so there's no user-visible change.  Later
I'll add a new sub-command dedicated to the tracepoints to make use of
the additional information.

Example output:

  $ sudo perf lock record -a sleep 3
  
  $ sudo perf lock report -F acquired,contended,avg_wait

                  Name   acquired  contended   avg wait (ns)

   update_blocked_a...         40         40            3611
   kernfs_fop_open+...          5          5            3636
    _nohz_idle_balance          3          3            2650
   tick_do_update_j...          1          1            6041
    ep_scan_ready_list          1          1            3930
  ...

You can find the code in the 'perf/lock-contention-v1' branch at

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (5):
  perf lock: Change to synthesize task events
  perf lock: Add lock contention tracepoints record support
  perf lock: Handle lock contention tracepoints
  perf record: Allow to specify max stack depth of fp callchain
  perf lock: Look up callchain for the contended locks

 tools/perf/Documentation/perf-record.txt |   5 +
 tools/perf/builtin-lock.c                | 356 ++++++++++++++++++++++-
 tools/perf/util/callchain.c              |  18 +-
 3 files changed, 364 insertions(+), 15 deletions(-)


base-commit: 9dde6cadb92b5670b23b97ec53091df0530ec38b
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/5] perf lock: Change to synthesize task events
  2022-06-01  6:58 [PATCHSET 0/5] perf lock: New lock contention tracepoints support (v1) Namhyung Kim
@ 2022-06-01  6:58 ` Namhyung Kim
  2022-06-03 19:28   ` Arnaldo Carvalho de Melo
  2022-06-01  6:58 ` [PATCH 2/5] perf lock: Add lock contention tracepoints record support Namhyung Kim
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2022-06-01  6:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

With the -t/--threads option, it needs to display task names, so
synthesize task-related events at the beginning.

Fixes: 7c3bcbdf449f ("perf lock: Add -t/--thread option for report")
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-lock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index b1200b7340a6..23a33ac15e68 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -1083,7 +1083,7 @@ static int __cmd_report(bool display_info)
 static int __cmd_record(int argc, const char **argv)
 {
 	const char *record_args[] = {
-		"record", "-R", "-m", "1024", "-c", "1", "--synth", "no",
+		"record", "-R", "-m", "1024", "-c", "1", "--synth", "task",
 	};
 	unsigned int rec_argc, i, j, ret;
 	const char **rec_argv;

base-commit: 9dde6cadb92b5670b23b97ec53091df0530ec38b
-- 
2.36.1.255.ge46751e96f-goog



* [PATCH 2/5] perf lock: Add lock contention tracepoints record support
  2022-06-01  6:58 [PATCHSET 0/5] perf lock: New lock contention tracepoints support (v1) Namhyung Kim
  2022-06-01  6:58 ` [PATCH 1/5] perf lock: Change to synthesize task events Namhyung Kim
@ 2022-06-01  6:58 ` Namhyung Kim
  2022-06-02  6:21   ` Ian Rogers
  2022-06-01  6:58 ` [PATCH 3/5] perf lock: Handle lock contention tracepoints Namhyung Kim
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2022-06-01  6:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

When the LOCKDEP and LOCK_STAT events are not available, it falls back
to recording the new lock contention tracepoints.
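The fallback hinges on checking whether each tracepoint exists on the
running kernel.  As a rough sketch (not the perf internals), a
tracepoint name like "lock:contention_begin" can be mapped to its event
directory under tracefs and probed for existence; the mount point below
is an assumption (it may also be /sys/kernel/debug/tracing) and the
function names are made up:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Map "subsys:event" to the tracefs format file path. */
static int build_tracepoint_path(const char *name, char *buf, size_t size)
{
	const char *colon = strchr(name, ':');

	if (!colon)
		return -1;

	snprintf(buf, size, "/sys/kernel/tracing/events/%.*s/%s/format",
		 (int)(colon - name), name, colon + 1);
	return 0;
}

/* A tracepoint is usable if its format file is present. */
static int tracepoint_available(const char *name)
{
	char path[256];

	if (build_tracepoint_path(name, path, sizeof(path)) < 0)
		return 0;
	return access(path, F_OK) == 0;
}
```

perf's is_valid_tracepoint() does effectively this lookup, with proper
tracefs mount-point detection.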

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-lock.c | 70 +++++++++++++++++++++++++++++++++++----
 1 file changed, 63 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 23a33ac15e68..3e3320b8cede 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -495,6 +495,12 @@ struct trace_lock_handler {
 
 	int (*release_event)(struct evsel *evsel,
 			     struct perf_sample *sample);
+
+	int (*contention_begin_event)(struct evsel *evsel,
+				      struct perf_sample *sample);
+
+	int (*contention_end_event)(struct evsel *evsel,
+				    struct perf_sample *sample);
 };
 
 static struct lock_seq_stat *get_seq(struct thread_stat *ts, u64 addr)
@@ -822,6 +828,20 @@ static int evsel__process_lock_release(struct evsel *evsel, struct perf_sample *
 	return 0;
 }
 
+static int evsel__process_contention_begin(struct evsel *evsel, struct perf_sample *sample)
+{
+	if (trace_handler->contention_begin_event)
+		return trace_handler->contention_begin_event(evsel, sample);
+	return 0;
+}
+
+static int evsel__process_contention_end(struct evsel *evsel, struct perf_sample *sample)
+{
+	if (trace_handler->contention_end_event)
+		return trace_handler->contention_end_event(evsel, sample);
+	return 0;
+}
+
 static void print_bad_events(int bad, int total)
 {
 	/* Output for debug, this have to be removed */
@@ -1023,6 +1043,11 @@ static const struct evsel_str_handler lock_tracepoints[] = {
 	{ "lock:lock_release",	 evsel__process_lock_release,   }, /* CONFIG_LOCKDEP */
 };
 
+static const struct evsel_str_handler contention_tracepoints[] = {
+	{ "lock:contention_begin", evsel__process_contention_begin, },
+	{ "lock:contention_end",   evsel__process_contention_end,   },
+};
+
 static bool force;
 
 static int __cmd_report(bool display_info)
@@ -1086,20 +1111,41 @@ static int __cmd_record(int argc, const char **argv)
 		"record", "-R", "-m", "1024", "-c", "1", "--synth", "task",
 	};
 	unsigned int rec_argc, i, j, ret;
+	unsigned int nr_tracepoints;
 	const char **rec_argv;
+	bool has_lock_stat = true;
 
 	for (i = 0; i < ARRAY_SIZE(lock_tracepoints); i++) {
 		if (!is_valid_tracepoint(lock_tracepoints[i].name)) {
-				pr_err("tracepoint %s is not enabled. "
-				       "Are CONFIG_LOCKDEP and CONFIG_LOCK_STAT enabled?\n",
-				       lock_tracepoints[i].name);
-				return 1;
+			pr_debug("tracepoint %s is not enabled. "
+				 "Are CONFIG_LOCKDEP and CONFIG_LOCK_STAT enabled?\n",
+				 lock_tracepoints[i].name);
+			has_lock_stat = false;
+			break;
+		}
+	}
+
+	if (has_lock_stat)
+		goto setup_args;
+
+	for (i = 0; i < ARRAY_SIZE(contention_tracepoints); i++) {
+		if (!is_valid_tracepoint(contention_tracepoints[i].name)) {
+			pr_err("tracepoint %s is not enabled.\n",
+			       contention_tracepoints[i].name);
+			return 1;
 		}
 	}
 
+setup_args:
 	rec_argc = ARRAY_SIZE(record_args) + argc - 1;
+
+	if (has_lock_stat)
+		nr_tracepoints = ARRAY_SIZE(lock_tracepoints);
+	else
+		nr_tracepoints = ARRAY_SIZE(contention_tracepoints);
+
 	/* factor of 2 is for -e in front of each tracepoint */
-	rec_argc += 2 * ARRAY_SIZE(lock_tracepoints);
+	rec_argc += 2 * nr_tracepoints;
 
 	rec_argv = calloc(rec_argc + 1, sizeof(char *));
 	if (!rec_argv)
@@ -1108,9 +1154,19 @@ static int __cmd_record(int argc, const char **argv)
 	for (i = 0; i < ARRAY_SIZE(record_args); i++)
 		rec_argv[i] = strdup(record_args[i]);
 
-	for (j = 0; j < ARRAY_SIZE(lock_tracepoints); j++) {
+	for (j = 0; j < nr_tracepoints; j++) {
+		const char *ev_name;
+
+		if (has_lock_stat)
+			ev_name = strdup(lock_tracepoints[j].name);
+		else
+			ev_name = strdup(contention_tracepoints[j].name);
+
+		if (!ev_name)
+			return -ENOMEM;
+
 		rec_argv[i++] = "-e";
-		rec_argv[i++] = strdup(lock_tracepoints[j].name);
+		rec_argv[i++] = ev_name;
 	}
 
 	for (j = 1; j < (unsigned int)argc; j++, i++)
-- 
2.36.1.255.ge46751e96f-goog



* [PATCH 3/5] perf lock: Handle lock contention tracepoints
  2022-06-01  6:58 [PATCHSET 0/5] perf lock: New lock contention tracepoints support (v1) Namhyung Kim
  2022-06-01  6:58 ` [PATCH 1/5] perf lock: Change to synthesize task events Namhyung Kim
  2022-06-01  6:58 ` [PATCH 2/5] perf lock: Add lock contention tracepoints record support Namhyung Kim
@ 2022-06-01  6:58 ` Namhyung Kim
  2022-06-02  6:23   ` Ian Rogers
  2022-06-01  6:58 ` [PATCH 4/5] perf record: Allow to specify max stack depth of fp callchain Namhyung Kim
  2022-06-01  6:58 ` [PATCH 5/5] perf lock: Look up callchain for the contended locks Namhyung Kim
  4 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2022-06-01  6:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

When the lock contention events are used, there's no tracking of
acquire and release.  So the state machine is simplified to use only
UNINITIALIZED -> CONTENDED -> ACQUIRED.

Note that the CONTENDED state is re-entrant since mutex locks can hit
two or more consecutive contention_begin events during optimistic
spinning and sleep.
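The simplified state machine can be sketched as below.  This is an
illustration only, not the perf internals; the struct and function
names are invented, and the real handlers also deal with broken
sequences and per-thread lookup:

```c
#include <assert.h>
#include <stdint.h>

enum seq_state { SEQ_UNINITIALIZED, SEQ_CONTENDED, SEQ_ACQUIRED };

struct seq {
	enum seq_state state;
	uint64_t prev_event_time;
	uint64_t wait_time_total;
	unsigned int nr_contended;
	unsigned int nr_acquired;
};

static void contention_begin(struct seq *s, uint64_t now)
{
	/* CONTENDED is re-entrant: a second begin event (optimistic
	 * spin followed by sleep on a mutex) keeps the original
	 * timestamp instead of restarting the wait measurement. */
	if (s->state == SEQ_CONTENDED)
		return;
	s->state = SEQ_CONTENDED;
	s->prev_event_time = now;
	s->nr_contended++;
}

static void contention_end(struct seq *s, uint64_t now)
{
	if (s->state == SEQ_CONTENDED)
		s->wait_time_total += now - s->prev_event_time;
	s->state = SEQ_ACQUIRED;
	s->nr_acquired++;
}
```

With this shape, two consecutive begin events followed by one end event
count as a single contention whose wait time spans from the first begin.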

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-lock.c | 125 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 125 insertions(+)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 3e3320b8cede..82cbf0f2e4cd 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -789,6 +789,124 @@ static int report_lock_release_event(struct evsel *evsel,
 	return 0;
 }
 
+static int report_lock_contention_begin_event(struct evsel *evsel,
+					      struct perf_sample *sample)
+{
+	struct lock_stat *ls;
+	struct thread_stat *ts;
+	struct lock_seq_stat *seq;
+	u64 addr = evsel__intval(evsel, sample, "lock_addr");
+
+	if (show_thread_stats)
+		addr = sample->tid;
+
+	ls = lock_stat_findnew(addr, "No name");
+	if (!ls)
+		return -ENOMEM;
+
+	ts = thread_stat_findnew(sample->tid);
+	if (!ts)
+		return -ENOMEM;
+
+	seq = get_seq(ts, addr);
+	if (!seq)
+		return -ENOMEM;
+
+	switch (seq->state) {
+	case SEQ_STATE_UNINITIALIZED:
+	case SEQ_STATE_ACQUIRED:
+		break;
+	case SEQ_STATE_CONTENDED:
+		/*
+		 * It can have nested contention begin with mutex spinning,
+		 * then we would use the original contention begin event and
+		 * ignore the second one.
+		 */
+		goto end;
+	case SEQ_STATE_ACQUIRING:
+	case SEQ_STATE_READ_ACQUIRED:
+	case SEQ_STATE_RELEASED:
+		/* broken lock sequence */
+		if (!ls->broken) {
+			ls->broken = 1;
+			bad_hist[BROKEN_CONTENDED]++;
+		}
+		list_del_init(&seq->list);
+		free(seq);
+		goto end;
+	default:
+		BUG_ON("Unknown state of lock sequence found!\n");
+		break;
+	}
+
+	if (seq->state != SEQ_STATE_CONTENDED) {
+		seq->state = SEQ_STATE_CONTENDED;
+		seq->prev_event_time = sample->time;
+		ls->nr_contended++;
+	}
+end:
+	return 0;
+}
+
+static int report_lock_contention_end_event(struct evsel *evsel,
+					    struct perf_sample *sample)
+{
+	struct lock_stat *ls;
+	struct thread_stat *ts;
+	struct lock_seq_stat *seq;
+	u64 contended_term;
+	u64 addr = evsel__intval(evsel, sample, "lock_addr");
+
+	if (show_thread_stats)
+		addr = sample->tid;
+
+	ls = lock_stat_findnew(addr, "No name");
+	if (!ls)
+		return -ENOMEM;
+
+	ts = thread_stat_findnew(sample->tid);
+	if (!ts)
+		return -ENOMEM;
+
+	seq = get_seq(ts, addr);
+	if (!seq)
+		return -ENOMEM;
+
+	switch (seq->state) {
+	case SEQ_STATE_UNINITIALIZED:
+		goto end;
+	case SEQ_STATE_CONTENDED:
+		contended_term = sample->time - seq->prev_event_time;
+		ls->wait_time_total += contended_term;
+		if (contended_term < ls->wait_time_min)
+			ls->wait_time_min = contended_term;
+		if (ls->wait_time_max < contended_term)
+			ls->wait_time_max = contended_term;
+		break;
+	case SEQ_STATE_ACQUIRING:
+	case SEQ_STATE_ACQUIRED:
+	case SEQ_STATE_READ_ACQUIRED:
+	case SEQ_STATE_RELEASED:
+		/* broken lock sequence */
+		if (!ls->broken) {
+			ls->broken = 1;
+			bad_hist[BROKEN_CONTENDED]++;
+		}
+		list_del_init(&seq->list);
+		free(seq);
+		goto end;
+	default:
+		BUG_ON("Unknown state of lock sequence found!\n");
+		break;
+	}
+
+	seq->state = SEQ_STATE_ACQUIRED;
+	ls->nr_acquired++;
+	ls->avg_wait_time = ls->wait_time_total/ls->nr_acquired;
+end:
+	return 0;
+}
+
 /* lock oriented handlers */
 /* TODO: handlers for CPU oriented, thread oriented */
 static struct trace_lock_handler report_lock_ops  = {
@@ -796,6 +914,8 @@ static struct trace_lock_handler report_lock_ops  = {
 	.acquired_event		= report_lock_acquired_event,
 	.contended_event	= report_lock_contended_event,
 	.release_event		= report_lock_release_event,
+	.contention_begin_event	= report_lock_contention_begin_event,
+	.contention_end_event	= report_lock_contention_end_event,
 };
 
 static struct trace_lock_handler *trace_handler;
@@ -1081,6 +1201,11 @@ static int __cmd_report(bool display_info)
 		goto out_delete;
 	}
 
+	if (perf_session__set_tracepoints_handlers(session, contention_tracepoints)) {
+		pr_err("Initializing perf session tracepoint handlers failed\n");
+		goto out_delete;
+	}
+
 	if (setup_output_field(output_fields))
 		goto out_delete;
 
-- 
2.36.1.255.ge46751e96f-goog



* [PATCH 4/5] perf record: Allow to specify max stack depth of fp callchain
  2022-06-01  6:58 [PATCHSET 0/5] perf lock: New lock contention tracepoints support (v1) Namhyung Kim
                   ` (2 preceding siblings ...)
  2022-06-01  6:58 ` [PATCH 3/5] perf lock: Handle lock contention tracepoints Namhyung Kim
@ 2022-06-01  6:58 ` Namhyung Kim
  2022-06-02  6:29   ` Ian Rogers
  2022-06-01  6:58 ` [PATCH 5/5] perf lock: Look up callchain for the contended locks Namhyung Kim
  4 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2022-06-01  6:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

Currently there is no interface to specify the max stack depth for
perf record.  Extend the command line parameter to accept a number
after 'fp' to specify the depth, like '--call-graph fp,32'.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-record.txt |  5 +++++
 tools/perf/util/callchain.c              | 18 ++++++++++++------
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index b4e9ef7edfef..d2759e2ccf9e 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -275,6 +275,11 @@ OPTIONS
 	User can change the size by passing the size after comma like
 	"--call-graph dwarf,4096".
 
+	When "fp" recording is used, perf tries to save stack entries
+	up to the number specified in sysctl.kernel.perf_event_max_stack
+	by default.  User can change the number by passing it after comma
+	like "--call-graph fp,32".
+
 -q::
 --quiet::
 	Don't print any message, useful for scripting.
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 5c27a4b2e7a7..7e663673f79f 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -31,6 +31,7 @@
 #include "callchain.h"
 #include "branch.h"
 #include "symbol.h"
+#include "util.h"
 #include "../perf.h"
 
 #define CALLCHAIN_PARAM_DEFAULT			\
@@ -266,12 +267,17 @@ int parse_callchain_record(const char *arg, struct callchain_param *param)
 	do {
 		/* Framepointer style */
 		if (!strncmp(name, "fp", sizeof("fp"))) {
-			if (!strtok_r(NULL, ",", &saveptr)) {
-				param->record_mode = CALLCHAIN_FP;
-				ret = 0;
-			} else
-				pr_err("callchain: No more arguments "
-				       "needed for --call-graph fp\n");
+			ret = 0;
+			param->record_mode = CALLCHAIN_FP;
+
+			tok = strtok_r(NULL, ",", &saveptr);
+			if (tok) {
+				unsigned long size;
+
+				size = strtoul(tok, &name, 0);
+				if (size < (unsigned) sysctl__max_stack())
+					param->max_stack = size;
+			}
 			break;
 
 		/* Dwarf style */
-- 
2.36.1.255.ge46751e96f-goog



* [PATCH 5/5] perf lock: Look up callchain for the contended locks
  2022-06-01  6:58 [PATCHSET 0/5] perf lock: New lock contention tracepoints support (v1) Namhyung Kim
                   ` (3 preceding siblings ...)
  2022-06-01  6:58 ` [PATCH 4/5] perf record: Allow to specify max stack depth of fp callchain Namhyung Kim
@ 2022-06-01  6:58 ` Namhyung Kim
  2022-06-02  6:31   ` Ian Rogers
  4 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2022-06-01  6:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso

The lock contention tracepoints don't provide lock names.  All we can
do is get stack traces and show the caller instead.  To minimize the
overhead, collection is limited to 8 stack trace entries, and the
first non-lock function symbol name is displayed as the caller.

  $ perf lock report -F acquired,contended,avg_wait

                  Name   acquired  contended   avg wait (ns)

   update_blocked_a...         40         40            3611
   kernfs_fop_open+...          5          5            3636
    _nohz_idle_balance          3          3            2650
   tick_do_update_j...          1          1            6041
    ep_scan_ready_list          1          1            3930
  ...
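Conceptually, the caller lookup walks the callchain from the innermost
frame and returns the first address that falls outside the lock/sched
text sections.  A self-contained sketch under invented address ranges
(perf resolves the real __sched_text_start/__lock_text_start kernel
symbols instead):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct text_range { uint64_t start, end; };

static int in_ranges(uint64_t addr, const struct text_range *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (r[i].start <= addr && addr < r[i].end)
			return 1;
	return 0;
}

/* Return the first frame outside the given text ranges, or 0 when
 * every frame is a lock function. */
static uint64_t first_non_lock_caller(const uint64_t *chain, size_t len,
				      const struct text_range *r, size_t n)
{
	for (size_t i = 0; i < len; i++)
		if (!in_ranges(chain[i], r, n))
			return chain[i];
	return 0;
}
```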

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-lock.c | 165 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 161 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 82cbf0f2e4cd..46a5b3b50f0b 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -9,6 +9,7 @@
 #include "util/symbol.h"
 #include "util/thread.h"
 #include "util/header.h"
+#include "util/callchain.h"
 
 #include <subcmd/pager.h>
 #include <subcmd/parse-options.h>
@@ -19,6 +20,7 @@
 #include "util/tool.h"
 #include "util/data.h"
 #include "util/string2.h"
+#include "util/map.h"
 
 #include <sys/types.h>
 #include <sys/prctl.h>
@@ -32,6 +34,7 @@
 #include <linux/kernel.h>
 #include <linux/zalloc.h>
 #include <linux/err.h>
+#include <linux/stringify.h>
 
 static struct perf_session *session;
 
@@ -120,6 +123,17 @@ static struct rb_root		thread_stats;
 static bool combine_locks;
 static bool show_thread_stats;
 
+/*
+ * CONTENTION_STACK_DEPTH
+ * Number of stack trace entries to find callers
+ */
+#define CONTENTION_STACK_DEPTH  8
+
+static u64 sched_text_start;
+static u64 sched_text_end;
+static u64 lock_text_start;
+static u64 lock_text_end;
+
 static struct thread_stat *thread_stat_find(u32 tid)
 {
 	struct rb_node *node;
@@ -451,6 +465,18 @@ static struct lock_stat *pop_from_result(void)
 	return container_of(node, struct lock_stat, rb);
 }
 
+static struct lock_stat *lock_stat_find(u64 addr)
+{
+	struct hlist_head *entry = lockhashentry(addr);
+	struct lock_stat *ret;
+
+	hlist_for_each_entry(ret, entry, hash_entry) {
+		if (ret->addr == addr)
+			return ret;
+	}
+	return NULL;
+}
+
 static struct lock_stat *lock_stat_findnew(u64 addr, const char *name)
 {
 	struct hlist_head *entry = lockhashentry(addr);
@@ -789,6 +815,116 @@ static int report_lock_release_event(struct evsel *evsel,
 	return 0;
 }
 
+static bool is_lock_function(u64 addr)
+{
+	if (!sched_text_start) {
+		struct machine *machine = &session->machines.host;
+		struct map *kmap;
+		struct symbol *sym;
+
+		sym = machine__find_kernel_symbol_by_name(machine,
+							  "__sched_text_start",
+							  &kmap);
+		if (!sym) {
+			/* to avoid retry */
+			sched_text_start = 1;
+			return false;
+		}
+
+		sched_text_start = kmap->unmap_ip(kmap, sym->start);
+
+		/* should not fail from here */
+		sym = machine__find_kernel_symbol_by_name(machine,
+							  "__sched_text_end",
+							  &kmap);
+		sched_text_end = kmap->unmap_ip(kmap, sym->start);
+
+		sym = machine__find_kernel_symbol_by_name(machine,
+							  "__lock_text_start",
+							  &kmap);
+		lock_text_start = kmap->unmap_ip(kmap, sym->start);
+
+		sym = machine__find_kernel_symbol_by_name(machine,
+							  "__lock_text_end",
+							  &kmap);
+	lock_text_end = kmap->unmap_ip(kmap, sym->start);
+	}
+
+	/* failed to get kernel symbols */
+	if (sched_text_start == 1)
+		return false;
+
+	/* mutex and rwsem functions are in sched text section */
+	if (sched_text_start <= addr && addr < sched_text_end)
+		return true;
+
+	/* spinlock functions are in lock text section */
+	if (lock_text_start <= addr && addr < lock_text_end)
+		return true;
+
+	return false;
+}
+
+static int lock_contention_caller(struct evsel *evsel, struct perf_sample *sample,
+				  char *buf, int size)
+{
+	struct thread *thread;
+	struct callchain_cursor *cursor = &callchain_cursor;
+	struct symbol *sym;
+	int skip = 0;
+	int ret;
+
+	/* lock names will be replaced to task name later */
+	if (show_thread_stats)
+		return -1;
+
+	thread = machine__findnew_thread(&session->machines.host,
+					 -1, sample->pid);
+	if (thread == NULL)
+		return -1;
+
+	/* use caller function name from the callchain */
+	ret = thread__resolve_callchain(thread, cursor, evsel, sample,
+					NULL, NULL, CONTENTION_STACK_DEPTH);
+	if (ret != 0) {
+		thread__put(thread);
+		return -1;
+	}
+
+	callchain_cursor_commit(cursor);
+	thread__put(thread);
+
+	while (true) {
+		struct callchain_cursor_node *node;
+
+		node = callchain_cursor_current(cursor);
+		if (node == NULL)
+			break;
+
+		/* skip first 3 entries - for lock functions */
+		if (++skip <= 3)
+			goto next;
+
+		sym = node->ms.sym;
+		if (sym && !is_lock_function(node->ip)) {
+			struct map *map = node->ms.map;
+			u64 offset;
+
+			offset = map->map_ip(map, node->ip) - sym->start;
+
+			if (offset)
+				scnprintf(buf, size, "%s+%#lx", sym->name, offset);
+			else
+				strlcpy(buf, sym->name, size);
+			return 0;
+		}
+
+next:
+		callchain_cursor_advance(cursor);
+	}
+	return -1;
+}
+
 static int report_lock_contention_begin_event(struct evsel *evsel,
 					      struct perf_sample *sample)
 {
@@ -800,9 +936,18 @@ static int report_lock_contention_begin_event(struct evsel *evsel,
 	if (show_thread_stats)
 		addr = sample->tid;
 
-	ls = lock_stat_findnew(addr, "No name");
-	if (!ls)
-		return -ENOMEM;
+	ls = lock_stat_find(addr);
+	if (!ls) {
+		char buf[128];
+		const char *caller = buf;
+
+		if (lock_contention_caller(evsel, sample, buf, sizeof(buf)) < 0)
+			caller = "Unknown";
+
+		ls = lock_stat_findnew(addr, caller);
+		if (!ls)
+			return -ENOMEM;
+	}
 
 	ts = thread_stat_findnew(sample->tid);
 	if (!ts)
@@ -1176,6 +1321,7 @@ static int __cmd_report(bool display_info)
 	struct perf_tool eops = {
 		.sample		 = process_sample_event,
 		.comm		 = perf_event__process_comm,
+		.mmap		 = perf_event__process_mmap,
 		.namespaces	 = perf_event__process_namespaces,
 		.ordered_events	 = true,
 	};
@@ -1191,6 +1337,8 @@ static int __cmd_report(bool display_info)
 		return PTR_ERR(session);
 	}
 
+	/* for lock function check */
+	symbol_conf.sort_by_name = true;
 	symbol__init(&session->header.env);
 
 	if (!perf_session__has_traces(session, "lock record"))
@@ -1235,8 +1383,12 @@ static int __cmd_record(int argc, const char **argv)
 	const char *record_args[] = {
 		"record", "-R", "-m", "1024", "-c", "1", "--synth", "task",
 	};
+	const char *callgraph_args[] = {
+		"--call-graph", "fp," __stringify(CONTENTION_STACK_DEPTH),
+	};
 	unsigned int rec_argc, i, j, ret;
 	unsigned int nr_tracepoints;
+	unsigned int nr_callgraph_args = 0;
 	const char **rec_argv;
 	bool has_lock_stat = true;
 
@@ -1261,8 +1413,10 @@ static int __cmd_record(int argc, const char **argv)
 		}
 	}
 
+	nr_callgraph_args = ARRAY_SIZE(callgraph_args);
+
 setup_args:
-	rec_argc = ARRAY_SIZE(record_args) + argc - 1;
+	rec_argc = ARRAY_SIZE(record_args) + nr_callgraph_args + argc - 1;
 
 	if (has_lock_stat)
 		nr_tracepoints = ARRAY_SIZE(lock_tracepoints);
@@ -1294,6 +1448,9 @@ static int __cmd_record(int argc, const char **argv)
 		rec_argv[i++] = ev_name;
 	}
 
+	for (j = 0; j < nr_callgraph_args; j++, i++)
+		rec_argv[i] = callgraph_args[j];
+
 	for (j = 1; j < (unsigned int)argc; j++, i++)
 		rec_argv[i] = argv[j];
 
-- 
2.36.1.255.ge46751e96f-goog



* Re: [PATCH 2/5] perf lock: Add lock contention tracepoints record support
  2022-06-01  6:58 ` [PATCH 2/5] perf lock: Add lock contention tracepoints record support Namhyung Kim
@ 2022-06-02  6:21   ` Ian Rogers
  2022-06-02 20:06     ` Namhyung Kim
  0 siblings, 1 reply; 13+ messages in thread
From: Ian Rogers @ 2022-06-02  6:21 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
	Davidlohr Bueso

On Tue, May 31, 2022 at 11:58 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> When LOCKDEP and LOCK_STAT events are not available, it falls back to
> record the new lock contention tracepoints.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/builtin-lock.c | 70 +++++++++++++++++++++++++++++++++++----
>  1 file changed, 63 insertions(+), 7 deletions(-)
>
> diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
> index 23a33ac15e68..3e3320b8cede 100644
> --- a/tools/perf/builtin-lock.c
> +++ b/tools/perf/builtin-lock.c
> @@ -495,6 +495,12 @@ struct trace_lock_handler {
>
>         int (*release_event)(struct evsel *evsel,
>                              struct perf_sample *sample);
> +
> +       int (*contention_begin_event)(struct evsel *evsel,
> +                                     struct perf_sample *sample);
> +
> +       int (*contention_end_event)(struct evsel *evsel,
> +                                   struct perf_sample *sample);

Would it make sense to add a comment here about LOCKDEP and LOCK_STAT?
It could be confusing if the handler isn't called on different
kernels.

Thanks,
Ian

>  };
>
>  static struct lock_seq_stat *get_seq(struct thread_stat *ts, u64 addr)
> @@ -822,6 +828,20 @@ static int evsel__process_lock_release(struct evsel *evsel, struct perf_sample *
>         return 0;
>  }
>
> +static int evsel__process_contention_begin(struct evsel *evsel, struct perf_sample *sample)
> +{
> +       if (trace_handler->contention_begin_event)
> +               return trace_handler->contention_begin_event(evsel, sample);
> +       return 0;
> +}
> +
> +static int evsel__process_contention_end(struct evsel *evsel, struct perf_sample *sample)
> +{
> +       if (trace_handler->contention_end_event)
> +               return trace_handler->contention_end_event(evsel, sample);
> +       return 0;
> +}
> +
>  static void print_bad_events(int bad, int total)
>  {
>         /* Output for debug, this have to be removed */
> @@ -1023,6 +1043,11 @@ static const struct evsel_str_handler lock_tracepoints[] = {
>         { "lock:lock_release",   evsel__process_lock_release,   }, /* CONFIG_LOCKDEP */
>  };
>
> +static const struct evsel_str_handler contention_tracepoints[] = {
> +       { "lock:contention_begin", evsel__process_contention_begin, },
> +       { "lock:contention_end",   evsel__process_contention_end,   },
> +};
> +
>  static bool force;
>
>  static int __cmd_report(bool display_info)
> @@ -1086,20 +1111,41 @@ static int __cmd_record(int argc, const char **argv)
>                 "record", "-R", "-m", "1024", "-c", "1", "--synth", "task",
>         };
>         unsigned int rec_argc, i, j, ret;
> +       unsigned int nr_tracepoints;
>         const char **rec_argv;
> +       bool has_lock_stat = true;
>
>         for (i = 0; i < ARRAY_SIZE(lock_tracepoints); i++) {
>                 if (!is_valid_tracepoint(lock_tracepoints[i].name)) {
> -                               pr_err("tracepoint %s is not enabled. "
> -                                      "Are CONFIG_LOCKDEP and CONFIG_LOCK_STAT enabled?\n",
> -                                      lock_tracepoints[i].name);
> -                               return 1;
> +                       pr_debug("tracepoint %s is not enabled. "
> +                                "Are CONFIG_LOCKDEP and CONFIG_LOCK_STAT enabled?\n",
> +                                lock_tracepoints[i].name);
> +                       has_lock_stat = false;
> +                       break;
> +               }
> +       }
> +
> +       if (has_lock_stat)
> +               goto setup_args;
> +
> +       for (i = 0; i < ARRAY_SIZE(contention_tracepoints); i++) {
> +               if (!is_valid_tracepoint(contention_tracepoints[i].name)) {
> +                       pr_err("tracepoint %s is not enabled.\n",
> +                              contention_tracepoints[i].name);
> +                       return 1;
>                 }
>         }
>
> +setup_args:
>         rec_argc = ARRAY_SIZE(record_args) + argc - 1;
> +
> +       if (has_lock_stat)
> +               nr_tracepoints = ARRAY_SIZE(lock_tracepoints);
> +       else
> +               nr_tracepoints = ARRAY_SIZE(contention_tracepoints);
> +
>         /* factor of 2 is for -e in front of each tracepoint */
> -       rec_argc += 2 * ARRAY_SIZE(lock_tracepoints);
> +       rec_argc += 2 * nr_tracepoints;
>
>         rec_argv = calloc(rec_argc + 1, sizeof(char *));
>         if (!rec_argv)
> @@ -1108,9 +1154,19 @@ static int __cmd_record(int argc, const char **argv)
>         for (i = 0; i < ARRAY_SIZE(record_args); i++)
>                 rec_argv[i] = strdup(record_args[i]);
>
> -       for (j = 0; j < ARRAY_SIZE(lock_tracepoints); j++) {
> +       for (j = 0; j < nr_tracepoints; j++) {
> +               const char *ev_name;
> +
> +               if (has_lock_stat)
> +                       ev_name = strdup(lock_tracepoints[j].name);
> +               else
> +                       ev_name = strdup(contention_tracepoints[j].name);
> +
> +               if (!ev_name)
> +                       return -ENOMEM;
> +
>                 rec_argv[i++] = "-e";
> -               rec_argv[i++] = strdup(lock_tracepoints[j].name);
> +               rec_argv[i++] = ev_name;
>         }
>
>         for (j = 1; j < (unsigned int)argc; j++, i++)
> --
> 2.36.1.255.ge46751e96f-goog
>


* Re: [PATCH 3/5] perf lock: Handle lock contention tracepoints
  2022-06-01  6:58 ` [PATCH 3/5] perf lock: Handle lock contention tracepoints Namhyung Kim
@ 2022-06-02  6:23   ` Ian Rogers
  0 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2022-06-02  6:23 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
	Davidlohr Bueso

On Tue, May 31, 2022 at 11:58 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> When the lock contention events are used, there's no tracking of
> acquire and release.  So the state machine is simplified to use
> UNINITIALIZED -> CONTENDED -> ACQUIRED only.
>
> Note that CONTENDED state is re-entrant since mutex locks can hit two
> or more consecutive contention_begin events for optimistic spinning
> and sleep.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Acked-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

> ---
>  tools/perf/builtin-lock.c | 125 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 125 insertions(+)
>
> diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
> index 3e3320b8cede..82cbf0f2e4cd 100644
> --- a/tools/perf/builtin-lock.c
> +++ b/tools/perf/builtin-lock.c
> @@ -789,6 +789,124 @@ static int report_lock_release_event(struct evsel *evsel,
>         return 0;
>  }
>
> +static int report_lock_contention_begin_event(struct evsel *evsel,
> +                                             struct perf_sample *sample)
> +{
> +       struct lock_stat *ls;
> +       struct thread_stat *ts;
> +       struct lock_seq_stat *seq;
> +       u64 addr = evsel__intval(evsel, sample, "lock_addr");
> +
> +       if (show_thread_stats)
> +               addr = sample->tid;
> +
> +       ls = lock_stat_findnew(addr, "No name");
> +       if (!ls)
> +               return -ENOMEM;
> +
> +       ts = thread_stat_findnew(sample->tid);
> +       if (!ts)
> +               return -ENOMEM;
> +
> +       seq = get_seq(ts, addr);
> +       if (!seq)
> +               return -ENOMEM;
> +
> +       switch (seq->state) {
> +       case SEQ_STATE_UNINITIALIZED:
> +       case SEQ_STATE_ACQUIRED:
> +               break;
> +       case SEQ_STATE_CONTENDED:
> +               /*
> +                * It can have nested contention begin with mutex spinning,
> +                * then we would use the original contention begin event and
> +                * ignore the second one.
> +                */
> +               goto end;
> +       case SEQ_STATE_ACQUIRING:
> +       case SEQ_STATE_READ_ACQUIRED:
> +       case SEQ_STATE_RELEASED:
> +               /* broken lock sequence */
> +               if (!ls->broken) {
> +                       ls->broken = 1;
> +                       bad_hist[BROKEN_CONTENDED]++;
> +               }
> +               list_del_init(&seq->list);
> +               free(seq);
> +               goto end;
> +       default:
> +               BUG_ON("Unknown state of lock sequence found!\n");
> +               break;
> +       }
> +
> +       if (seq->state != SEQ_STATE_CONTENDED) {
> +               seq->state = SEQ_STATE_CONTENDED;
> +               seq->prev_event_time = sample->time;
> +               ls->nr_contended++;
> +       }
> +end:
> +       return 0;
> +}
> +
> +static int report_lock_contention_end_event(struct evsel *evsel,
> +                                           struct perf_sample *sample)
> +{
> +       struct lock_stat *ls;
> +       struct thread_stat *ts;
> +       struct lock_seq_stat *seq;
> +       u64 contended_term;
> +       u64 addr = evsel__intval(evsel, sample, "lock_addr");
> +
> +       if (show_thread_stats)
> +               addr = sample->tid;
> +
> +       ls = lock_stat_findnew(addr, "No name");
> +       if (!ls)
> +               return -ENOMEM;
> +
> +       ts = thread_stat_findnew(sample->tid);
> +       if (!ts)
> +               return -ENOMEM;
> +
> +       seq = get_seq(ts, addr);
> +       if (!seq)
> +               return -ENOMEM;
> +
> +       switch (seq->state) {
> +       case SEQ_STATE_UNINITIALIZED:
> +               goto end;
> +       case SEQ_STATE_CONTENDED:
> +               contended_term = sample->time - seq->prev_event_time;
> +               ls->wait_time_total += contended_term;
> +               if (contended_term < ls->wait_time_min)
> +                       ls->wait_time_min = contended_term;
> +               if (ls->wait_time_max < contended_term)
> +                       ls->wait_time_max = contended_term;
> +               break;
> +       case SEQ_STATE_ACQUIRING:
> +       case SEQ_STATE_ACQUIRED:
> +       case SEQ_STATE_READ_ACQUIRED:
> +       case SEQ_STATE_RELEASED:
> +               /* broken lock sequence */
> +               if (!ls->broken) {
> +                       ls->broken = 1;
> +                       bad_hist[BROKEN_CONTENDED]++;
> +               }
> +               list_del_init(&seq->list);
> +               free(seq);
> +               goto end;
> +       default:
> +               BUG_ON("Unknown state of lock sequence found!\n");
> +               break;
> +       }
> +
> +       seq->state = SEQ_STATE_ACQUIRED;
> +       ls->nr_acquired++;
> +       ls->avg_wait_time = ls->wait_time_total/ls->nr_acquired;
> +end:
> +       return 0;
> +}
> +
>  /* lock oriented handlers */
>  /* TODO: handlers for CPU oriented, thread oriented */
>  static struct trace_lock_handler report_lock_ops  = {
> @@ -796,6 +914,8 @@ static struct trace_lock_handler report_lock_ops  = {
>         .acquired_event         = report_lock_acquired_event,
>         .contended_event        = report_lock_contended_event,
>         .release_event          = report_lock_release_event,
> +       .contention_begin_event = report_lock_contention_begin_event,
> +       .contention_end_event   = report_lock_contention_end_event,
>  };
>
>  static struct trace_lock_handler *trace_handler;
> @@ -1081,6 +1201,11 @@ static int __cmd_report(bool display_info)
>                 goto out_delete;
>         }
>
> +       if (perf_session__set_tracepoints_handlers(session, contention_tracepoints)) {
> +               pr_err("Initializing perf session tracepoint handlers failed\n");
> +               goto out_delete;
> +       }
> +
>         if (setup_output_field(output_fields))
>                 goto out_delete;
>
> --
> 2.36.1.255.ge46751e96f-goog
>

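The reduced state machine this patch implements (UNINITIALIZED -> CONTENDED -> ACQUIRED, with CONTENDED re-entrant so nested contention_begin events from optimistic spinning are ignored) can be sketched standalone like this — simplified types and hypothetical names, not the real perf code:

```c
#include <assert.h>

enum seq_state { SEQ_STATE_UNINITIALIZED, SEQ_STATE_CONTENDED, SEQ_STATE_ACQUIRED };

struct lock_stat {
	enum seq_state state;
	unsigned int nr_contended, nr_acquired;
	unsigned long long wait_time_total, avg_wait_time, prev_event_time;
};

/* contention_begin: only the first of consecutive begins starts the window */
static void contention_begin(struct lock_stat *ls, unsigned long long t)
{
	if (ls->state != SEQ_STATE_CONTENDED) {
		ls->state = SEQ_STATE_CONTENDED;
		ls->prev_event_time = t;
		ls->nr_contended++;
	}
}

/* contention_end: close the contended window and update the average */
static void contention_end(struct lock_stat *ls, unsigned long long t)
{
	if (ls->state != SEQ_STATE_CONTENDED)
		return;	/* e.g. UNINITIALIZED: end seen without a begin */

	ls->wait_time_total += t - ls->prev_event_time;
	ls->state = SEQ_STATE_ACQUIRED;
	ls->nr_acquired++;
	ls->avg_wait_time = ls->wait_time_total / ls->nr_acquired;
}
```

The real handlers also track per-thread sequences, min/max wait times, and mark broken sequences when the old acquire/release states appear.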

* Re: [PATCH 4/5] perf record: Allow to specify max stack depth of fp callchain
  2022-06-01  6:58 ` [PATCH 4/5] perf record: Allow to specify max stack depth of fp callchain Namhyung Kim
@ 2022-06-02  6:29   ` Ian Rogers
  2022-06-02 20:09     ` Namhyung Kim
  0 siblings, 1 reply; 13+ messages in thread
From: Ian Rogers @ 2022-06-02  6:29 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
	Davidlohr Bueso

On Tue, May 31, 2022 at 11:58 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Currently it has no interface to specify the max stack depth for perf
> record.  Extend the command line parameter to accept a number after
> 'fp' to specify the depth like '--call-graph fp,32'.

It is valid to have call-graph=fp as a modifier on an event. To
support a depth with that we couldn't use ',' as it is already the
separator between modifiers. For consistency maybe a different
separator should be used here?

Thanks,
Ian

>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/Documentation/perf-record.txt |  5 +++++
>  tools/perf/util/callchain.c              | 18 ++++++++++++------
>  2 files changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index b4e9ef7edfef..d2759e2ccf9e 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -275,6 +275,11 @@ OPTIONS
>         User can change the size by passing the size after comma like
>         "--call-graph dwarf,4096".
>
> +       When "fp" recording is used, perf tries to save stack entries
> +       up to the number specified in sysctl.kernel.perf_event_max_stack
> +       by default.  User can change the number by passing it after comma
> +       like "--call-graph fp,32".
> +
>  -q::
>  --quiet::
>         Don't print any message, useful for scripting.
> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
> index 5c27a4b2e7a7..7e663673f79f 100644
> --- a/tools/perf/util/callchain.c
> +++ b/tools/perf/util/callchain.c
> @@ -31,6 +31,7 @@
>  #include "callchain.h"
>  #include "branch.h"
>  #include "symbol.h"
> +#include "util.h"
>  #include "../perf.h"
>
>  #define CALLCHAIN_PARAM_DEFAULT                        \
> @@ -266,12 +267,17 @@ int parse_callchain_record(const char *arg, struct callchain_param *param)
>         do {
>                 /* Framepointer style */
>                 if (!strncmp(name, "fp", sizeof("fp"))) {
> -                       if (!strtok_r(NULL, ",", &saveptr)) {
> -                               param->record_mode = CALLCHAIN_FP;
> -                               ret = 0;
> -                       } else
> -                               pr_err("callchain: No more arguments "
> -                                      "needed for --call-graph fp\n");
> +                       ret = 0;
> +                       param->record_mode = CALLCHAIN_FP;
> +
> +                       tok = strtok_r(NULL, ",", &saveptr);
> +                       if (tok) {
> +                               unsigned long size;
> +
> +                               size = strtoul(tok, &name, 0);
> +                               if (size < (unsigned) sysctl__max_stack())
> +                                       param->max_stack = size;
> +                       }
>                         break;
>
>                 /* Dwarf style */
> --
> 2.36.1.255.ge46751e96f-goog
>


* Re: [PATCH 5/5] perf lock: Look up callchain for the contended locks
  2022-06-01  6:58 ` [PATCH 5/5] perf lock: Look up callchain for the contended locks Namhyung Kim
@ 2022-06-02  6:31   ` Ian Rogers
  0 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2022-06-02  6:31 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
	Davidlohr Bueso

On Tue, May 31, 2022 at 11:58 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> The lock contention tracepoints don't provide lock names.  All we can
> do is to get stack traces and show the caller instead.  To minimize
> the overhead it's limited to up to 8 stack traces and display the
> first non-lock function symbol name as a caller.
>
>   $ perf lock report -F acquired,contended,avg_wait
>
>                   Name   acquired  contended   avg wait (ns)
>
>    update_blocked_a...         40         40            3611
>    kernfs_fop_open+...          5          5            3636
>     _nohz_idle_balance          3          3            2650
>    tick_do_update_j...          1          1            6041
>     ep_scan_ready_list          1          1            3930
>   ...
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Acked-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

> ---
>  tools/perf/builtin-lock.c | 165 +++++++++++++++++++++++++++++++++++++-
>  1 file changed, 161 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
> index 82cbf0f2e4cd..46a5b3b50f0b 100644
> --- a/tools/perf/builtin-lock.c
> +++ b/tools/perf/builtin-lock.c
> @@ -9,6 +9,7 @@
>  #include "util/symbol.h"
>  #include "util/thread.h"
>  #include "util/header.h"
> +#include "util/callchain.h"
>
>  #include <subcmd/pager.h>
>  #include <subcmd/parse-options.h>
> @@ -19,6 +20,7 @@
>  #include "util/tool.h"
>  #include "util/data.h"
>  #include "util/string2.h"
> +#include "util/map.h"
>
>  #include <sys/types.h>
>  #include <sys/prctl.h>
> @@ -32,6 +34,7 @@
>  #include <linux/kernel.h>
>  #include <linux/zalloc.h>
>  #include <linux/err.h>
> +#include <linux/stringify.h>
>
>  static struct perf_session *session;
>
> @@ -120,6 +123,17 @@ static struct rb_root              thread_stats;
>  static bool combine_locks;
>  static bool show_thread_stats;
>
> +/*
> + * CONTENTION_STACK_DEPTH
> + * Number of stack trace entries to find callers
> + */
> +#define CONTENTION_STACK_DEPTH  8
> +
> +static u64 sched_text_start;
> +static u64 sched_text_end;
> +static u64 lock_text_start;
> +static u64 lock_text_end;
> +
>  static struct thread_stat *thread_stat_find(u32 tid)
>  {
>         struct rb_node *node;
> @@ -451,6 +465,18 @@ static struct lock_stat *pop_from_result(void)
>         return container_of(node, struct lock_stat, rb);
>  }
>
> +static struct lock_stat *lock_stat_find(u64 addr)
> +{
> +       struct hlist_head *entry = lockhashentry(addr);
> +       struct lock_stat *ret;
> +
> +       hlist_for_each_entry(ret, entry, hash_entry) {
> +               if (ret->addr == addr)
> +                       return ret;
> +       }
> +       return NULL;
> +}
> +
>  static struct lock_stat *lock_stat_findnew(u64 addr, const char *name)
>  {
>         struct hlist_head *entry = lockhashentry(addr);
> @@ -789,6 +815,116 @@ static int report_lock_release_event(struct evsel *evsel,
>         return 0;
>  }
>
> +static bool is_lock_function(u64 addr)
> +{
> +       if (!sched_text_start) {
> +               struct machine *machine = &session->machines.host;
> +               struct map *kmap;
> +               struct symbol *sym;
> +
> +               sym = machine__find_kernel_symbol_by_name(machine,
> +                                                         "__sched_text_start",
> +                                                         &kmap);
> +               if (!sym) {
> +                       /* to avoid retry */
> +                       sched_text_start = 1;
> +                       return false;
> +               }
> +
> +               sched_text_start = kmap->unmap_ip(kmap, sym->start);
> +
> +               /* should not fail from here */
> +               sym = machine__find_kernel_symbol_by_name(machine,
> +                                                         "__sched_text_end",
> +                                                         &kmap);
> +               sched_text_end = kmap->unmap_ip(kmap, sym->start);
> +
> +               sym = machine__find_kernel_symbol_by_name(machine,
> +                                                         "__lock_text_start",
> +                                                         &kmap);
> +               lock_text_start = kmap->unmap_ip(kmap, sym->start);
> +
> +               sym = machine__find_kernel_symbol_by_name(machine,
> +                                                         "__lock_text_end",
> +                                                         &kmap);
> +               lock_text_end = kmap->unmap_ip(kmap, sym->start);
> +       }
> +
> +       /* failed to get kernel symbols */
> +       if (sched_text_start == 1)
> +               return false;
> +
> +       /* mutex and rwsem functions are in sched text section */
> +       if (sched_text_start <= addr && addr < sched_text_end)
> +               return true;
> +
> +       /* spinlock functions are in lock text section */
> +       if (lock_text_start <= addr && addr < lock_text_end)
> +               return true;
> +
> +       return false;
> +}
> +
> +static int lock_contention_caller(struct evsel *evsel, struct perf_sample *sample,
> +                                 char *buf, int size)
> +{
> +       struct thread *thread;
> +       struct callchain_cursor *cursor = &callchain_cursor;
> +       struct symbol *sym;
> +       int skip = 0;
> +       int ret;
> +
> +       /* lock names will be replaced to task name later */
> +       if (show_thread_stats)
> +               return -1;
> +
> +       thread = machine__findnew_thread(&session->machines.host,
> +                                        -1, sample->pid);
> +       if (thread == NULL)
> +               return -1;
> +
> +       /* use caller function name from the callchain */
> +       ret = thread__resolve_callchain(thread, cursor, evsel, sample,
> +                                       NULL, NULL, CONTENTION_STACK_DEPTH);
> +       if (ret != 0) {
> +               thread__put(thread);
> +               return -1;
> +       }
> +
> +       callchain_cursor_commit(cursor);
> +       thread__put(thread);
> +
> +       while (true) {
> +               struct callchain_cursor_node *node;
> +
> +               node = callchain_cursor_current(cursor);
> +               if (node == NULL)
> +                       break;
> +
> +               /* skip first 3 entries - for lock functions */
> +               if (++skip <= 3)
> +                       goto next;
> +
> +               sym = node->ms.sym;
> +               if (sym && !is_lock_function(node->ip)) {
> +                       struct map *map = node->ms.map;
> +                       u64 offset;
> +
> +                       offset = map->map_ip(map, node->ip) - sym->start;
> +
> +                       if (offset)
> +                               scnprintf(buf, size, "%s+%#lx", sym->name, offset);
> +                       else
> +                               strlcpy(buf, sym->name, size);
> +                       return 0;
> +               }
> +
> +next:
> +               callchain_cursor_advance(cursor);
> +       }
> +       return -1;
> +}
> +
>  static int report_lock_contention_begin_event(struct evsel *evsel,
>                                               struct perf_sample *sample)
>  {
> @@ -800,9 +936,18 @@ static int report_lock_contention_begin_event(struct evsel *evsel,
>         if (show_thread_stats)
>                 addr = sample->tid;
>
> -       ls = lock_stat_findnew(addr, "No name");
> -       if (!ls)
> -               return -ENOMEM;
> +       ls = lock_stat_find(addr);
> +       if (!ls) {
> +               char buf[128];
> +               const char *caller = buf;
> +
> +               if (lock_contention_caller(evsel, sample, buf, sizeof(buf)) < 0)
> +                       caller = "Unknown";
> +
> +               ls = lock_stat_findnew(addr, caller);
> +               if (!ls)
> +                       return -ENOMEM;
> +       }
>
>         ts = thread_stat_findnew(sample->tid);
>         if (!ts)
> @@ -1176,6 +1321,7 @@ static int __cmd_report(bool display_info)
>         struct perf_tool eops = {
>                 .sample          = process_sample_event,
>                 .comm            = perf_event__process_comm,
> +               .mmap            = perf_event__process_mmap,
>                 .namespaces      = perf_event__process_namespaces,
>                 .ordered_events  = true,
>         };
> @@ -1191,6 +1337,8 @@ static int __cmd_report(bool display_info)
>                 return PTR_ERR(session);
>         }
>
> +       /* for lock function check */
> +       symbol_conf.sort_by_name = true;
>         symbol__init(&session->header.env);
>
>         if (!perf_session__has_traces(session, "lock record"))
> @@ -1235,8 +1383,12 @@ static int __cmd_record(int argc, const char **argv)
>         const char *record_args[] = {
>                 "record", "-R", "-m", "1024", "-c", "1", "--synth", "task",
>         };
> +       const char *callgraph_args[] = {
> +               "--call-graph", "fp," __stringify(CONTENTION_STACK_DEPTH),
> +       };
>         unsigned int rec_argc, i, j, ret;
>         unsigned int nr_tracepoints;
> +       unsigned int nr_callgraph_args = 0;
>         const char **rec_argv;
>         bool has_lock_stat = true;
>
> @@ -1261,8 +1413,10 @@ static int __cmd_record(int argc, const char **argv)
>                 }
>         }
>
> +       nr_callgraph_args = ARRAY_SIZE(callgraph_args);
> +
>  setup_args:
> -       rec_argc = ARRAY_SIZE(record_args) + argc - 1;
> +       rec_argc = ARRAY_SIZE(record_args) + nr_callgraph_args + argc - 1;
>
>         if (has_lock_stat)
>                 nr_tracepoints = ARRAY_SIZE(lock_tracepoints);
> @@ -1294,6 +1448,9 @@ static int __cmd_record(int argc, const char **argv)
>                 rec_argv[i++] = ev_name;
>         }
>
> +       for (j = 0; j < nr_callgraph_args; j++, i++)
> +               rec_argv[i] = callgraph_args[j];
> +
>         for (j = 1; j < (unsigned int)argc; j++, i++)
>                 rec_argv[i] = argv[j];
>
> --
> 2.36.1.255.ge46751e96f-goog
>

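The caller lookup above walks the callchain, skips frames whose address falls in the sched/lock text sections, and formats the first other resolved symbol as "name+0xoffset". A simplified standalone sketch of that selection and formatting step (hypothetical frame type; the section-range check is reduced to a flag):

```c
#include <stdio.h>
#include <string.h>

struct frame {
	const char *sym;		/* resolved symbol name, or NULL */
	unsigned long long ip, sym_start;
	int in_lock_text;		/* stand-in for the text-section range check */
};

/* pick the first resolved, non-lock frame and format it into buf */
static int contention_caller(const struct frame *chain, int nr,
			     char *buf, size_t size)
{
	for (int i = 0; i < nr; i++) {
		const struct frame *f = &chain[i];
		unsigned long long offset;

		if (!f->sym || f->in_lock_text)
			continue;	/* unresolved, or still inside locking code */

		offset = f->ip - f->sym_start;
		if (offset)
			snprintf(buf, size, "%s+%#llx", f->sym, offset);
		else
			snprintf(buf, size, "%s", f->sym);
		return 0;
	}
	return -1;	/* no usable caller: report "Unknown" */
}
```

The real code additionally skips the first three entries outright (known lock-function frames) and limits collection to CONTENTION_STACK_DEPTH entries to bound overhead.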

* Re: [PATCH 2/5] perf lock: Add lock contention tracepoints record support
  2022-06-02  6:21   ` Ian Rogers
@ 2022-06-02 20:06     ` Namhyung Kim
  0 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-06-02 20:06 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
	Davidlohr Bueso

Hi Ian,

On Wed, Jun 1, 2022 at 11:21 PM Ian Rogers <irogers@google.com> wrote:
>
> On Tue, May 31, 2022 at 11:58 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > When LOCKDEP and LOCK_STAT events are not available, it falls back to
> > record the new lock contention tracepoints.
> >
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/perf/builtin-lock.c | 70 +++++++++++++++++++++++++++++++++++----
> >  1 file changed, 63 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
> > index 23a33ac15e68..3e3320b8cede 100644
> > --- a/tools/perf/builtin-lock.c
> > +++ b/tools/perf/builtin-lock.c
> > @@ -495,6 +495,12 @@ struct trace_lock_handler {
> >
> >         int (*release_event)(struct evsel *evsel,
> >                              struct perf_sample *sample);
> > +
> > +       int (*contention_begin_event)(struct evsel *evsel,
> > +                                     struct perf_sample *sample);
> > +
> > +       int (*contention_end_event)(struct evsel *evsel,
> > +                                   struct perf_sample *sample);
>
> Would it make sense to add a comment here about LOCKDEP and LOCK_STAT?
> It could be confusing if the handler isn't called on different
> kernels.

Sure, I'll add some comments here.

Thanks,
Namhyung


* Re: [PATCH 4/5] perf record: Allow to specify max stack depth of fp callchain
  2022-06-02  6:29   ` Ian Rogers
@ 2022-06-02 20:09     ` Namhyung Kim
  0 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2022-06-02 20:09 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
	Davidlohr Bueso

On Wed, Jun 1, 2022 at 11:30 PM Ian Rogers <irogers@google.com> wrote:
>
> On Tue, May 31, 2022 at 11:58 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Currently it has no interface to specify the max stack depth for perf
> > record.  Extend the command line parameter to accept a number after
> > 'fp' to specify the depth like '--call-graph fp,32'.
>
> It is valid to have call-graph=fp as a modifier on an event. To
> support a depth with that we couldn't use ',' as it is already the
> separator between modifiers. For consistency maybe a different
> separator should be used here?

But it's already there for dwarf stack dump size like
'--call-graph dwarf,4096'.  Maybe we can add a new config term
for the event parser.

Thanks,
Namhyung


* Re: [PATCH 1/5] perf lock: Change to synthesize task events
  2022-06-01  6:58 ` [PATCH 1/5] perf lock: Change to synthesize task events Namhyung Kim
@ 2022-06-03 19:28   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-06-03 19:28 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
	linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
	Davidlohr Bueso

Em Tue, May 31, 2022 at 11:58:42PM -0700, Namhyung Kim escreveu:
> With -t/--threads option, it needs to display task names so synthesize
> task related events at the beginning.

Cherry picked this one.
 
> Fixes: 7c3bcbdf449f ("perf lock: Add -t/--thread option for report")
> Acked-by: Ian Rogers <irogers@google.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/builtin-lock.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
> index b1200b7340a6..23a33ac15e68 100644
> --- a/tools/perf/builtin-lock.c
> +++ b/tools/perf/builtin-lock.c
> @@ -1083,7 +1083,7 @@ static int __cmd_report(bool display_info)
>  static int __cmd_record(int argc, const char **argv)
>  {
>  	const char *record_args[] = {
> -		"record", "-R", "-m", "1024", "-c", "1", "--synth", "no",
> +		"record", "-R", "-m", "1024", "-c", "1", "--synth", "task",
>  	};
>  	unsigned int rec_argc, i, j, ret;
>  	const char **rec_argv;
> 
> base-commit: 9dde6cadb92b5670b23b97ec53091df0530ec38b
> -- 
> 2.36.1.255.ge46751e96f-goog

-- 

- Arnaldo


end of thread, other threads:[~2022-06-03 19:28 UTC | newest]

Thread overview: 13+ messages
2022-06-01  6:58 [PATCHSET 0/5] perf lock: New lock contention tracepoints support (v1) Namhyung Kim
2022-06-01  6:58 ` [PATCH 1/5] perf lock: Change to synthesize task events Namhyung Kim
2022-06-03 19:28   ` Arnaldo Carvalho de Melo
2022-06-01  6:58 ` [PATCH 2/5] perf lock: Add lock contention tracepoints record support Namhyung Kim
2022-06-02  6:21   ` Ian Rogers
2022-06-02 20:06     ` Namhyung Kim
2022-06-01  6:58 ` [PATCH 3/5] perf lock: Handle lock contention tracepoints Namhyung Kim
2022-06-02  6:23   ` Ian Rogers
2022-06-01  6:58 ` [PATCH 4/5] perf record: Allow to specify max stack depth of fp callchain Namhyung Kim
2022-06-02  6:29   ` Ian Rogers
2022-06-02 20:09     ` Namhyung Kim
2022-06-01  6:58 ` [PATCH 5/5] perf lock: Look up callchain for the contended locks Namhyung Kim
2022-06-02  6:31   ` Ian Rogers
