* [PATCHSET 0/5] perf lock: Add contention subcommand (v2)
@ 2022-07-25 18:31 Namhyung Kim
2022-07-25 18:31 ` [PATCH 1/5] perf lock: Add flags field in the lock_stat Namhyung Kim
` (5 more replies)
0 siblings, 6 replies; 8+ messages in thread
From: Namhyung Kim @ 2022-07-25 18:31 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso
Hello,
It's to add a new subcommand 'contention' (shortly 'con') to perf lock.
Changes in v2)
* bugfix is merged already
* fix build error in patch 2
The new subcommand is to handle the new lock:contention_{begin,end}
tracepoints and shows lock type and caller address like below:
$ perf lock contention
contended total wait max wait avg wait type caller
238 1.41 ms 29.20 us 5.94 us spinlock update_blocked_averages+0x4c
1 902.08 us 902.08 us 902.08 us rwsem:R do_user_addr_fault+0x1dd
81 330.30 us 17.24 us 4.08 us spinlock _nohz_idle_balance+0x172
2 89.54 us 61.26 us 44.77 us spinlock do_anonymous_page+0x16d
24 78.36 us 12.27 us 3.27 us mutex pipe_read+0x56
2 71.58 us 59.56 us 35.79 us spinlock __handle_mm_fault+0x6aa
6 25.68 us 6.89 us 4.28 us spinlock do_idle+0x28d
1 18.46 us 18.46 us 18.46 us rtmutex exec_fw_cmd+0x21b
3 15.25 us 6.26 us 5.08 us spinlock tick_do_update_jiffies64+0x2c
...
where rwsem:R stands for read access (down_read) for a rw-semaphore.
Other types of lock access is obvious and it doesn't detect optimistic
spinning on mutex yet. This is just a base work for lock contention
analysis and more will come later.
You can get this from 'perf/lock-subcmd-v2' branch on
git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
Thanks,
Namhyung
Namhyung Kim (5):
perf lock: Add flags field in the lock_stat
perf lock: Add lock aggregation enum
perf lock: Add 'contention' subcommand
perf lock: Add -k and -F options to 'contention' subcommand
perf lock: Support -t option for 'contention' subcommand
tools/perf/Documentation/perf-lock.txt | 23 +-
tools/perf/builtin-lock.c | 401 +++++++++++++++++++++++--
2 files changed, 390 insertions(+), 34 deletions(-)
base-commit: 9fe9b252c7c022df8e503435e778f15c04dfa3bf
--
2.37.1.359.gd136c6c3e2-goog
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/5] perf lock: Add flags field in the lock_stat
2022-07-25 18:31 [PATCHSET 0/5] perf lock: Add contention subcommand (v2) Namhyung Kim
@ 2022-07-25 18:31 ` Namhyung Kim
2022-07-25 20:53 ` Arnaldo Carvalho de Melo
2022-07-25 18:31 ` [PATCH 2/5] perf lock: Add lock aggregation enum Namhyung Kim
` (4 subsequent siblings)
5 siblings, 1 reply; 8+ messages in thread
From: Namhyung Kim @ 2022-07-25 18:31 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso
For lock contention tracepoint analysis, it needs to keep the flags.
As nr_readlock and nr_trylock fields are not used for it, let's make
it a union.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-lock.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 0aae88fdf93a..1de459198b99 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -59,7 +59,10 @@ struct lock_stat {
unsigned int nr_contended;
unsigned int nr_release;
- unsigned int nr_readlock;
+ union {
+ unsigned int nr_readlock;
+ unsigned int flags;
+ };
unsigned int nr_trylock;
/* these times are in nano sec. */
@@ -516,7 +519,7 @@ static struct lock_stat *lock_stat_find(u64 addr)
return NULL;
}
-static struct lock_stat *lock_stat_findnew(u64 addr, const char *name)
+static struct lock_stat *lock_stat_findnew(u64 addr, const char *name, int flags)
{
struct hlist_head *entry = lockhashentry(addr);
struct lock_stat *ret, *new;
@@ -531,13 +534,13 @@ static struct lock_stat *lock_stat_findnew(u64 addr, const char *name)
goto alloc_failed;
new->addr = addr;
- new->name = zalloc(sizeof(char) * strlen(name) + 1);
+ new->name = strdup(name);
if (!new->name) {
free(new);
goto alloc_failed;
}
- strcpy(new->name, name);
+ new->flags = flags;
new->wait_time_min = ULLONG_MAX;
hlist_add_head(&new->hash_entry, entry);
@@ -624,7 +627,7 @@ static int report_lock_acquire_event(struct evsel *evsel,
if (show_thread_stats)
addr = sample->tid;
- ls = lock_stat_findnew(addr, name);
+ ls = lock_stat_findnew(addr, name, 0);
if (!ls)
return -ENOMEM;
@@ -696,7 +699,7 @@ static int report_lock_acquired_event(struct evsel *evsel,
if (show_thread_stats)
addr = sample->tid;
- ls = lock_stat_findnew(addr, name);
+ ls = lock_stat_findnew(addr, name, 0);
if (!ls)
return -ENOMEM;
@@ -758,7 +761,7 @@ static int report_lock_contended_event(struct evsel *evsel,
if (show_thread_stats)
addr = sample->tid;
- ls = lock_stat_findnew(addr, name);
+ ls = lock_stat_findnew(addr, name, 0);
if (!ls)
return -ENOMEM;
@@ -813,7 +816,7 @@ static int report_lock_release_event(struct evsel *evsel,
if (show_thread_stats)
addr = sample->tid;
- ls = lock_stat_findnew(addr, name);
+ ls = lock_stat_findnew(addr, name, 0);
if (!ls)
return -ENOMEM;
@@ -985,11 +988,12 @@ static int report_lock_contention_begin_event(struct evsel *evsel,
if (!ls) {
char buf[128];
const char *caller = buf;
+ unsigned int flags = evsel__intval(evsel, sample, "flags");
if (lock_contention_caller(evsel, sample, buf, sizeof(buf)) < 0)
caller = "Unknown";
- ls = lock_stat_findnew(addr, caller);
+ ls = lock_stat_findnew(addr, caller, flags);
if (!ls)
return -ENOMEM;
}
--
2.37.1.359.gd136c6c3e2-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/5] perf lock: Add lock aggregation enum
2022-07-25 18:31 [PATCHSET 0/5] perf lock: Add contention subcommand (v2) Namhyung Kim
2022-07-25 18:31 ` [PATCH 1/5] perf lock: Add flags field in the lock_stat Namhyung Kim
@ 2022-07-25 18:31 ` Namhyung Kim
2022-07-25 18:31 ` [PATCH 3/5] perf lock: Add 'contention' subcommand Namhyung Kim
` (3 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Namhyung Kim @ 2022-07-25 18:31 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso
Introduce the aggr_mode variable to prepare the later code change.
The default is LOCK_AGGR_ADDR which aggregate the result for the lock
instances. When -t/--threads option is given, it'd be set to
LOCK_AGGR_TASK. The LOCK_AGGR_CALLER is for the contention analysis
and it'd aggregate the stat by comparing the callstacks.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-lock.c | 114 +++++++++++++++++++++++++++++++-------
1 file changed, 94 insertions(+), 20 deletions(-)
diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 1de459198b99..b4e7a6e5c54e 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -126,6 +126,12 @@ static struct rb_root thread_stats;
static bool combine_locks;
static bool show_thread_stats;
+static enum {
+ LOCK_AGGR_ADDR,
+ LOCK_AGGR_TASK,
+ LOCK_AGGR_CALLER,
+} aggr_mode = LOCK_AGGR_ADDR;
+
/*
* CONTENTION_STACK_DEPTH
* Number of stack trace entries to find callers
@@ -622,12 +628,22 @@ static int report_lock_acquire_event(struct evsel *evsel,
const char *name = evsel__strval(evsel, sample, "name");
u64 addr = evsel__intval(evsel, sample, "lockdep_addr");
int flag = evsel__intval(evsel, sample, "flags");
+ u64 key;
- /* abuse ls->addr for tid */
- if (show_thread_stats)
- addr = sample->tid;
+ switch (aggr_mode) {
+ case LOCK_AGGR_ADDR:
+ key = addr;
+ break;
+ case LOCK_AGGR_TASK:
+ key = sample->tid;
+ break;
+ case LOCK_AGGR_CALLER:
+ default:
+ pr_err("Invalid aggregation mode: %d\n", aggr_mode);
+ return -EINVAL;
+ }
- ls = lock_stat_findnew(addr, name, 0);
+ ls = lock_stat_findnew(key, name, 0);
if (!ls)
return -ENOMEM;
@@ -695,11 +711,22 @@ static int report_lock_acquired_event(struct evsel *evsel,
u64 contended_term;
const char *name = evsel__strval(evsel, sample, "name");
u64 addr = evsel__intval(evsel, sample, "lockdep_addr");
+ u64 key;
- if (show_thread_stats)
- addr = sample->tid;
+ switch (aggr_mode) {
+ case LOCK_AGGR_ADDR:
+ key = addr;
+ break;
+ case LOCK_AGGR_TASK:
+ key = sample->tid;
+ break;
+ case LOCK_AGGR_CALLER:
+ default:
+ pr_err("Invalid aggregation mode: %d\n", aggr_mode);
+ return -EINVAL;
+ }
- ls = lock_stat_findnew(addr, name, 0);
+ ls = lock_stat_findnew(key, name, 0);
if (!ls)
return -ENOMEM;
@@ -757,11 +784,22 @@ static int report_lock_contended_event(struct evsel *evsel,
struct lock_seq_stat *seq;
const char *name = evsel__strval(evsel, sample, "name");
u64 addr = evsel__intval(evsel, sample, "lockdep_addr");
+ u64 key;
- if (show_thread_stats)
- addr = sample->tid;
+ switch (aggr_mode) {
+ case LOCK_AGGR_ADDR:
+ key = addr;
+ break;
+ case LOCK_AGGR_TASK:
+ key = sample->tid;
+ break;
+ case LOCK_AGGR_CALLER:
+ default:
+ pr_err("Invalid aggregation mode: %d\n", aggr_mode);
+ return -EINVAL;
+ }
- ls = lock_stat_findnew(addr, name, 0);
+ ls = lock_stat_findnew(key, name, 0);
if (!ls)
return -ENOMEM;
@@ -812,11 +850,22 @@ static int report_lock_release_event(struct evsel *evsel,
struct lock_seq_stat *seq;
const char *name = evsel__strval(evsel, sample, "name");
u64 addr = evsel__intval(evsel, sample, "lockdep_addr");
+ u64 key;
- if (show_thread_stats)
- addr = sample->tid;
+ switch (aggr_mode) {
+ case LOCK_AGGR_ADDR:
+ key = addr;
+ break;
+ case LOCK_AGGR_TASK:
+ key = sample->tid;
+ break;
+ case LOCK_AGGR_CALLER:
+ default:
+ pr_err("Invalid aggregation mode: %d\n", aggr_mode);
+ return -EINVAL;
+ }
- ls = lock_stat_findnew(addr, name, 0);
+ ls = lock_stat_findnew(key, name, 0);
if (!ls)
return -ENOMEM;
@@ -980,11 +1029,22 @@ static int report_lock_contention_begin_event(struct evsel *evsel,
struct thread_stat *ts;
struct lock_seq_stat *seq;
u64 addr = evsel__intval(evsel, sample, "lock_addr");
+ u64 key;
- if (show_thread_stats)
- addr = sample->tid;
+ switch (aggr_mode) {
+ case LOCK_AGGR_ADDR:
+ key = addr;
+ break;
+ case LOCK_AGGR_TASK:
+ key = sample->tid;
+ break;
+ case LOCK_AGGR_CALLER:
+ default:
+ pr_err("Invalid aggregation mode: %d\n", aggr_mode);
+ return -EINVAL;
+ }
- ls = lock_stat_find(addr);
+ ls = lock_stat_find(key);
if (!ls) {
char buf[128];
const char *caller = buf;
@@ -993,7 +1053,7 @@ static int report_lock_contention_begin_event(struct evsel *evsel,
if (lock_contention_caller(evsel, sample, buf, sizeof(buf)) < 0)
caller = "Unknown";
- ls = lock_stat_findnew(addr, caller, flags);
+ ls = lock_stat_findnew(key, caller, flags);
if (!ls)
return -ENOMEM;
}
@@ -1050,11 +1110,22 @@ static int report_lock_contention_end_event(struct evsel *evsel,
struct lock_seq_stat *seq;
u64 contended_term;
u64 addr = evsel__intval(evsel, sample, "lock_addr");
+ u64 key;
- if (show_thread_stats)
- addr = sample->tid;
+ switch (aggr_mode) {
+ case LOCK_AGGR_ADDR:
+ key = addr;
+ break;
+ case LOCK_AGGR_TASK:
+ key = sample->tid;
+ break;
+ case LOCK_AGGR_CALLER:
+ default:
+ pr_err("Invalid aggregation mode: %d\n", aggr_mode);
+ return -EINVAL;
+ }
- ls = lock_stat_find(addr);
+ ls = lock_stat_find(key);
if (!ls)
return 0;
@@ -1416,6 +1487,9 @@ static int __cmd_report(bool display_info)
if (select_key())
goto out_delete;
+ if (show_thread_stats)
+ aggr_mode = LOCK_AGGR_TASK;
+
err = perf_session__process_events(session);
if (err)
goto out_delete;
--
2.37.1.359.gd136c6c3e2-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 3/5] perf lock: Add 'contention' subcommand
2022-07-25 18:31 [PATCHSET 0/5] perf lock: Add contention subcommand (v2) Namhyung Kim
2022-07-25 18:31 ` [PATCH 1/5] perf lock: Add flags field in the lock_stat Namhyung Kim
2022-07-25 18:31 ` [PATCH 2/5] perf lock: Add lock aggregation enum Namhyung Kim
@ 2022-07-25 18:31 ` Namhyung Kim
2022-07-25 18:31 ` [PATCH 4/5] perf lock: Add -k and -F options to " Namhyung Kim
` (2 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Namhyung Kim @ 2022-07-25 18:31 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso
The 'perf lock contention' processes the lock contention events and
displays the result like perf lock report. Right now, there's not
much difference between the two but the lock contention specific
features will come soon.
$ perf lock contention
contended total wait max wait avg wait type caller
238 1.41 ms 29.20 us 5.94 us spinlock update_blocked_averages+0x4c
1 902.08 us 902.08 us 902.08 us rwsem:R do_user_addr_fault+0x1dd
81 330.30 us 17.24 us 4.08 us spinlock _nohz_idle_balance+0x172
2 89.54 us 61.26 us 44.77 us spinlock do_anonymous_page+0x16d
24 78.36 us 12.27 us 3.27 us mutex pipe_read+0x56
2 71.58 us 59.56 us 35.79 us spinlock __handle_mm_fault+0x6aa
6 25.68 us 6.89 us 4.28 us spinlock do_idle+0x28d
1 18.46 us 18.46 us 18.46 us rtmutex exec_fw_cmd+0x21b
3 15.25 us 6.26 us 5.08 us spinlock tick_do_update_jiffies64+0x2c
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/Documentation/perf-lock.txt | 4 +-
tools/perf/builtin-lock.c | 212 ++++++++++++++++++++++++-
2 files changed, 214 insertions(+), 2 deletions(-)
diff --git a/tools/perf/Documentation/perf-lock.txt b/tools/perf/Documentation/perf-lock.txt
index 4b8568f0c53b..da0bcbc676a3 100644
--- a/tools/perf/Documentation/perf-lock.txt
+++ b/tools/perf/Documentation/perf-lock.txt
@@ -8,7 +8,7 @@ perf-lock - Analyze lock events
SYNOPSIS
--------
[verse]
-'perf lock' {record|report|script|info}
+'perf lock' {record|report|script|info|contention}
DESCRIPTION
-----------
@@ -27,6 +27,8 @@ and statistics with this 'perf lock' command.
'perf lock info' shows metadata like threads or addresses
of lock instances.
+ 'perf lock contention' shows contention statistics.
+
COMMON OPTIONS
--------------
diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index b4e7a6e5c54e..64f635273660 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -145,6 +145,18 @@ static enum {
*/
#define CONTENTION_STACK_SKIP 3
+/*
+ * flags for lock:contention_begin
+ * Imported from include/trace/events/lock.h.
+ */
+#define LCB_F_SPIN (1U << 0)
+#define LCB_F_READ (1U << 1)
+#define LCB_F_WRITE (1U << 2)
+#define LCB_F_RT (1U << 3)
+#define LCB_F_PERCPU (1U << 4)
+#define LCB_F_MUTEX (1U << 5)
+
+
static u64 sched_text_start;
static u64 sched_text_end;
static u64 lock_text_start;
@@ -1022,6 +1034,51 @@ static int lock_contention_caller(struct evsel *evsel, struct perf_sample *sampl
return -1;
}
+static u64 callchain_id(struct evsel *evsel, struct perf_sample *sample)
+{
+ struct callchain_cursor *cursor = &callchain_cursor;
+ struct thread *thread;
+ u64 hash = 0;
+ int skip = 0;
+ int ret;
+
+ thread = machine__findnew_thread(&session->machines.host,
+ -1, sample->pid);
+ if (thread == NULL)
+ return -1;
+
+ /* use caller function name from the callchain */
+ ret = thread__resolve_callchain(thread, cursor, evsel, sample,
+ NULL, NULL, CONTENTION_STACK_DEPTH);
+ thread__put(thread);
+
+ if (ret != 0)
+ return -1;
+
+ callchain_cursor_commit(cursor);
+
+ while (true) {
+ struct callchain_cursor_node *node;
+
+ node = callchain_cursor_current(cursor);
+ if (node == NULL)
+ break;
+
+ /* skip first few entries - for lock functions */
+ if (++skip <= CONTENTION_STACK_SKIP)
+ goto next;
+
+ if (node->ms.sym && is_lock_function(node->ip))
+ goto next;
+
+ hash ^= hash_long((unsigned long)node->ip, 64);
+
+next:
+ callchain_cursor_advance(cursor);
+ }
+ return hash;
+}
+
static int report_lock_contention_begin_event(struct evsel *evsel,
struct perf_sample *sample)
{
@@ -1039,6 +1096,8 @@ static int report_lock_contention_begin_event(struct evsel *evsel,
key = sample->tid;
break;
case LOCK_AGGR_CALLER:
+ key = callchain_id(evsel, sample);
+ break;
default:
pr_err("Invalid aggregation mode: %d\n", aggr_mode);
return -EINVAL;
@@ -1120,6 +1179,8 @@ static int report_lock_contention_end_event(struct evsel *evsel,
key = sample->tid;
break;
case LOCK_AGGR_CALLER:
+ key = callchain_id(evsel, sample);
+ break;
default:
pr_err("Invalid aggregation mode: %d\n", aggr_mode);
return -EINVAL;
@@ -1183,6 +1244,12 @@ static struct trace_lock_handler report_lock_ops = {
.contention_end_event = report_lock_contention_end_event,
};
+static struct trace_lock_handler contention_lock_ops = {
+ .contention_begin_event = report_lock_contention_begin_event,
+ .contention_end_event = report_lock_contention_end_event,
+};
+
+
static struct trace_lock_handler *trace_handler;
static int evsel__process_lock_acquire(struct evsel *evsel, struct perf_sample *sample)
@@ -1428,6 +1495,67 @@ static void sort_result(void)
}
}
+static const char *get_type_str(struct lock_stat *st)
+{
+ static const struct {
+ unsigned int flags;
+ const char *name;
+ } table[] = {
+ { 0, "semaphore" },
+ { LCB_F_SPIN, "spinlock" },
+ { LCB_F_SPIN | LCB_F_READ, "rwlock:R" },
+ { LCB_F_SPIN | LCB_F_WRITE, "rwlock:W"},
+ { LCB_F_READ, "rwsem:R" },
+ { LCB_F_WRITE, "rwsem:W" },
+ { LCB_F_RT, "rtmutex" },
+ { LCB_F_RT | LCB_F_READ, "rwlock-rt:R" },
+ { LCB_F_RT | LCB_F_WRITE, "rwlock-rt:W"},
+ { LCB_F_PERCPU | LCB_F_READ, "pcpu-sem:R" },
+ { LCB_F_PERCPU | LCB_F_WRITE, "pcpu-sem:W" },
+ { LCB_F_MUTEX, "mutex" },
+ { LCB_F_MUTEX | LCB_F_SPIN, "mutex" },
+ };
+
+ for (unsigned int i = 0; i < ARRAY_SIZE(table); i++) {
+ if (table[i].flags == st->flags)
+ return table[i].name;
+ }
+ return "unknown";
+}
+
+static void sort_contention_result(void)
+{
+ sort_result();
+}
+
+static void print_contention_result(void)
+{
+ struct lock_stat *st;
+ struct lock_key *key;
+ int bad, total;
+
+ list_for_each_entry(key, &lock_keys, list)
+ pr_info("%*s ", key->len, key->header);
+
+ pr_info(" %10s %s\n\n", "type", "caller");
+
+ bad = total = 0;
+ while ((st = pop_from_result())) {
+ total++;
+ if (st->broken)
+ bad++;
+
+ list_for_each_entry(key, &lock_keys, list) {
+ key->print(key, st);
+ pr_info(" ");
+ }
+
+ pr_info(" %10s %s\n", get_type_str(st), st->name);
+ }
+
+ print_bad_events(bad, total);
+}
+
static const struct evsel_str_handler lock_tracepoints[] = {
{ "lock:lock_acquire", evsel__process_lock_acquire, }, /* CONFIG_LOCKDEP */
{ "lock:lock_acquired", evsel__process_lock_acquired, }, /* CONFIG_LOCKDEP, CONFIG_LOCK_STAT */
@@ -1508,6 +1636,68 @@ static int __cmd_report(bool display_info)
return err;
}
+static int __cmd_contention(void)
+{
+ int err = -EINVAL;
+ struct perf_tool eops = {
+ .sample = process_sample_event,
+ .comm = perf_event__process_comm,
+ .mmap = perf_event__process_mmap,
+ .ordered_events = true,
+ };
+ struct perf_data data = {
+ .path = input_name,
+ .mode = PERF_DATA_MODE_READ,
+ .force = force,
+ };
+
+ session = perf_session__new(&data, &eops);
+ if (IS_ERR(session)) {
+ pr_err("Initializing perf session failed\n");
+ return PTR_ERR(session);
+ }
+
+ /* for lock function check */
+ symbol_conf.sort_by_name = true;
+ symbol__init(&session->header.env);
+
+ if (!perf_session__has_traces(session, "lock record"))
+ goto out_delete;
+
+ if (!evlist__find_evsel_by_str(session->evlist, "lock:contention_begin")) {
+ pr_err("lock contention evsel not found\n");
+ goto out_delete;
+ }
+
+ if (perf_session__set_tracepoints_handlers(session, contention_tracepoints)) {
+ pr_err("Initializing perf session tracepoint handlers failed\n");
+ goto out_delete;
+ }
+
+ if (setup_output_field("contended,wait_total,wait_max,avg_wait"))
+ goto out_delete;
+
+ sort_key = "wait_total";
+ if (select_key())
+ goto out_delete;
+
+ aggr_mode = LOCK_AGGR_CALLER;
+
+ err = perf_session__process_events(session);
+ if (err)
+ goto out_delete;
+
+ setup_pager();
+
+ sort_contention_result();
+ print_contention_result();
+
+out_delete:
+ perf_session__delete(session);
+ return err;
+}
+
+
static int __cmd_record(int argc, const char **argv)
{
const char *record_args[] = {
@@ -1626,12 +1816,17 @@ int cmd_lock(int argc, const char **argv)
OPT_PARENT(lock_options)
};
+ const struct option contention_options[] = {
+ OPT_PARENT(lock_options)
+ };
+
const char * const info_usage[] = {
"perf lock info [<options>]",
NULL
};
const char *const lock_subcommands[] = { "record", "report", "script",
- "info", NULL };
+ "info", "contention",
+ "contention", NULL };
const char *lock_usage[] = {
NULL,
NULL
@@ -1640,6 +1835,10 @@ int cmd_lock(int argc, const char **argv)
"perf lock report [<options>]",
NULL
};
+ const char * const contention_usage[] = {
+ "perf lock contention [<options>]",
+ NULL
+ };
unsigned int i;
int rc = 0;
@@ -1675,6 +1874,17 @@ int cmd_lock(int argc, const char **argv)
/* recycling report_lock_ops */
trace_handler = &report_lock_ops;
rc = __cmd_report(true);
+ } else if (strlen(argv[0]) > 2 && strstarts("contention", argv[0])) {
+ trace_handler = &contention_lock_ops;
+ if (argc) {
+ argc = parse_options(argc, argv, contention_options,
+ contention_usage, 0);
+ if (argc) {
+ usage_with_options(contention_usage,
+ contention_options);
+ }
+ }
+ rc = __cmd_contention();
} else {
usage_with_options(lock_usage, lock_options);
}
--
2.37.1.359.gd136c6c3e2-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 4/5] perf lock: Add -k and -F options to 'contention' subcommand
2022-07-25 18:31 [PATCHSET 0/5] perf lock: Add contention subcommand (v2) Namhyung Kim
` (2 preceding siblings ...)
2022-07-25 18:31 ` [PATCH 3/5] perf lock: Add 'contention' subcommand Namhyung Kim
@ 2022-07-25 18:31 ` Namhyung Kim
2022-07-25 18:31 ` [PATCH 5/5] perf lock: Support -t option for " Namhyung Kim
2022-07-25 21:01 ` [PATCHSET 0/5] perf lock: Add contention subcommand (v2) Arnaldo Carvalho de Melo
5 siblings, 0 replies; 8+ messages in thread
From: Namhyung Kim @ 2022-07-25 18:31 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso
Like perf lock report, add -k/--key and -F/--field options to control
output formatting and sorting. Note that it has slightly different
default options as some fields are not available and to optimize the
screen space.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/Documentation/perf-lock.txt | 15 ++++++++
tools/perf/builtin-lock.c | 51 ++++++++++++++++++++------
2 files changed, 55 insertions(+), 11 deletions(-)
diff --git a/tools/perf/Documentation/perf-lock.txt b/tools/perf/Documentation/perf-lock.txt
index da0bcbc676a3..ae78e04346bd 100644
--- a/tools/perf/Documentation/perf-lock.txt
+++ b/tools/perf/Documentation/perf-lock.txt
@@ -105,6 +105,21 @@ INFO OPTIONS
--map::
dump map of lock instances (address:name table)
+CONTENTION OPTIONS
+--------------
+
+-k::
+--key=<value>::
+ Sorting key. Possible values: contended, wait_total (default),
+ wait_max, wait_min, avg_wait.
+
+-F::
+--field=<value>::
+ Output fields. By default it shows all but the wait_min fields
+ and users can customize that using this. Possible values:
+ contended, wait_total, wait_max, wait_min, avg_wait.
+
+
SEE ALSO
--------
linkperf:perf[1]
diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 64f635273660..ad3b191db2e7 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -362,7 +362,7 @@ static const char *output_fields;
#define DEF_KEY_LOCK(name, header, fn_suffix, len) \
{ #name, header, len, lock_stat_key_ ## fn_suffix, lock_stat_key_print_ ## fn_suffix, {} }
-struct lock_key keys[] = {
+static struct lock_key report_keys[] = {
DEF_KEY_LOCK(acquired, "acquired", nr_acquired, 10),
DEF_KEY_LOCK(contended, "contended", nr_contended, 10),
DEF_KEY_LOCK(avg_wait, "avg wait", avg_wait_time, 12),
@@ -374,9 +374,24 @@ struct lock_key keys[] = {
{ }
};
-static int select_key(void)
+static struct lock_key contention_keys[] = {
+ DEF_KEY_LOCK(contended, "contended", nr_contended, 10),
+ DEF_KEY_LOCK(wait_total, "total wait", wait_time_total, 12),
+ DEF_KEY_LOCK(wait_max, "max wait", wait_time_max, 12),
+ DEF_KEY_LOCK(wait_min, "min wait", wait_time_min, 12),
+ DEF_KEY_LOCK(avg_wait, "avg wait", avg_wait_time, 12),
+
+ /* extra comparisons much complicated should be here */
+ { }
+};
+
+static int select_key(bool contention)
{
int i;
+ struct lock_key *keys = report_keys;
+
+ if (contention)
+ keys = contention_keys;
for (i = 0; keys[i].name; i++) {
if (!strcmp(keys[i].name, sort_key)) {
@@ -394,9 +409,13 @@ static int select_key(void)
return -1;
}
-static int add_output_field(struct list_head *head, char *name)
+static int add_output_field(bool contention, char *name)
{
int i;
+ struct lock_key *keys = report_keys;
+
+ if (contention)
+ keys = contention_keys;
for (i = 0; keys[i].name; i++) {
if (strcmp(keys[i].name, name))
@@ -404,7 +423,7 @@ static int add_output_field(struct list_head *head, char *name)
/* prevent double link */
if (list_empty(&keys[i].list))
- list_add_tail(&keys[i].list, head);
+ list_add_tail(&keys[i].list, &lock_keys);
return 0;
}
@@ -413,10 +432,14 @@ static int add_output_field(struct list_head *head, char *name)
return -1;
}
-static int setup_output_field(const char *str)
+static int setup_output_field(bool contention, const char *str)
{
char *tok, *tmp, *orig;
int i, ret = 0;
+ struct lock_key *keys = report_keys;
+
+ if (contention)
+ keys = contention_keys;
/* no output field given: use all of them */
if (str == NULL) {
@@ -433,7 +456,7 @@ static int setup_output_field(const char *str)
return -ENOMEM;
while ((tok = strsep(&tmp, ",")) != NULL){
- ret = add_output_field(&lock_keys, tok);
+ ret = add_output_field(contention, tok);
if (ret < 0)
break;
}
@@ -1609,10 +1632,10 @@ static int __cmd_report(bool display_info)
goto out_delete;
}
- if (setup_output_field(output_fields))
+ if (setup_output_field(false, output_fields))
goto out_delete;
- if (select_key())
+ if (select_key(false))
goto out_delete;
if (show_thread_stats)
@@ -1674,11 +1697,10 @@ static int __cmd_contention(void)
goto out_delete;
}
- if (setup_output_field("contended,wait_total,wait_max,avg_wait"))
+ if (setup_output_field(true, output_fields))
goto out_delete;
- sort_key = "wait_total";
- if (select_key())
+ if (select_key(true))
goto out_delete;
aggr_mode = LOCK_AGGR_CALLER;
@@ -1817,6 +1839,10 @@ int cmd_lock(int argc, const char **argv)
};
const struct option contention_options[] = {
+ OPT_STRING('k', "key", &sort_key, "wait_total",
+ "key for sorting (contended / wait_total / wait_max / wait_min / avg_wait)"),
+ OPT_STRING('F', "field", &output_fields, "contended,wait_total,wait_max,avg_wait",
+ "output fields (contended / wait_total / wait_max / wait_min / avg_wait)"),
OPT_PARENT(lock_options)
};
@@ -1876,6 +1902,9 @@ int cmd_lock(int argc, const char **argv)
rc = __cmd_report(true);
} else if (strlen(argv[0]) > 2 && strstarts("contention", argv[0])) {
trace_handler = &contention_lock_ops;
+ sort_key = "wait_total";
+ output_fields = "contended,wait_total,wait_max,avg_wait";
+
if (argc) {
argc = parse_options(argc, argv, contention_options,
contention_usage, 0);
--
2.37.1.359.gd136c6c3e2-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 5/5] perf lock: Support -t option for 'contention' subcommand
2022-07-25 18:31 [PATCHSET 0/5] perf lock: Add contention subcommand (v2) Namhyung Kim
` (3 preceding siblings ...)
2022-07-25 18:31 ` [PATCH 4/5] perf lock: Add -k and -F options to " Namhyung Kim
@ 2022-07-25 18:31 ` Namhyung Kim
2022-07-25 21:01 ` [PATCHSET 0/5] perf lock: Add contention subcommand (v2) Arnaldo Carvalho de Melo
5 siblings, 0 replies; 8+ messages in thread
From: Namhyung Kim @ 2022-07-25 18:31 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
Will Deacon, Waiman Long, Boqun Feng, Davidlohr Bueso
Like perf lock report, it can report lock contention stat of each task.
$ perf lock contention -t
contended total wait max wait avg wait pid comm
5 945.20 us 902.08 us 189.04 us 316167 EventManager_De
33 98.17 us 6.78 us 2.97 us 766063 kworker/0:1-get
7 92.47 us 61.26 us 13.21 us 316170 EventManager_De
14 76.31 us 12.87 us 5.45 us 12949 timedcall
24 76.15 us 12.27 us 3.17 us 767992 sched-pipe
15 75.62 us 11.93 us 5.04 us 15127 switchto-defaul
24 71.84 us 5.59 us 2.99 us 629168 kworker/u513:2-
17 67.41 us 7.94 us 3.96 us 13504 coroner-
1 59.56 us 59.56 us 59.56 us 316165 EventManager_De
14 56.21 us 6.89 us 4.01 us 0 swapper
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/Documentation/perf-lock.txt | 4 ++++
tools/perf/builtin-lock.c | 22 ++++++++++++++++++++--
2 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/tools/perf/Documentation/perf-lock.txt b/tools/perf/Documentation/perf-lock.txt
index ae78e04346bd..8f4e34f924d5 100644
--- a/tools/perf/Documentation/perf-lock.txt
+++ b/tools/perf/Documentation/perf-lock.txt
@@ -119,6 +119,10 @@ CONTENTION OPTIONS
and users can customize that using this. Possible values:
contended, wait_total, wait_max, wait_min, avg_wait.
+-t::
+--threads::
+ Show per-thread lock contention stat
+
SEE ALSO
--------
diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index ad3b191db2e7..041801d8b6ac 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -1560,7 +1560,10 @@ static void print_contention_result(void)
list_for_each_entry(key, &lock_keys, list)
pr_info("%*s ", key->len, key->header);
- pr_info(" %10s %s\n\n", "type", "caller");
+ if (show_thread_stats)
+ pr_info(" %10s %s\n\n", "pid", "comm");
+ else
+ pr_info(" %10s %s\n\n", "type", "caller");
bad = total = 0;
while ((st = pop_from_result())) {
@@ -1573,6 +1576,16 @@ static void print_contention_result(void)
pr_info(" ");
}
+ if (show_thread_stats) {
+ struct thread *t;
+ int pid = st->addr;
+
+ /* st->addr contains tid of thread */
+ t = perf_session__findnew(session, pid);
+ pr_info(" %10d %s\n", pid, thread__comm_str(t));
+ continue;
+ }
+
pr_info(" %10s %s\n", get_type_str(st), st->name);
}
@@ -1703,7 +1716,10 @@ static int __cmd_contention(void)
if (select_key(true))
goto out_delete;
- aggr_mode = LOCK_AGGR_CALLER;
+ if (show_thread_stats)
+ aggr_mode = LOCK_AGGR_TASK;
+ else
+ aggr_mode = LOCK_AGGR_CALLER;
err = perf_session__process_events(session);
if (err)
@@ -1843,6 +1859,8 @@ int cmd_lock(int argc, const char **argv)
"key for sorting (contended / wait_total / wait_max / wait_min / avg_wait)"),
OPT_STRING('F', "field", &output_fields, "contended,wait_total,wait_max,avg_wait",
"output fields (contended / wait_total / wait_max / wait_min / avg_wait)"),
+ OPT_BOOLEAN('t', "threads", &show_thread_stats,
+ "show per-thread lock stats"),
OPT_PARENT(lock_options)
};
--
2.37.1.359.gd136c6c3e2-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/5] perf lock: Add flags field in the lock_stat
2022-07-25 18:31 ` [PATCH 1/5] perf lock: Add flags field in the lock_stat Namhyung Kim
@ 2022-07-25 20:53 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-07-25 20:53 UTC (permalink / raw)
To: Namhyung Kim
Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
Davidlohr Bueso
Em Mon, Jul 25, 2022 at 11:31:20AM -0700, Namhyung Kim escreveu:
> For lock contention tracepoint analysis, it needs to keep the flags.
> As nr_readlock and nr_trylock fields are not used for it, let's make
> it a union.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/perf/builtin-lock.c | 22 +++++++++++++---------
> 1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
> index 0aae88fdf93a..1de459198b99 100644
> --- a/tools/perf/builtin-lock.c
> +++ b/tools/perf/builtin-lock.c
> @@ -59,7 +59,10 @@ struct lock_stat {
> unsigned int nr_contended;
> unsigned int nr_release;
>
> - unsigned int nr_readlock;
> + union {
> + unsigned int nr_readlock;
> + unsigned int flags;
> + };
> unsigned int nr_trylock;
>
> /* these times are in nano sec. */
> @@ -516,7 +519,7 @@ static struct lock_stat *lock_stat_find(u64 addr)
> return NULL;
> }
>
> -static struct lock_stat *lock_stat_findnew(u64 addr, const char *name)
> +static struct lock_stat *lock_stat_findnew(u64 addr, const char *name, int flags)
> {
> struct hlist_head *entry = lockhashentry(addr);
> struct lock_stat *ret, *new;
> @@ -531,13 +534,13 @@ static struct lock_stat *lock_stat_findnew(u64 addr, const char *name)
> goto alloc_failed;
>
> new->addr = addr;
> - new->name = zalloc(sizeof(char) * strlen(name) + 1);
> + new->name = strdup(name);
> if (!new->name) {
> free(new);
> goto alloc_failed;
> }
>
> - strcpy(new->name, name);
I'm applying to speed things up, but please separate such unrelated bits
in another patch next time.
- Arnaldo
> + new->flags = flags;
> new->wait_time_min = ULLONG_MAX;
>
> hlist_add_head(&new->hash_entry, entry);
> @@ -624,7 +627,7 @@ static int report_lock_acquire_event(struct evsel *evsel,
> if (show_thread_stats)
> addr = sample->tid;
>
> - ls = lock_stat_findnew(addr, name);
> + ls = lock_stat_findnew(addr, name, 0);
> if (!ls)
> return -ENOMEM;
>
> @@ -696,7 +699,7 @@ static int report_lock_acquired_event(struct evsel *evsel,
> if (show_thread_stats)
> addr = sample->tid;
>
> - ls = lock_stat_findnew(addr, name);
> + ls = lock_stat_findnew(addr, name, 0);
> if (!ls)
> return -ENOMEM;
>
> @@ -758,7 +761,7 @@ static int report_lock_contended_event(struct evsel *evsel,
> if (show_thread_stats)
> addr = sample->tid;
>
> - ls = lock_stat_findnew(addr, name);
> + ls = lock_stat_findnew(addr, name, 0);
> if (!ls)
> return -ENOMEM;
>
> @@ -813,7 +816,7 @@ static int report_lock_release_event(struct evsel *evsel,
> if (show_thread_stats)
> addr = sample->tid;
>
> - ls = lock_stat_findnew(addr, name);
> + ls = lock_stat_findnew(addr, name, 0);
> if (!ls)
> return -ENOMEM;
>
> @@ -985,11 +988,12 @@ static int report_lock_contention_begin_event(struct evsel *evsel,
> if (!ls) {
> char buf[128];
> const char *caller = buf;
> + unsigned int flags = evsel__intval(evsel, sample, "flags");
>
> if (lock_contention_caller(evsel, sample, buf, sizeof(buf)) < 0)
> caller = "Unknown";
>
> - ls = lock_stat_findnew(addr, caller);
> + ls = lock_stat_findnew(addr, caller, flags);
> if (!ls)
> return -ENOMEM;
> }
> --
> 2.37.1.359.gd136c6c3e2-goog
--
- Arnaldo
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCHSET 0/5] perf lock: Add contention subcommand (v2)
2022-07-25 18:31 [PATCHSET 0/5] perf lock: Add contention subcommand (v2) Namhyung Kim
` (4 preceding siblings ...)
2022-07-25 18:31 ` [PATCH 5/5] perf lock: Support -t option for " Namhyung Kim
@ 2022-07-25 21:01 ` Arnaldo Carvalho de Melo
5 siblings, 0 replies; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-07-25 21:01 UTC (permalink / raw)
To: Namhyung Kim
Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
linux-perf-users, Will Deacon, Waiman Long, Boqun Feng,
Davidlohr Bueso
Em Mon, Jul 25, 2022 at 11:31:19AM -0700, Namhyung Kim escreveu:
> Hello,
>
> It's to add a new subcommand 'contention' (shortly 'con') to perf lock.
>
> Changes in v2)
> * bugfix is merged already
> * fix build error in patch 2
>
> The new subcommand is to handle the new lock:contention_{begin,end}
> tracepoints and shows lock type and caller address like below:
>
> $ perf lock contention
> contended total wait max wait avg wait type caller
>
> 238 1.41 ms 29.20 us 5.94 us spinlock update_blocked_averages+0x4c
> 1 902.08 us 902.08 us 902.08 us rwsem:R do_user_addr_fault+0x1dd
> 81 330.30 us 17.24 us 4.08 us spinlock _nohz_idle_balance+0x172
> 2 89.54 us 61.26 us 44.77 us spinlock do_anonymous_page+0x16d
> 24 78.36 us 12.27 us 3.27 us mutex pipe_read+0x56
> 2 71.58 us 59.56 us 35.79 us spinlock __handle_mm_fault+0x6aa
> 6 25.68 us 6.89 us 4.28 us spinlock do_idle+0x28d
> 1 18.46 us 18.46 us 18.46 us rtmutex exec_fw_cmd+0x21b
> 3 15.25 us 6.26 us 5.08 us spinlock tick_do_update_jiffies64+0x2c
> ...
>
> where rwsem:R stands for read access (down_read) for a rw-semaphore.
> Other types of lock access is obvious and it doesn't detect optimistic
> spinning on mutex yet. This is just a base work for lock contention
> analysis and more will come later.
>
> You can get this from 'perf/lock-subcmd-v2' branch on
>
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
Thanks, applied locally, pushed to tmp.perf/core, will go to perf/core
after further automated tests.
- Arnaldo
> Thanks,
> Namhyung
>
>
> Namhyung Kim (5):
> perf lock: Add flags field in the lock_stat
> perf lock: Add lock aggregation enum
> perf lock: Add 'contention' subcommand
> perf lock: Add -k and -F options to 'contention' subcommand
> perf lock: Support -t option for 'contention' subcommand
>
> tools/perf/Documentation/perf-lock.txt | 23 +-
> tools/perf/builtin-lock.c | 401 +++++++++++++++++++++++--
> 2 files changed, 390 insertions(+), 34 deletions(-)
>
>
> base-commit: 9fe9b252c7c022df8e503435e778f15c04dfa3bf
> --
> 2.37.1.359.gd136c6c3e2-goog
--
- Arnaldo
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2022-07-25 21:01 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-25 18:31 [PATCHSET 0/5] perf lock: Add contention subcommand (v2) Namhyung Kim
2022-07-25 18:31 ` [PATCH 1/5] perf lock: Add flags field in the lock_stat Namhyung Kim
2022-07-25 20:53 ` Arnaldo Carvalho de Melo
2022-07-25 18:31 ` [PATCH 2/5] perf lock: Add lock aggregation enum Namhyung Kim
2022-07-25 18:31 ` [PATCH 3/5] perf lock: Add 'contention' subcommand Namhyung Kim
2022-07-25 18:31 ` [PATCH 4/5] perf lock: Add -k and -F options to " Namhyung Kim
2022-07-25 18:31 ` [PATCH 5/5] perf lock: Support -t option for " Namhyung Kim
2022-07-25 21:01 ` [PATCHSET 0/5] perf lock: Add contention subcommand (v2) Arnaldo Carvalho de Melo
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.