From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org,
Hari Bathini <hbathini@linux.vnet.ibm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Alexei Starovoitov <ast@fb.com>,
Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>,
Aravinda Prasad <aravinda@linux.vnet.ibm.com>,
Brendan Gregg <brendan.d.gregg@gmail.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Eric Biederman <ebiederm@xmission.com>,
Jiri Olsa <jolsa@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Sargun Dhillon <sargun@sargun.me>,
Steven Rostedt <rostedt@goodmis.org>,
Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 13/19] perf tools: Add 'cgroup_id' sort order keyword
Date: Tue, 14 Mar 2017 15:50:17 -0300
Message-ID: <20170314185023.31303-14-acme@kernel.org>
In-Reply-To: <20170314185023.31303-1-acme@kernel.org>
From: Hari Bathini <hbathini@linux.vnet.ibm.com>
This patch introduces a cgroup identifier entry field in perf report to
distinguish data belonging to different cgroups. It uses the device
number and inode number of the cgroup namespace, included in perf data
with the new PERF_RECORD_NAMESPACES event, as the cgroup identifier.
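As a rough illustration of where such a device/inode pair comes from, the sketch below stats the cgroup namespace link of a process under /proc. It is not code from this patch: struct cgroup_ns_id, read_ns_id() and read_cgroup_ns_id() are hypothetical names, and it assumes a Linux kernel that exposes /proc/<pid>/ns/cgroup.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the (device, inode) pair of a namespace. */
struct cgroup_ns_id {
	uint64_t dev;
	uint64_t ino;
};

/* Fill *id from the namespace link at 'path'; return 0 on success, -1 on error. */
static int read_ns_id(const char *path, struct cgroup_ns_id *id)
{
	struct stat st;

	/* stat() follows the /proc symlink to the namespace inode itself */
	if (stat(path, &st) != 0)
		return -1;

	id->dev = (uint64_t)st.st_dev;
	id->ino = (uint64_t)st.st_ino;
	return 0;
}

/* Convenience wrapper: look up the cgroup namespace of process 'pid'. */
static int read_cgroup_ns_id(pid_t pid, struct cgroup_ns_id *id)
{
	char path[64];

	snprintf(path, sizeof(path), "/proc/%d/ns/cgroup", (int)pid);
	return read_ns_id(path, id);
}
```

Calling read_cgroup_ns_id(getpid(), &id) for two processes and comparing the results is one way to check whether they share a cgroup namespace.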
With the assumption that each container is created with its own cgroup
namespace, this allows assessment/analysis of multiple containers at
once.
A simple test for this would be to clone a few processes passing
SIGCHLD & CLONE_NEWCGROUP flags to each of them, execute a shell and run
different workloads on each of those contexts, while running the perf
record command with the --namespaces option.
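A minimal sketch of such a test helper follows. It swaps clone() for fork() followed by unshare(CLONE_NEWCGROUP), which is an assumption made for brevity rather than the method used by the author; spawn_in_new_cgroup_ns() is a hypothetical name, and unshare() needs CAP_SYS_ADMIN to succeed.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000	/* value from linux/sched.h */
#endif

/*
 * Hypothetical helper: fork a child, move it into a new cgroup
 * namespace, then exec the given command there.
 *
 * Returns the child's pid in the parent, or -1 if fork() fails.
 * Without CAP_SYS_ADMIN the unshare() call fails and the child
 * exits with status 126; a failed exec exits with status 127.
 */
static pid_t spawn_in_new_cgroup_ns(char *const argv[])
{
	pid_t pid = fork();

	if (pid != 0)
		return pid;	/* parent, or -1 on fork failure */

	/* child: detach from the parent's cgroup namespace */
	if (unshare(CLONE_NEWCGROUP) != 0) {
		perror("unshare(CLONE_NEWCGROUP)");
		_exit(126);
	}

	execvp(argv[0], argv);
	perror("execvp");
	_exit(127);
}
```

Spawning a few shells this way as root, each with a different workload, gives perf record several distinct cgroup namespaces to report on.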
Shown below is the output of perf report, sorted by cgroup identifier,
on perf.data generated with the above test scenario, clearly indicating
one context's considerable use of kernel memory in comparison with
the others:
  $ perf report -s cgroup_id,sample --stdio
  #
  # Total Lost Samples: 0
  #
  # Samples: 5K of event 'kmem:kmalloc'
  # Event count (approx.): 5965
  #
  # Overhead  cgroup id (dev/inode)       Samples
  # ........  .....................  ............
  #
      81.27%  3/0xeffffffb                   4848
      16.24%  3/0xf00000d0                    969
       1.16%  3/0xf00000ce                     69
       0.82%  3/0xf00000cf                     49
       0.50%  0/0x0                            30
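The grouping key in the output above is a (device, inode) pair, ordered first by device and then by inode. The sketch below shows one way to compare and print such a pair. It is an illustration, not the code in this patch: struct ns_key and the helper names are hypothetical, and it uses an explicit three-way comparison instead of subtracting unsigned values.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical key type: device and inode number of a namespace. */
struct ns_key {
	uint64_t dev;
	uint64_t ino;
};

/* Three-way compare of two u64 values: returns -1, 0 or 1. */
static int cmp_u64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/* Order keys by device number first, then by inode number. */
static int ns_key_cmp(const struct ns_key *l, const struct ns_key *r)
{
	int ret = cmp_u64(l->dev, r->dev);

	return ret ? ret : cmp_u64(l->ino, r->ino);
}

/* Format a key as "dev/0xinode", matching the column shown above. */
static int ns_key_snprintf(char *buf, size_t size, const struct ns_key *k)
{
	return snprintf(buf, size, "%" PRIu64 "/0x%" PRIx64, k->dev, k->ino);
}
```

The PRIu64/PRIx64 macros keep the format string correct on targets where a 64-bit integer is not an unsigned long.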
While this is a start, there is scope for further improvement. For
example, instead of only the cgroup namespace's device and inode numbers,
the device and inode numbers of some or all namespaces may be used to
distinguish which processes are running in a given container context.
Also, scripts that map device and inode info to containers sound
plausible for better tracing of containers.
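One possible shape for such a mapping is a small lookup table keyed by device and inode number. This is only a sketch of the idea: struct container_entry, container_lookup() and the container names in the usage note are hypothetical, and nothing like it is part of this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table row: a namespace (dev, ino) pair and a label. */
struct container_entry {
	uint64_t dev;
	uint64_t ino;
	const char *name;
};

/*
 * Return the label registered for (dev, ino), or NULL when the pair
 * is not in the table.
 */
static const char *container_lookup(const struct container_entry *tab,
				    size_t n, uint64_t dev, uint64_t ino)
{
	for (size_t i = 0; i < n; i++) {
		if (tab[i].dev == dev && tab[i].ino == ino)
			return tab[i].name;
	}
	return NULL;
}
```

A table holding, say, { 3, 0xeffffffb, "container-a" } would let a report post-processor print a label instead of the raw 3/0xeffffffb pair.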
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891933338.25309.756882900782042645.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Documentation/perf-report.txt | 4 +++-
tools/perf/util/hist.c | 7 ++++++
tools/perf/util/hist.h | 1 +
tools/perf/util/sort.c | 41 ++++++++++++++++++++++++++++++++
tools/perf/util/sort.h | 7 ++++++
5 files changed, 59 insertions(+), 1 deletion(-)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 672b149aa80a..e9a61f5485eb 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -72,7 +72,8 @@ OPTIONS
--sort=::
Sort histogram entries by given key(s) - multiple keys can be specified
in CSV format. Following sort keys are available:
- pid, comm, dso, symbol, parent, cpu, socket, srcline, weight, local_weight.
+ pid, comm, dso, symbol, parent, cpu, socket, srcline, weight,
+ local_weight, cgroup_id.
Each key has following meaning:
@@ -92,6 +93,7 @@ OPTIONS
- weight: Event specific weight, e.g. memory latency or transaction
abort cost. This is the global weight.
- local_weight: Local weight version of the weight above.
+ - cgroup_id: ID derived from cgroup namespace device and inode numbers.
- transaction: Transaction abort flags.
- overhead: Overhead percentage of sample
- overhead_sys: Overhead percentage of sample running in system mode
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index eaf72a938fb4..e3b38f629504 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -3,6 +3,7 @@
#include "hist.h"
#include "map.h"
#include "session.h"
+#include "namespaces.h"
#include "sort.h"
#include "evlist.h"
#include "evsel.h"
@@ -169,6 +170,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
hists__set_unres_dso_col_len(hists, HISTC_MEM_DADDR_DSO);
}
+ hists__new_col_len(hists, HISTC_CGROUP_ID, 20);
hists__new_col_len(hists, HISTC_CPU, 3);
hists__new_col_len(hists, HISTC_SOCKET, 6);
hists__new_col_len(hists, HISTC_MEM_LOCKED, 6);
@@ -574,9 +576,14 @@ __hists__add_entry(struct hists *hists,
bool sample_self,
struct hist_entry_ops *ops)
{
+ struct namespaces *ns = thread__namespaces(al->thread);
struct hist_entry entry = {
.thread = al->thread,
.comm = thread__comm(al->thread),
+ .cgroup_id = {
+ .dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0,
+ .ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0,
+ },
.ms = {
.map = al->map,
.sym = al->sym,
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 2e839bf40bdd..ee3670a388df 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -30,6 +30,7 @@ enum hist_column {
HISTC_DSO,
HISTC_THREAD,
HISTC_COMM,
+ HISTC_CGROUP_ID,
HISTC_PARENT,
HISTC_CPU,
HISTC_SOCKET,
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 93f755ac60ca..8b0d4e39f640 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -536,6 +536,46 @@ struct sort_entry sort_cpu = {
.se_width_idx = HISTC_CPU,
};
+/* --sort cgroup_id */
+
+static int64_t _sort__cgroup_dev_cmp(u64 left_dev, u64 right_dev)
+{
+ return (int64_t)(right_dev - left_dev);
+}
+
+static int64_t _sort__cgroup_inode_cmp(u64 left_ino, u64 right_ino)
+{
+ return (int64_t)(right_ino - left_ino);
+}
+
+static int64_t
+sort__cgroup_id_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ int64_t ret;
+
+ ret = _sort__cgroup_dev_cmp(right->cgroup_id.dev, left->cgroup_id.dev);
+ if (ret != 0)
+ return ret;
+
+ return _sort__cgroup_inode_cmp(right->cgroup_id.ino,
+ left->cgroup_id.ino);
+}
+
+static int hist_entry__cgroup_id_snprintf(struct hist_entry *he,
+ char *bf, size_t size,
+ unsigned int width __maybe_unused)
+{
+ return repsep_snprintf(bf, size, "%lu/0x%lx", he->cgroup_id.dev,
+ he->cgroup_id.ino);
+}
+
+struct sort_entry sort_cgroup_id = {
+ .se_header = "cgroup id (dev/inode)",
+ .se_cmp = sort__cgroup_id_cmp,
+ .se_snprintf = hist_entry__cgroup_id_snprintf,
+ .se_width_idx = HISTC_CGROUP_ID,
+};
+
/* --sort socket */
static int64_t
@@ -1464,6 +1504,7 @@ static struct sort_dimension common_sort_dimensions[] = {
DIM(SORT_TRANSACTION, "transaction", sort_transaction),
DIM(SORT_TRACE, "trace", sort_trace),
DIM(SORT_SYM_SIZE, "symbol_size", sort_sym_size),
+ DIM(SORT_CGROUP_ID, "cgroup_id", sort_cgroup_id),
};
#undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index f583325a3743..baf20a399f34 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -54,6 +54,11 @@ struct he_stat {
u32 nr_events;
};
+struct namespace_id {
+ u64 dev;
+ u64 ino;
+};
+
struct hist_entry_diff {
bool computed;
union {
@@ -91,6 +96,7 @@ struct hist_entry {
struct map_symbol ms;
struct thread *thread;
struct comm *comm;
+ struct namespace_id cgroup_id;
u64 ip;
u64 transaction;
s32 socket;
@@ -212,6 +218,7 @@ enum sort_type {
SORT_TRANSACTION,
SORT_TRACE,
SORT_SYM_SIZE,
+ SORT_CGROUP_ID,
/* branch stack specific sort keys */
__SORT_BRANCH_STACK,
--
2.9.3