linux-kernel.vger.kernel.org archive mirror
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org,
	Hari Bathini <hbathini@linux.vnet.ibm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Alexei Starovoitov <ast@fb.com>,
	Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>,
	Aravinda Prasad <aravinda@linux.vnet.ibm.com>,
	Brendan Gregg <brendan.d.gregg@gmail.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Eric Biederman <ebiederm@xmission.com>,
	Jiri Olsa <jolsa@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Sargun Dhillon <sargun@sargun.me>,
	Steven Rostedt <rostedt@goodmis.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 13/19] perf tools: Add 'cgroup_id' sort order keyword
Date: Tue, 14 Mar 2017 15:50:17 -0300
Message-ID: <20170314185023.31303-14-acme@kernel.org>
In-Reply-To: <20170314185023.31303-1-acme@kernel.org>

From: Hari Bathini <hbathini@linux.vnet.ibm.com>

This patch introduces a cgroup identifier entry field in perf report to
identify or distinguish data of different cgroups. It uses the device
number and inode number of the cgroup namespace, included in perf data
with the new PERF_RECORD_NAMESPACES event, as the cgroup identifier.
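
As a rough illustration of where such a device/inode pair can be seen
from userspace, the sketch below stat()s a namespace link such as
/proc/self/ns/cgroup and formats the result like the report column.
The ns_id_sketch type and both helpers are hypothetical names made up
for this example; they are not part of perf, and this is not how the
kernel fills in PERF_RECORD_NAMESPACES.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper type for this sketch, not perf's struct namespace_id. */
struct ns_id_sketch {
	uint64_t dev;
	uint64_t ino;
};

/*
 * Fill *id from the namespace link at 'path', for example
 * "/proc/self/ns/cgroup". stat() follows the link, so st_dev and
 * st_ino describe the namespace object rather than the symlink.
 * Returns 0 on success, -1 if the link cannot be inspected.
 */
static int ns_id_sketch_read(const char *path, struct ns_id_sketch *id)
{
	struct stat st;

	assert(path && id);

	if (stat(path, &st) != 0)
		return -1;

	id->dev = (uint64_t)st.st_dev;
	id->ino = (uint64_t)st.st_ino;
	return 0;
}

/* Format the pair as "dev/0xinode", mirroring the report column layout. */
static int ns_id_sketch_format(const struct ns_id_sketch *id,
			       char *buf, size_t size)
{
	assert(id && buf);

	return snprintf(buf, size, "%" PRIu64 "/0x%" PRIx64,
			id->dev, id->ino);
}
```

Calling ns_id_sketch_read("/proc/self/ns/cgroup", &id) from two
processes and comparing the results is one way to check whether they
share a cgroup namespace, assuming the kernel exposes that link.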

With the assumption that each container is created with its own cgroup
namespace, this allows assessment/analysis of multiple containers at
once.

A simple test for this would be to clone a few processes, passing the
SIGCHLD and CLONE_NEWCGROUP flags to each of them, execute a shell and
run different workloads in each of those contexts, while running the
perf record command with the --namespaces option.
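
A minimal sketch of the clone step described above follows. The
sketch_* names and the stack size are assumptions made for this
example, and the fallback value for CLONE_NEWCGROUP is only used when
the libc headers do not provide it. Creating a cgroup namespace needs
CAP_SYS_ADMIN, so the helper is expected to fail with EPERM for an
unprivileged user.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for clone() and the CLONE_* flags */
#endif
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000	/* value from the kernel UAPI headers */
#endif

#define SKETCH_STACK_SIZE (1024 * 1024)	/* arbitrary size for this sketch */

/* Child entry point: replace the child with the workload in argv. */
static int sketch_child(void *arg)
{
	char *const *argv = arg;

	execvp(argv[0], argv);
	_exit(127);	/* only reached if execvp() failed */
}

/* New cgroup namespace for the child, SIGCHLD to the parent on exit. */
static int sketch_clone_flags(void)
{
	return SIGCHLD | CLONE_NEWCGROUP;
}

/*
 * Start argv[0] in a fresh cgroup namespace.
 * Returns the child's pid, or -1 with errno set.
 */
static pid_t sketch_spawn_in_new_cgroup_ns(char *const argv[])
{
	char *stack;
	pid_t pid;
	int saved_errno;

	assert(argv && argv[0]);

	stack = malloc(SKETCH_STACK_SIZE);
	if (!stack)
		return -1;

	/* Pass the top of the buffer: the stack grows down on most arches. */
	pid = clone(sketch_child, stack + SKETCH_STACK_SIZE,
		    sketch_clone_flags(), (void *)argv);

	/* Without CLONE_VM the child has its own copy, so this is safe. */
	saved_errno = errno;
	free(stack);
	errno = saved_errno;

	return pid;
}
```

Spawning a few shells this way, each running a different workload,
while perf record --namespaces is active, should produce data along
the lines of the scenario above.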

Shown below is the output of perf report, sorted with cgroup identifier,
on perf.data generated with the above test scenario, clearly indicating
one context's considerable use of kernel memory in comparison with
others:

	$ perf report -s cgroup_id,sample --stdio
	#
	# Total Lost Samples: 0
	#
	# Samples: 5K of event 'kmem:kmalloc'
	# Event count (approx.): 5965
	#
	# Overhead  cgroup id (dev/inode)       Samples
	# ........  .....................  ............
	#
	    81.27%  3/0xeffffffb                   4848
	    16.24%  3/0xf00000d0                    969
	     1.16%  3/0xf00000ce                     69
	     0.82%  3/0xf00000cf                     49
	     0.50%  0/0x0                            30

While this is a start, there is scope for further improvement. For
example, instead of only the cgroup namespace's device and inode
numbers, the device and inode numbers of some or all namespaces may be
used to distinguish which processes are running in a given container
context.

Also, scripts that map device and inode info to containers sound
plausible for better tracing of containers.
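
One possible shape for such a mapping, sketched under assumptions: the
"dev/0xinode" text from the report column is parsed back into numbers
and looked up in a table supplied by the caller. The
container_map_sketch table, its contents and all function names here
are hypothetical; nothing in this patch provides them.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct ns_key_sketch {
	uint64_t dev;
	uint64_t ino;
};

/* Hypothetical mapping entry: cgroup namespace id to container name. */
struct container_map_sketch {
	struct ns_key_sketch key;
	const char *name;
};

/*
 * Parse a column value such as "3/0xeffffffb".
 * Returns 0 on success, -1 if the text does not match "dev/0xinode".
 */
static int ns_key_sketch_parse(const char *text, struct ns_key_sketch *key)
{
	assert(text && key);

	if (sscanf(text, "%" SCNu64 "/0x%" SCNx64, &key->dev, &key->ino) != 2)
		return -1;

	return 0;
}

/* Linear lookup; returns the container name or NULL if the id is unknown. */
static const char *
container_map_sketch_lookup(const struct container_map_sketch *map,
			    size_t nr, const struct ns_key_sketch *key)
{
	size_t i;

	assert(map && key);

	for (i = 0; i < nr; i++) {
		if (map[i].key.dev == key->dev && map[i].key.ino == key->ino)
			return map[i].name;
	}

	return NULL;
}
```

How the table gets populated is left open; a container runtime hook
that records each container's namespace ids at start-up would be one
option.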

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891933338.25309.756882900782042645.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-report.txt |  4 +++-
 tools/perf/util/hist.c                   |  7 ++++++
 tools/perf/util/hist.h                   |  1 +
 tools/perf/util/sort.c                   | 41 ++++++++++++++++++++++++++++++++
 tools/perf/util/sort.h                   |  7 ++++++
 5 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 672b149aa80a..e9a61f5485eb 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -72,7 +72,8 @@ OPTIONS
 --sort=::
 	Sort histogram entries by given key(s) - multiple keys can be specified
 	in CSV format.  Following sort keys are available:
-	pid, comm, dso, symbol, parent, cpu, socket, srcline, weight, local_weight.
+	pid, comm, dso, symbol, parent, cpu, socket, srcline, weight,
+	local_weight, cgroup_id.
 
 	Each key has following meaning:
 
@@ -92,6 +93,7 @@ OPTIONS
 	- weight: Event specific weight, e.g. memory latency or transaction
 	abort cost. This is the global weight.
 	- local_weight: Local weight version of the weight above.
+	- cgroup_id: ID derived from cgroup namespace device and inode numbers.
 	- transaction: Transaction abort flags.
 	- overhead: Overhead percentage of sample
 	- overhead_sys: Overhead percentage of sample running in system mode
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index eaf72a938fb4..e3b38f629504 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -3,6 +3,7 @@
 #include "hist.h"
 #include "map.h"
 #include "session.h"
+#include "namespaces.h"
 #include "sort.h"
 #include "evlist.h"
 #include "evsel.h"
@@ -169,6 +170,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
 		hists__set_unres_dso_col_len(hists, HISTC_MEM_DADDR_DSO);
 	}
 
+	hists__new_col_len(hists, HISTC_CGROUP_ID, 20);
 	hists__new_col_len(hists, HISTC_CPU, 3);
 	hists__new_col_len(hists, HISTC_SOCKET, 6);
 	hists__new_col_len(hists, HISTC_MEM_LOCKED, 6);
@@ -574,9 +576,14 @@ __hists__add_entry(struct hists *hists,
 		   bool sample_self,
 		   struct hist_entry_ops *ops)
 {
+	struct namespaces *ns = thread__namespaces(al->thread);
 	struct hist_entry entry = {
 		.thread	= al->thread,
 		.comm = thread__comm(al->thread),
+		.cgroup_id = {
+			.dev = ns ? ns->link_info[CGROUP_NS_INDEX].dev : 0,
+			.ino = ns ? ns->link_info[CGROUP_NS_INDEX].ino : 0,
+		},
 		.ms = {
 			.map	= al->map,
 			.sym	= al->sym,
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 2e839bf40bdd..ee3670a388df 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -30,6 +30,7 @@ enum hist_column {
 	HISTC_DSO,
 	HISTC_THREAD,
 	HISTC_COMM,
+	HISTC_CGROUP_ID,
 	HISTC_PARENT,
 	HISTC_CPU,
 	HISTC_SOCKET,
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 93f755ac60ca..8b0d4e39f640 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -536,6 +536,46 @@ struct sort_entry sort_cpu = {
 	.se_width_idx	= HISTC_CPU,
 };
 
+/* --sort cgroup_id */
+
+static int64_t _sort__cgroup_dev_cmp(u64 left_dev, u64 right_dev)
+{
+	return (int64_t)(right_dev - left_dev);
+}
+
+static int64_t _sort__cgroup_inode_cmp(u64 left_ino, u64 right_ino)
+{
+	return (int64_t)(right_ino - left_ino);
+}
+
+static int64_t
+sort__cgroup_id_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	int64_t ret;
+
+	ret = _sort__cgroup_dev_cmp(right->cgroup_id.dev, left->cgroup_id.dev);
+	if (ret != 0)
+		return ret;
+
+	return _sort__cgroup_inode_cmp(right->cgroup_id.ino,
+				       left->cgroup_id.ino);
+}
+
+static int hist_entry__cgroup_id_snprintf(struct hist_entry *he,
+					  char *bf, size_t size,
+					  unsigned int width __maybe_unused)
+{
+	return repsep_snprintf(bf, size, "%lu/0x%lx", he->cgroup_id.dev,
+			       he->cgroup_id.ino);
+}
+
+struct sort_entry sort_cgroup_id = {
+	.se_header      = "cgroup id (dev/inode)",
+	.se_cmp	        = sort__cgroup_id_cmp,
+	.se_snprintf    = hist_entry__cgroup_id_snprintf,
+	.se_width_idx	= HISTC_CGROUP_ID,
+};
+
 /* --sort socket */
 
 static int64_t
@@ -1464,6 +1504,7 @@ static struct sort_dimension common_sort_dimensions[] = {
 	DIM(SORT_TRANSACTION, "transaction", sort_transaction),
 	DIM(SORT_TRACE, "trace", sort_trace),
 	DIM(SORT_SYM_SIZE, "symbol_size", sort_sym_size),
+	DIM(SORT_CGROUP_ID, "cgroup_id", sort_cgroup_id),
 };
 
 #undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index f583325a3743..baf20a399f34 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -54,6 +54,11 @@ struct he_stat {
 	u32			nr_events;
 };
 
+struct namespace_id {
+	u64			dev;
+	u64			ino;
+};
+
 struct hist_entry_diff {
 	bool	computed;
 	union {
@@ -91,6 +96,7 @@ struct hist_entry {
 	struct map_symbol	ms;
 	struct thread		*thread;
 	struct comm		*comm;
+	struct namespace_id	cgroup_id;
 	u64			ip;
 	u64			transaction;
 	s32			socket;
@@ -212,6 +218,7 @@ enum sort_type {
 	SORT_TRANSACTION,
 	SORT_TRACE,
 	SORT_SYM_SIZE,
+	SORT_CGROUP_ID,
 
 	/* branch stack specific sort keys */
 	__SORT_BRANCH_STACK,
-- 
2.9.3
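
The sort__cgroup_id_cmp() function in the patch above compares the
device number first and falls back to the inode number on a tie. A
standalone sketch of that two-level ordering is shown below. It uses
explicit three-way comparisons instead of casting a u64 difference to
int64_t, which is a choice made for this sketch and not what the patch
does; it avoids the sign flipping when two values differ by 2^63 or
more. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct ns_id_cmp_sketch {
	uint64_t dev;
	uint64_t ino;
};

/* Three-way compare: negative, zero or positive, with no subtraction. */
static int u64_cmp_sketch(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/* Order by device number, then by inode number when the devices match. */
static int ns_id_cmp_sketch(const struct ns_id_cmp_sketch *left,
			    const struct ns_id_cmp_sketch *right)
{
	int ret;

	assert(left && right);

	ret = u64_cmp_sketch(left->dev, right->dev);
	if (ret != 0)
		return ret;

	return u64_cmp_sketch(left->ino, right->ino);
}
```

Two entries compare equal only when both fields match, which is the
property the histogram needs in order to fold samples from the same
cgroup namespace into one row.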
